Using statistics to find the most similar players to Lionel Messi around the world
There is only one Lionel Messi, but data can identify players who perform similarly to the Argentine. Let us dive into one methodology for finding these players and examine some of the results.
The method used for this analysis is known as factor analysis. Essentially, it starts with several “base-level” metrics – shots, dribbles, tackles – and looks to explain the variance found in these numbers. Using the relationships between the different metrics, it “simplifies” the data into fewer categories, known as factors, that explain as much of the variance as possible.
Say the input consists of 20 different metrics; the output might be cut down to five or six factors. Each stat has a positive, negative, or negligible correlation with a player’s score in each factor. These correlations are examined factor by factor later on.
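As a rough illustration of the idea, here is a minimal sketch that reduces a set of correlated metrics to a smaller number of factors. It is not the article's implementation: the article does not say which extraction or rotation method was used, so this sketch assumes the simple principal-component method of extracting loadings, with no rotation, and runs on synthetic stand-in data rather than real player stats. The function name `extract_factors` is made up for the example.

```python
import numpy as np

def extract_factors(X, n_factors):
    """Principal-component extraction of factor loadings (no rotation).

    X: (n_players, n_metrics) array of per-90 metrics.
    Returns loadings (n_metrics, n_factors), scores (n_players, n_factors),
    and the share of total variance explained by each factor.
    """
    # Standardise each metric so that differences in scale do not dominate.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Correlation matrix between the metrics.
    R = np.corrcoef(Z, rowvar=False)
    # eigh returns eigenvalues in ascending order, so sort descending.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1][:n_factors]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Loadings: correlation between each metric and each factor.
    loadings = eigvecs * np.sqrt(eigvals)
    # Scores: one unit-variance value per player per factor.
    scores = Z @ eigvecs / np.sqrt(eigvals)
    # Each factor's share of the total variance (most explanatory first).
    explained = eigvals / R.shape[0]
    return loadings, scores, explained

# Synthetic stand-in data (assumption): 200 "players", 6 "metrics"
# driven by 2 hidden traits plus noise.
rng = np.random.default_rng(0)
traits = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
X = traits @ mixing + 0.3 * rng.normal(size=(200, 6))

loadings, scores, explained = extract_factors(X, n_factors=2)
```

The factors come out ordered from most to least explanatory, which mirrors the way the five factors are presented later in the article.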
The leagues included in this process were the first divisions of the top 20 nations in the UEFA club coefficient rankings for 2019/20, the second divisions of the top five leagues (Spain, England, Germany, Italy, and France), and the first divisions of the United States, Mexico, Argentina, Brazil, Uruguay, Colombia, and Chile.
To qualify, a player must have played at least 1,000 minutes in one of these leagues in the 2019/20 season, and they must have had centre forward, winger, or attacking midfielder listed as their primary position for the season. This produced a starting pool of 2,600 players.
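The qualification rule above can be sketched as a simple filter. The record layout, field names, and sample rows below are hypothetical placeholders, not the Wyscout schema or real data.

```python
from dataclasses import dataclass

# Thresholds taken from the article's qualification criteria.
MIN_MINUTES = 1000
ATTACKING_POSITIONS = {"centre forward", "winger", "attacking midfielder"}

@dataclass
class PlayerSeason:
    # Hypothetical record layout for one player's 2019/20 season.
    name: str
    league: str
    minutes: int
    primary_position: str

def qualifies(player, included_leagues):
    """True if the player meets the league, minutes, and position criteria."""
    return (
        player.league in included_leagues
        and player.minutes >= MIN_MINUTES
        and player.primary_position in ATTACKING_POSITIONS
    )

# Made-up sample rows for illustration only.
players = [
    PlayerSeason("Player A", "League X", 2400, "winger"),
    PlayerSeason("Player B", "League X", 650, "centre forward"),
    PlayerSeason("Player C", "League Y", 1800, "centre back"),
    PlayerSeason("Player D", "League Z", 3000, "attacking midfielder"),
]
INCLUDED_LEAGUES = {"League X", "League Y"}

pool = [p for p in players if qualifies(p, INCLUDED_LEAGUES)]
```

Here only Player A survives: Player B lacks the minutes, Player C is not an attacker, and Player D plays in a league outside the sample.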
Here are all of the statistics included in the factor analysis:
Note: All statistics were measured per 90 minutes of playing time, and they were all provided by Wyscout.
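For readers unfamiliar with the per-90 convention, the sketch below shows the arithmetic. Wyscout supplies these figures already normalised, so the helper `per_90` is only a hypothetical illustration of what the normalisation means.

```python
def per_90(total, minutes):
    """Convert a season total into a rate per 90 minutes played."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return total * 90.0 / minutes

# A player with 60 dribbles in 1,800 minutes averages 3 dribbles per 90.
dribbles_per_90 = per_90(60, 1800)
```

Normalising this way lets a player with 1,000 minutes be compared fairly with one who played every match.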
Factor analysis results
From those 22 initial metrics, the analysis output five factors which were nicknamed based on what they seem to reflect. By running through each of them, we gain a better understanding of what type of player they portray. The factors will be presented in order from most explanatory to least explanatory, meaning the first factor was responsible for the greatest portion of the explanation of the variance in the data.
That first factor has been named “Drops Deep In Possession”. The metrics with the strongest positive correlation (having more of these causes a player to score higher in the factor) were passes, passes into the final third, and progressive passes.
Those with the strongest negative correlation (having more of these causes a player to score lower) were their rate of received passes that were long passes, aerial duels, and expected goals per shot.
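To make the sign convention concrete, a factor score can be thought of, roughly, as a weighted sum of a player's standardised metrics. The sketch below is a simplification (real factor scores are computed from the full loading matrix), and the weights and player profiles are invented for illustration; they are not the article's values.

```python
# Hypothetical weights for a "Drops Deep In Possession"-style factor.
# Positive weights reward a metric, negative weights penalise it.
WEIGHTS = {
    "passes": 0.8,
    "progressive_passes": 0.7,
    "aerial_duels": -0.6,
    "xg_per_shot": -0.5,
}

def factor_score(z_scores, weights):
    """Weighted sum of standardised metrics (z-scores)."""
    return sum(weights[metric] * z_scores[metric] for metric in weights)

# Invented standardised profiles: 0 is league average, +1 is one
# standard deviation above average.
playmaker = {"passes": 1.5, "progressive_passes": 1.0,
             "aerial_duels": -1.0, "xg_per_shot": -0.5}
target_man = {"passes": -1.0, "progressive_passes": -0.5,
              "aerial_duels": 2.0, "xg_per_shot": 1.0}

playmaker_score = factor_score(playmaker, WEIGHTS)
target_man_score = factor_score(target_man, WEIGHTS)
```

The playmaker profile scores well above zero and the target-man profile well below it, which is the behaviour the positive and negative correlations describe.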
The attackers who rank highly here are those who drop off and are very involved in progressing the ball to attacking areas, while not getting in behind for long balls or competing in the air often. This tidy, possession-based style is reflected with the likes of Hakim Ziyech, Lionel Messi, Isco, and Bernardo Silva, ranking above the 90th percentile.
Up next is “Dribbler”, where it is important to have multiple dribbles, progressive runs, and crosses, in addition to having low aerial duels, expected goals per shot, and rate of long passes received. The high scorers are once again the ‘technical’ players and, in this case, those who like to pick up the ball and drive forward at the opposition defence.
As a result, there are many explosive attackers above the 90th percentile, including Jeremie Boga, Youcef Atal, Adama Traoré, Neymar, and Wilfried Zaha. While these first two factors mostly reflect actions that progress the ball into dangerous areas, the next one is more about players who get into those dangerous positions themselves.
That is because factor three, “Box Presence”, reflects a high number of touches in the penalty area, deep completions, and shots. The strongest negative correlations are to defensive duels, possession-adjusted interceptions, and average pass length.
Those who score the highest here – the likes of Messi himself, along with Kylian Mbappé, Mohamed Salah, Zlatan Ibrahimović, and Odsonne Edouard – are not afraid to pull the trigger, possess strong poacher instincts, and tend not to drop back as much defensively.
Up next is “Direct Progressor”, which highlights those who almost always prefer the high-risk, high-reward option. The metrics with the strongest positive correlation are the rate of passes played forwards, passes into the final third, and the rate of passes played long.
Conversely, having a higher rate of passes played backwards, touches in the penalty area, and expected goals per shot are punished. With this in mind, it should be no surprise to see the likes of Ziyech, Bruno Fernandes, Mohamed Ihattaren, Neymar, and Calvin Stengs putting up high scores.
Lastly, “Wide Creator” is a factor that mainly captures more traditional wingers. For this final factor, it is important to perform well in crosses, passes into the penalty area, and expected assists (xA), while having a low backwards passing rate, aerial duels, and expected goals per shot.
Once again, the top performers include some of the usual suspects for this style, such as Karim Bellarabi, Ivan Perišić, Kingsley Coman, José Callejón, and Traoré. With all five factors established, it is time to get into what they say about Messi.
How Messi performs
Now, how does the data reflect someone like Messi, whose skillset transcends any single style? The six-time Ballon d’Or winner will drop to midfield to pick up the ball, then pick out a long through pass or weave through defenders.
Then, on top of that, Leo gets into the box, has a high shot volume, and is among the most lethal finishers of all time. His percentile ranks for the factors back up his well-roundedness.
Unsurprisingly, his scores were above the 98th percentile for four of the five factors. Only in Wide Creator (12.9th percentile) was the Argentine below average, reflecting his tendency to cut inside and drift into central positions. This is a testament to not only the diverse skill set he possesses but also the sheer amount of work he performs in the attack.
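A percentile rank such as the ones quoted above can be computed along these lines. The article does not state which percentile convention it uses, so this sketch assumes the share of the pool scoring strictly below the player; the population is a synthetic stand-in.

```python
def percentile_rank(value, population):
    """Percentage of the population scoring strictly below `value`."""
    below = sum(1 for v in population if v < value)
    return 100.0 * below / len(population)

# Synthetic stand-in for one factor's scores across a 1,000-player pool.
population = list(range(1000))

# A score that beats 129 of the 1,000 players sits at the 12.9th percentile.
rank = percentile_rank(129, population)
```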
What does this mean for identifying those similar to Messi? In short, they have to operate in central areas primarily and have to be incredibly well rounded. This is certainly not easy to match, but if it were, Leo would not be so special.
Now for the ultimate goal of the process: picking out players who perform similarly. To be the Argentinian forward’s top match in a given league, a player must have played in that league during the 2019/20 season, the season from which the stats were collected. Without further ado, time to jump in.
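One plausible way to pick the top match per league is a nearest-neighbour search over the five factor scores. The article does not specify its similarity measure, so the Euclidean distance used here is an assumption, and the names, leagues, and scores are invented placeholders.

```python
import math

def distance(a, b):
    """Euclidean distance between two factor-score vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def closest_per_league(target, players):
    """Return {league: (name, distance)} for the nearest player in each league.

    players: iterable of (name, league, factor_scores) tuples.
    """
    best = {}
    for name, league, scores in players:
        d = distance(target, scores)
        if league not in best or d < best[league][1]:
            best[league] = (name, d)
    return best

# Invented five-factor profiles (assumption): high on four factors,
# low on the fifth, loosely echoing the shape described for Messi.
target = (2.0, 2.0, 2.0, 2.0, -1.0)
players = [
    ("Player A", "League X", (1.8, 1.9, 1.5, 1.7, -0.5)),
    ("Player B", "League X", (0.0, 2.5, 0.5, 0.0, 1.5)),
    ("Player C", "League Y", (1.0, 1.0, 1.0, 1.0, 0.0)),
    ("Player D", "League Y", (-1.0, 0.0, 2.0, -0.5, 0.5)),
]

matches = closest_per_league(target, players)
```

Sorting all players by the same distance, rather than grouping by league, would give an overall ranking like the "top 100 matches" referred to later.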
La Liga: Martin Ødegaard
The Norwegian may lack some of the agility and end product of Messi. Still, after the starlet’s performance for Real Sociedad last season, it is hard to come up with another player in the league who more closely lives up to the comparison. Interestingly, though, now back at Real Madrid, his brilliant left foot and vision will be on display for Barcelona’s archrivals.
Premier League: Jack Grealish
Just beating out Riyad Mahrez of Manchester City was England international Jack Grealish. Similar to Messi, Grealish sports the number ten, captains his club, and is the source of the majority of their attacking creativity.
Bundesliga: Philippe Coutinho
Now a teammate of Messi’s once again, the reigning treble winner takes the top spot for the 2019/20 Bundesliga. Many attribute his lack of success during his initial spell at Barça to the similarity between the two, as they looked to operate in similar spaces and perform similar actions.
Coutinho has started off this campaign looking sharper, and hopefully, he can keep easing the burden on Messi instead of clashing with him.
Serie A: Josip Iličić
At six feet, three inches tall, Iličić stands far removed from the reigning Ballon d’Or winner in terms of height, but those who have tracked Atalanta’s incredible rise in recent seasons know the quality in the Slovenian’s boots. Like Messi, Atalanta’s #72 may have only a few seasons left in his career, but he is definitely making the most of it.
Ligue 1: Neymar
Messi’s former partner in crime has evolved into his closest match in the whole dataset. Since joining Paris Saint-Germain and becoming the main man, the Brazilian has taken on far more responsibility in the attack.
In doing so, he has become undoubtedly the closest thing to Leo in football today, and arguably the most entertaining player in Europe.
Top U-23 Match (All Leagues): Calvin Stengs
For those who are not already aware of the AZ Alkmaar wonderkid, it will only be a matter of time before they recognise his talent. The Dutchman has already been linked to Barcelona, and if his countryman, Ronald Koeman, remains at the helm, that could well be a move for Stengs down the line.
Other Leagues With a Player Within Top 100 Matches:
– Portuguese Primeira Liga: Marcus Edwards (Vitória de Guimarães)
– Russian Premier League: Anton Miranchuk (Lokomotiv Moscow)
– Ukrainian Premier League: Taison (Shakhtar Donetsk)
– Eredivisie: Oussama Idrissi (AZ Alkmaar, now Sevilla)
– Turkish Süper Lig: Adem Ljajić (Beşiktaş)
– Austrian Bundesliga: Hwang Hee-chan (Red Bull Salzburg, now RB Leipzig)
– Danish Superliga: Mikkel Damsgaard (Nordsjælland, now Sampdoria)
– Scottish Premiership: Odsonne Edouard (Celtic)
– Czech First League: Vladimir Jovović (Jablonec)
– Cypriot First Division: Michael Ortega (AC Omonia)
– Serbian SuperLiga: Nikola Čumić (Radnički Niš, now Sporting Gijón)
– Spanish Segunda División: Samuel Sáiz (Girona)
– EFL Championship: Saïd Benrahma (Brentford, now West Ham United)
– German 2. Bundesliga: Mats Møller Dæhli (FC St. Pauli, now K.R.C. Genk)
– Italian Serie B: José Machín (Pescara, now A.C. Monza)
– Major League Soccer: Ilsinho (Philadelphia Union)
– Liga MX: Fabián Castillo (Querétaro, now Club Tijuana)
– Argentine Primera División: Eduardo Salvio (Boca Juniors)
– Campeonato Brasileiro Série A: Giorgian De Arrascaeta (Flamengo)
– Uruguayan Primera División: Juan Ángel Albín (Rampla Juniors, now Defensor Sporting)
– Colombian Categoría Primera A: Deiner Quiñónez (Independiente, now Atlético Nacional)
– Chilean Primera División: Cristóbal Jorquera (Palestino, now Fatih Karagümrük)
So, what does this all mean in the end? For one, it is a quantifiable method of displaying just how ridiculously skilled and special Messi is even though he is ageing. Just consider some of those aforementioned players.
There are more traditional tens like Ødegaard and Grealish, some inverted wingers like Stengs, and even a more traditional number nine popping up in Odsonne Edouard. Few, if any, other footballers have a skillset like Messi’s, where such a diverse set of players could match him in certain bits and pieces of their play.
Then there is the player identification side of things. While this model is still relatively basic – it only uses event statistics and does not take into account aspects like positioning, physical metrics, and advanced metrics such as Expected Threat – it still does a solid job at pointing us in the right direction.
It flags the likes of Neymar and Stengs, who have been the subjects of heavy links to Barça in the past. It can also bring up names like Saïd Benrahma and Ezequiel Barco (Atlanta United, 22nd closest overall match), who are young, exciting playmakers to keep an eye on. With the Wyscout data, the biggest value may even come from the very obscure players that can be uncovered.
Neymar is the closest statistical match to Lionel Messi in world football, which does not come as a shock, to be fair. (Photo by Laurence Griffiths/Getty Images)
Ultimately, one also has to remember that no two footballers are ever a carbon copy of one another, and this is especially true in the case of Messi. However, for identifying potential replacements, backups, or even simply new players to watch, the use of statistics partnered with traditional scouting serves as an excellent tool.