I have recently been following two major volleyball events: the men’s and women’s 2022 FIVB Volleyball World Championships.
I got interested in the different physical profiles of the teams. The biotype of the players of the Japanese women’s team seemed completely different from the others, yet they are quite successful. Wanting to take a deeper look at how height correlates with a volleyball team’s success (and whether there is any correlation at all), I decided to check the numbers myself.
This post is an adapted version of a Jupyter notebook that includes all the data scraping, treatment and plotting scripts in Python, which you can find on my GitHub.
Before moving on, some considerations:
- You need to know a bit of volleyball to follow part of the discussion, but only the basics, such as the different positions and the categories of points (I’m far from knowing more than that myself). So, proceed at your own risk if you are a complete noob! On the other hand, if you are a big volleyball fan, everything I say may be too obvious, so take care as well.
- Hundreds, if not thousands, of parameters and statistics are not considered in this analysis since they were not easily available.
- Finally, this is a limited correlation analysis done for fun. Even if I try to apply some method and scientific reasoning, my conclusions should be taken with a grain of salt. Ask a real data analyst who knows volleyball for a serious study.
The database, obtained from the FIVB website, has 336 players from 24 countries. Players’ data include name, number, height, weight, position (Setter, Outside Hitter, Opposite spiker, Middle blocker and Libero), and the average and total number of points scored per category (attack, block and serve) with the corresponding success rates.
The position of Gabriela Orvosova, from the Czech Republic, is listed as “Universal”. Not knowing what that is and unable to find more information on the matter, I manually overwrote it with “Opposite spiker”, as stated on her profile on the European Volleyball Confederation (CEV) website (accessed in November 2022).
My initial question
To check the influence of height, let’s first consider the average and median height of the players, overall and by team:
Average player height: 182.780 cm
Median player height: 184.0 cm
| team | Italy | Germany | Bulgaria | Serbia | Croatia | Puerto Rico | Poland | Dominican Republic | Canada | Netherlands | United States | China |
The Japanese team is not the shortest in terms of average, but it is not far from it: at 175.71 cm (5’9″), the average Japanese player is only about 1.4 mm taller than the average Thai player (175.57 cm). In terms of the median, the Japanese team is by far the shortest, with 176.5 cm.
On the opposite side of the scale, the Chinese players are the tallest for both metrics: average of 188.82 cm (6’2″) and median of 190.5 cm (6’3″). The overall average player is about 183 cm (6′) tall.
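The overall and per-team statistics quoted above come down to a group-and-aggregate step. The notebook most likely does this with pandas; the sketch below is only an illustration using the standard library, and the sample rows, team names and the helper `height_stats_by_team` are hypothetical, not taken from the real dataset.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical sample rows (team, height in cm); the real dataset has 336 players.
players = [
    ("Team A", 176), ("Team A", 180), ("Team A", 171),
    ("Team B", 190), ("Team B", 186), ("Team B", 193), ("Team B", 188),
]

def height_stats_by_team(rows):
    """Return {team: (mean_height, median_height)} from (team, height) pairs."""
    heights = defaultdict(list)
    for team, height in rows:
        heights[team].append(height)
    return {team: (mean(h), median(h)) for team, h in heights.items()}

stats = height_stats_by_team(players)
overall = [height for _, height in players]

print(f"Average player height: {mean(overall):.3f} cm")
print(f"Median player height: {median(overall)} cm")

# List the teams from shortest to tallest median height.
for team, (avg, med) in sorted(stats.items(), key=lambda item: item[1][1]):
    print(f"{team}: mean {avg:.2f} cm, median {med} cm")
```

Reporting both the mean and the median is useful here because a single very short or very tall player shifts a 14-player average noticeably but barely moves the median.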
The boxplot of all the players’ height by nationality illustrates well the offset that exists between the participating teams:
To check whether there is any relationship between the players’ height and their teams’ performance, let’s plot how the median height varies with the ranking at the end of the competition:
A first visual analysis indicates only a weak and noisy decreasing relationship between the final rank and the median height, despite the clear staircase on the podium (Serbians taller than Brazilians, Brazilians taller than Italians). Funnily enough, the tallest team (China) is right next to the shortest one (Japan).
We can go a bit further by looking at the correlation and determination coefficients:
\(r = -0.3532 \quad \rho = -0.4017 \quad R^2 = 0.1248\)
The small values of the coefficients reinforce the conclusion of the visual analysis: the correlation between height and final standing is weak. Such a result is expected for a relatively short competition (the champion plays only 12 matches) in a game that is played point by point and can be decided by small incidents such as an accidental net or ball touch.
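For reference, the three coefficients can be computed without any external dependency. The sketch below is a minimal illustration on made-up numbers, not the notebook’s code: `pearson_r`, `ranks` and `spearman_rho` are names chosen here, and \(R^2\) is taken as \(r^2\), which holds for a simple linear regression with an intercept. The values reported above are consistent with that identity, since \((-0.3532)^2 \approx 0.1248\).

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical data: final rank (1 = champion) and team median height in cm.
final_rank = [1, 2, 3, 4, 5, 6, 7, 8]
median_height = [188.0, 186.5, 185.0, 190.5, 176.5, 184.0, 181.0, 183.5]

r = pearson_r(final_rank, median_height)
rho = spearman_rho(final_rank, median_height)
r_squared = r ** 2  # valid for a simple linear fit with an intercept

print(f"r = {r:.4f}, rho = {rho:.4f}, R^2 = {r_squared:.4f}")
```

Spearman’s \(\rho\) is arguably the more natural choice when one of the variables is itself a ranking, because it only uses the order of the values.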
To move forward, we evaluate the relationship between height and the players’ individual performance during the championship.
Before comparing heights, we check the influence of the player’s position on the number of points scored:
The nature of the different positions is quite clear: while liberos scored no points at all (they are not allowed to block, serve or attack the ball), opposites and outside hitters are the most effective pointwise. This tendency justifies removing liberos from further analysis, since they score no points and are much shorter, as I discuss later. Also, since some players had more chances to score than others thanks to the knockout phase, only the average points per match are considered.
On a side note, we can highlight the amazing performances of Paola Egonu (Italy) with a total of 275 points and Britt Herbots (Belgium) with an average of 24.67 points per match!
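The two preparation steps described above (dropping liberos and normalizing by matches played) can be sketched as follows. The records and field names are hypothetical and do not reflect the FIVB schema; the dataset already ships per-match averages, so this only illustrates the idea.

```python
# Hypothetical records; the field names are illustrative, not the FIVB schema.
players = [
    {"name": "Player A", "position": "Opposite spiker", "total_points": 120, "matches": 6},
    {"name": "Player B", "position": "Libero", "total_points": 0, "matches": 6},
    {"name": "Player C", "position": "Middle blocker", "total_points": 90, "matches": 12},
]

def scoring_players(rows):
    """Drop liberos (and players without matches) and add average points per match."""
    result = []
    for row in rows:
        if row["position"] == "Libero" or row["matches"] == 0:
            continue
        result.append({**row, "avg_points": row["total_points"] / row["matches"]})
    return result

scorers = scoring_players(players)
for player in sorted(scorers, key=lambda p: p["avg_points"], reverse=True):
    print(f'{player["name"]}: {player["avg_points"]:.2f} points per match')
```

In this toy example Player C played twice as many matches as Player A, so comparing totals alone would understate the gap between them.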
As with the player’s position, it is prudent to consider the different kinds of points (from attack, block or serve) in the following analysis:
For all categories of points there is a small but noticeable trend: in general, the taller the player, the more points she scored. This is especially pronounced for block points, as expected. Similar results are obtained when the success rates of attack, block and serve are analyzed (not shown).
However, the large dispersion around the regression lines and the small correlation and determination coefficients (\(R^2 < 0.2\)) indicate that the relationship is quite weak. The heights of the best scorers per category, highlighted in the graphs, reinforce this conclusion:
| Category | Player | Team | Position | Height (cm) |
|---|---|---|---|---|
| total | Herbots Britt | BEL | Outside Hitter | 182 |
| attack | Herbots Britt | BEL | Outside Hitter | 182 |
| block | Da Silva Ana Carolina | BRA | Middle blocker | 183 |
| serve | Lee Seonwoo | KOR | Outside Hitter | 183 |
All the best scorers in terms of average points are actually below the overall median of 184 cm. Even for block points, the Brazilian Ana Carolina is “only” 1.83 m. The same goes for Britt Herbots, the best scorer on average considering total points, with a “miserable” 1.82 m (5’11”).
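The regression lines and \(R^2\) values mentioned for the points-versus-height plots can be obtained with an ordinary least-squares fit. The following is a sketch under stated assumptions: the heights and block-point averages are invented, and `linear_fit` is a helper written for this example rather than the function used in the notebook.

```python
def linear_fit(x, y):
    """Least-squares fit y ~ slope * x + intercept; returns (slope, intercept, R^2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1 - ss_res / ss_tot

# Hypothetical heights (cm) and average block points per match.
heights = [170, 175, 180, 185, 190, 195]
block_points = [0.4, 1.6, 0.5, 1.2, 2.1, 1.0]

slope, intercept, r2 = linear_fit(heights, block_points)
print(f"slope = {slope:.4f} points/cm, intercept = {intercept:.2f}, R^2 = {r2:.3f}")
```

A positive slope combined with a low \(R^2\) is the situation described above: the trend exists, but height alone explains little of the variation between players.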
An analysis of the scoring profile by team also seems interesting, since the height of the players can influence tactics and may help clarify how the Japanese team got so far while being the shortest.
So, we sum all the points of each player and calculate and plot the points ratios by category:
We can see that the points come mostly from attacks (average of 80%) for all teams, but the ratios vary significantly among the nations. Japan, South Korea and Thailand, for example, score relatively less by blocking. Let’s see how the Japanese scoring profile compares to the championship average:
The Japanese team’s scoring ratio is close to the highest for attack points and almost the lowest for blocking, while being about average for serve points.
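The points ratios by category used in this comparison are each team’s attack, block and serve points divided by its total points. Here is a minimal sketch of that aggregation, assuming per-player totals are available; the teams, numbers and the helper `scoring_profile` are hypothetical.

```python
from collections import defaultdict

CATEGORIES = ("attack", "block", "serve")

# Hypothetical per-player totals: (team, attack, block, serve).
player_points = [
    ("Team A", 150, 10, 12),
    ("Team A", 90, 8, 10),
    ("Team B", 110, 40, 10),
    ("Team B", 50, 30, 10),
]

def scoring_profile(rows):
    """Sum the points per team, then divide each category by the team total."""
    totals = defaultdict(lambda: [0, 0, 0])
    for team, *points in rows:
        for index, value in enumerate(points):
            totals[team][index] += value
    profile = {}
    for team, sums in totals.items():
        team_total = sum(sums)  # assumed non-zero: every team scored at least once
        profile[team] = {c: s / team_total for c, s in zip(CATEGORIES, sums)}
    return profile

profile = scoring_profile(player_points)
for team, ratios in profile.items():
    shares = ", ".join(f"{category} {ratio:.1%}" for category, ratio in ratios.items())
    print(f"{team}: {shares}")
```

Because the three ratios of a team always add up to 1, a higher attack share necessarily comes with a lower block or serve share, which is worth keeping in mind when reading the trends.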
A direct look at how the scoring ratios vary with the teams’ median height can reveal a bit more about the influence of height on team scoring performance:
There is no distinct relationship between the serve points ratio and team height, similar to the individual serve performance illustrated earlier. Although the correlation is small and of low confidence due to the reduced number of data points (only 24 teams, with ratios representing a total of 52 matches), two trends are clear (\(R^2 \approx 0.35\)):
- the tallest teams score relatively more by blocking than short teams, and vice versa.
- the tallest teams score relatively less by attacking than short teams, and vice versa.
The Japanese team is quite representative of these tendencies, since it is simultaneously close to both extremes, as stated earlier.
My original questions were whether the Japanese team was actually shorter than the others and how the players’ height influenced performance. The first was quickly answered (yes, they were), while the second is a bit more complicated.
The final ranking of the competition and the individual players’ performance (average points scored) are only slightly related to the height of the players. The profile of the scored points (attack, block or serve) is, on the other hand, more clearly correlated with the teams’ median height: shorter teams score relatively much less by blocking. From watching the matches, although I have no data to support the claim, the Japanese players, either naturally or by design, are really good at receiving the ball from their opponents and score relatively less from blocks than others.
This analysis is very limited because it considers scoring as the only performance criterion. As a crude comparison, if you are more into football like me, it would be like measuring a goalkeeper’s performance by the number of goals scored. It would also be essential to check how each team lost points to its opponents. Finally, extending this investigation to a larger database, comprising more than one competition, would improve the quality of the study considerably.
I admit the starting premise was silly, more of an excuse to rescue some of my knowledge of statistics from the bottom drawer and exercise my skills in Pandas, Selenium and Jupyter.
My original question is already answered, but since we have a nice database available, why not move further?
Since we are here, we can look at the shortest and tallest players in absolute terms:
| Player | Position | Team | Height (cm) |
|---|---|---|---|
| Kundu Agripina Khayesi | Libero | Kenya | 155 |
| Hernandez Alba | Middle blocker | Puerto Rico | 207 |
As expected, the shortest player is a libero: Agripina Khayesi Kundu, from Kenya, at 1.55 m (5’1″). The tallest player is Alba Hernandez, a 2.07 m (6’10”) tall middle blocker from Puerto Rico. The discrepancy between liberos and blockers is quite clear when the height distribution by position is plotted:
But how do they compare to the general population?
Looking at the complete distribution
We may compare the height distribution of the complete dataset with that of the overall population.
Many aspects of human height, including height distribution, are discussed in a very nice article by Our World in Data. For my analysis I reuse their reference on the global height distribution of men and women, Jelenkovic et al. (2016), and reproduce their disclaimer that the results are not globally representative, since the study does not include all world regions due to data availability.
Players’ height is pretty close to a normal distribution. One might not expect the distribution to be that close to normal, since national teams are far from randomly selected individuals, but several factors may play a role: for instance, the relative diversity of the positions, as discussed earlier, the health status of the available players at the time of the event and, most importantly, their technical skills, which may not be associated with their height at all!
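A rough way to put numbers on “close to normal” is to look at the sample skewness and at the share of values within one and two standard deviations of the mean, which should be near 68% and 95% for a normal distribution. The sketch below is an assumption-laden illustration: the heights are synthetic draws standing in for the real data, the 8 cm spread is an assumed value, and `normality_summary` is a helper defined here.

```python
import random
from statistics import NormalDist, mean, pstdev

def normality_summary(values):
    """Return (skewness, share within 1 SD, share within 2 SD) of a sample."""
    mu, sigma = mean(values), pstdev(values)
    z_scores = [(v - mu) / sigma for v in values]
    skewness = sum(z ** 3 for z in z_scores) / len(z_scores)
    within_1 = sum(abs(z) <= 1 for z in z_scores) / len(z_scores)
    within_2 = sum(abs(z) <= 2 for z in z_scores) / len(z_scores)
    return skewness, within_1, within_2

# Synthetic stand-in for the real heights: 336 draws from a normal distribution.
# The mean matches the post; the 8 cm standard deviation is an assumed value.
rng = random.Random(0)
heights = [rng.gauss(182.78, 8.0) for _ in range(336)]

skewness, within_1, within_2 = normality_summary(heights)

# Reference shares for an exact normal distribution.
standard_normal = NormalDist()
expected_1 = standard_normal.cdf(1) - standard_normal.cdf(-1)
expected_2 = standard_normal.cdf(2) - standard_normal.cdf(-2)

print(f"skewness: {skewness:.3f} (0 for a symmetric distribution)")
print(f"within 1 SD: {within_1:.1%} (normal: {expected_1:.1%})")
print(f"within 2 SD: {within_2:.1%} (normal: {expected_2:.1%})")
```

With the real data, a clearly negative skewness would be one sign that the short liberos stretch the lower tail; a formal normality test would be the next step for a serious study.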
Another interesting aspect is that the volleyball players are not only taller than the average woman but also taller than the average man.
Max Roser, Cameron Appel and Hannah Ritchie (2013) – “Human Height”. Published online at OurWorldInData.org. Retrieved from: ourworldindata.org/human-height
Jelenkovic, A. et al. (2016). Genetic and environmental influences on adult human height across birth cohorts from 1886 to 1994. eLife, 5, e20320. doi.org/10.7554/eLife.20320