Monday, 20 October 2014

Marathons and aging

What's the best age to run a marathon? There are a surprising number of pet theories floating around, ranging from the idea that young runners are taking over the marathon to the claim that older runners are becoming more dominant. To quote from Runner's World:
Kenya's Samuel Wanjiru, 21, broke more than an Olympic record with his 2:06:32 win; he crushed long–held conventional wisdom that marathon performance peaks among runners in their late 20s and early 30s. That conventional wisdom also took a beating when a 38–year–old mother with 10 marathons under her belt, Romania's Constantina Tomescu–Dita, won the women's event.
If conventional wisdom is being upended, i.e. if the late-20s-to-early-30s age bracket is no longer when runners do their best work, it seemed prudent to find out what the numbers themselves say. Hence I compiled the men's and women's ~2400 fastest marathon times (including repeat performances by individuals) to assess whether there are any temporal trends among fast runners. I chose an arbitrary cutoff point to compare: the boundary between the 20th and 21st centuries.

Quote from here. Top Race times from here.
Male mean age for the 1967-2000 group is 28.4. Mean age for the 2001-2014 group is 28.3.
Top Race times from here.
Female mean age for 1979-2000 group is 28.9. Mean age for 2001-2014 group is 28.7. 
For men, the overall mean "peak" age is 28.3 with a standard deviation of 4.0. It turns out the most competitive period, ages 24 to 32, encapsulates 73% of the top running times. That's pretty close to what we'd expect for a perfectly normal distribution, wherein roughly two-thirds (about 68%) of values lie within plus or minus one standard deviation of the mean. We can also measure that the actual distribution has a slight positive skew of +0.55. A positive skew means a longer tail on the right side than the left, but the value is small enough that our data are only mildly asymmetric. Top runners a few years on the 'old' side are relatively better represented than runners a little on the young side, but not by much.
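For anyone wanting to repeat this kind of summary on their own list of ages, here is a minimal sketch using only the Python standard library. The function name `describe_ages` and the `toy_ages` list are made up for illustration (they are not the data behind the plots), and the skew is computed as the plain moment coefficient (third central moment over the 1.5 power of the variance), which may differ slightly from whatever a spreadsheet or stats package reports.

```python
import math

def describe_ages(ages):
    """Return mean, population SD, share within one SD, and moment skewness."""
    n = len(ages)
    mean = sum(ages) / n
    m2 = sum((a - mean) ** 2 for a in ages) / n  # second central moment
    m3 = sum((a - mean) ** 3 for a in ages) / n  # third central moment
    sd = math.sqrt(m2)
    within = sum(1 for a in ages if abs(a - mean) <= sd) / n
    return {"mean": mean, "sd": sd, "within_one_sd": within, "skew": m3 / m2 ** 1.5}

# Share of a perfectly normal distribution lying within one SD (about 0.683).
NORMAL_WITHIN_ONE_SD = math.erf(1 / math.sqrt(2))

# Hypothetical ages, NOT the real top-times list: a few older runners
# stretch the right-hand tail, so the skew should come out positive.
toy_ages = [22, 24, 25, 26, 27, 27, 28, 28, 29, 30, 31, 33, 36, 38]
stats = describe_ages(toy_ages)

print(f"mean={stats['mean']:.1f}  sd={stats['sd']:.1f}  skew={stats['skew']:+.2f}")
print(f"within one SD: {stats['within_one_sd']:.0%} (normal: {NORMAL_WITHIN_ONE_SD:.0%})")
```

Comparing `within_one_sd` against `NORMAL_WITHIN_ONE_SD` is the same check as comparing the 73% figure against the normal expectation.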

By that same measure of skew, looking at the two sets of top runner data from 1967-2000 and 2001-2014, the former has a skew of 0.47 and the latter's skew is 0.61. There are slightly more older male runners in the new batch of top athletes, but just barely so. For women the two skews are even closer to zero at 0.45 and 0.38, the latter implying that age is quite symmetrically distributed around the peak of 28. Both are positive, so there has always been a slight advantage to racing in your early 30s versus your early 20s. The overall male and female top-time distributions have means of 28.3 and 28.9 years, and the "optimal" age at which to peak for either gender hasn't budged in decades. One should also take into account that many marathons forbid participants under 18 years of age, leading to the possibility that the slight positive skew is an artificial creation.
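The point about the under-18 rule can be illustrated with a toy calculation. The sketch below assumes, purely for illustration, a perfectly symmetric bell-shaped age profile centred on 28 with a standard deviation of 4 (the names `PEAK`, `SD`, and `MIN_AGE` are mine), then removes everyone younger than 18 and recomputes the skew. It says nothing about the real data; it only shows that chopping off the left tail of a symmetric distribution pushes the skew in the positive direction.

```python
import math

def weighted_skew(ages, weights):
    """Moment skewness of a distribution given as ages with relative weights."""
    total = sum(weights)
    mean = sum(a * w for a, w in zip(ages, weights)) / total
    m2 = sum(w * (a - mean) ** 2 for a, w in zip(ages, weights)) / total
    m3 = sum(w * (a - mean) ** 3 for a, w in zip(ages, weights)) / total
    return m3 / m2 ** 1.5

# Illustrative assumptions, not fitted to the actual top-times list.
PEAK, SD, MIN_AGE = 28, 4.0, 18

# Symmetric bell-shaped weights over integer ages on either side of the peak.
ages = list(range(PEAK - 20, PEAK + 21))
weights = [math.exp(-0.5 * ((a - PEAK) / SD) ** 2) for a in ages]

full_skew = weighted_skew(ages, weights)

# Apply the entry rule: drop every age below the minimum.
kept = [(a, w) for a, w in zip(ages, weights) if a >= MIN_AGE]
cut_skew = weighted_skew([a for a, _ in kept], [w for _, w in kept])

print(f"skew without age limit: {full_skew:+.3f}")
print(f"skew with under-{MIN_AGE}s removed: {cut_skew:+.3f}")
```

Printing both values lets the size of the cutoff effect under these assumptions be set beside the observed skews of roughly 0.4 to 0.6.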

Does age matter? For either gender it does not appear true that especially young (18-22) or old (34-38) runners are relatively more represented now than at earlier points in marathon history. Collectively, age matters the same as it always has at the highest levels of competition. But perhaps that does not say much about individual performance potential, in the same way a class average does not determine my personal letter grade.

Let us consider whether runners in their mid-thirties have more potential than they let on. One way to do this is by comparing world record times by age with the total distribution. Since a WR time can be set by a single individual, it shows what is possible at a given age, and so is more telling of whether we might collectively expect more runners at that age to compete at a top level in the future.

MALE World Record times -by age- from here. Top Race times from here.
Men's WR times generally get faster from age 17 to 20, then slower from 36 onwards. But looking a little closer we do notice an interesting counterexample or two within the dominant trend of slowing with age. For instance, WR times steadily decrease from age 32 to 36 before suddenly leaping upward again. We may expect a slight 'correction' in the near future for runners in their late thirties despite later inevitable declines. There is a good chance we will see sub 2:05 times from runners aged 36-39, keeping in mind that presently there are almost none.
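One simple way to spot such counterexamples in a table of records by age is to flag any age past the peak whose record is slower than a record held by someone older. The sketch below is only my reading of that idea: the function name `soft_age_records` is hypothetical, and the times (in seconds) are invented to mimic the shape described above, not taken from the actual record list.

```python
def soft_age_records(records_by_age, peak_age):
    """Ages at or past peak_age whose record is slower than a record set at an older age."""
    ages = sorted(a for a in records_by_age if a >= peak_age)
    soft = []
    for i, age in enumerate(ages):
        # Fastest record among all strictly older ages (None for the oldest age).
        older_best = min((records_by_age[o] for o in ages[i + 1:]), default=None)
        if older_best is not None and records_by_age[age] > older_best:
            soft.append(age)
    return soft

# Invented times in seconds, shaped like the pattern described in the text:
# records get quicker from 32 to 36, then jump at 37.
toy_records = {30: 7400, 31: 7450, 32: 7600, 33: 7550,
               34: 7500, 35: 7480, 36: 7470, 37: 7700}

print(soft_age_records(toy_records, peak_age=30))
```

Any age flagged this way goes against the slowing-with-age trend, which is the kind of mismatch that suggests a record is due for a 'correction'.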

Another interesting mismatch is that the fastest marathon was run at the age of 30, while the highest frequency of times (a.k.a. the mode) is three years earlier at age 27. We may surmise that the very best performances come after several years of world-class racing. For women's times, however, the peak WR performance at age 29 lies much closer to the mode of marathon top times (age 28).

The female WR times also show a much more symmetric U-shape than the men's, which seems to contradict claims that women in their late 30s will ever dominate younger women marathoners. (Perhaps the older women will get faster, but there seems no reason younger women will not also run faster, thus leading to a null change in the overall shape of the histogram.)

FEMALE World-Record times and frequency of sub-2:30 times.
Top Race times from here. World Record (WR) times -by age- from here

Finally, just for perspective, here are the world record times by age from 5 to 95 years of age:

At this full scale the best marathon performances between ages 18 and 36, male or female, collectively look constant compared with those outside this 'optimal' age bracket. And oscillations can be enormous, such as among those running in their 70s to 90s. Therefore we should consider the earlier analysis as relevant only to the very highest calibre athletes. At the recreational level the picture is much more dynamic, and age is less of a factor except at the decadal scale (while even here we have potential to run sub 3-hour marathons right into our 50s). After all, that's what age categories were made for!


I've been encouraged, after speaking with Alex Hutchinson and receiving a comment from Robert (It's All About the Vertical), to add a finer resolution to my Marathon Era - Frequency plot. It is now broken into quintiles and re-sampled for three new elite marathon cutoff times:

Importantly it has been pointed out, correctly, that the very, very best in the world have gotten younger. This is known as the 'Wanjiru Effect': after the 2008 Beijing Games, those winning or placing in top marathons were getting younger. You can see this effect in the 2009-2011 & 2012-2014 spans: both means decrease as the maximum cutoff time decreases from 2:10 to 2:06. Looking at sub 2:06 times we see a nearly 3-year decrease in the mean age, from 29.7 (1999-2004) to 26.9 (2012-2014). And when summing the five eras together, the most frequent ages for sub 2:10, 2:08, and 2:06 performances are 27, 26, and 25, respectively. The respective combined means also decrease from 29.1 (sub 2:10) to 28.4 (sub 2:08) to 27.2 (sub 2:06).
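For readers who want to redo this era-by-cutoff breakdown, here is a rough sketch of the bookkeeping. Everything in it is an assumption for illustration: the `summarize` and `to_seconds` helpers are my own names, the `toy` performances are invented, and only two of the era spans mentioned above are used. It counts a performance as "sub 2:06" when it is strictly faster than 2:06:00.

```python
from collections import Counter
from statistics import mean

def to_seconds(hms):
    """Convert an 'H:MM:SS' string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def summarize(performances, eras, cutoffs):
    """Map (cutoff, era_start, era_end) to (mean age, most frequent age, count)."""
    summary = {}
    for cutoff in cutoffs:
        limit = to_seconds(cutoff)
        for start, end in eras:
            ages = [age for year, age, finish in performances
                    if start <= year <= end and to_seconds(finish) < limit]
            if ages:  # skip empty era/cutoff combinations
                modal_age = Counter(ages).most_common(1)[0][0]
                summary[(cutoff, start, end)] = (round(mean(ages), 1), modal_age, len(ages))
    return summary

# Invented (year, age, finish time) rows, NOT real results.
toy = [
    (2002, 31, "2:05:50"),
    (2002, 28, "2:09:00"),
    (2003, 31, "2:07:40"),
    (2013, 25, "2:05:30"),
    (2013, 27, "2:07:10"),
    (2014, 25, "2:09:30"),
]
eras = [(1999, 2004), (2012, 2014)]
cutoffs = ["2:10:00", "2:08:00", "2:06:00"]

result = summarize(toy, eras, cutoffs)
for key, (avg, mode, count) in sorted(result.items()):
    print(key, f"mean={avg} mode={mode} n={count}")
```

With small counts the most frequent age can be a tie, in which case this sketch simply reports whichever tied age appears first in the data.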

To be sure, let's add one more plot: the sub 2:04, which must further support the growing trend for younger runners, right?

And yet the sub 2:04 crowd (approaching the limits of what a histogram can meaningfully display) shows a re-emergent dominance among men in their late 20s. NB: the single point at age 35 is of course Haile Gebrselassie, who ran 2:03:59 in Berlin.

In one way my original conclusion -that age distributions among elites have remained constant- was correct, but in another way it was wrong. It depends whether you are looking at elite runners (sub 2:10 or 2:11), who are dominated by men in their late 20s, or the superelite (sub 2:06), with more men in their mid 20s, or the ultraelite (sub 2:04), who are back to being in their late 20s. I would say the future trend is for winning marathoners to become younger by several years. But the combined averages among the sub 2:10 crowd have not changed much over different eras, having remained close to the 28-30 age bracket. So there you have it.


  1. Hi Graydon,

    Thanks for the nice analysis and post, I found your results interesting enough to incite some additional analysis:

    1. Cool! In turn, your results inspired me to take another look at my own data, hence the new 'Update' section.

  2. This comment has been removed by the author.

  3. Hi Graydon,

    Great! Those peaks centered around 25 and 23 years in the sub-2:06 population are likely where the next WR holder will come from. The question that arises is what the rate of progression will be for each of these young, "super-elite" athletes. If they quickly hit an exponential plateau, like many do, then they will not make it to the ethereal sub-2:04 population. However, should one or more demonstrate a progression with no plateau (linear) or a mild one for long enough, then we may have some excitement!

    thanks again for your post- the subject is very interesting.

    1. Thanks again Robert for getting me to look further into this. Since your comment I added one final plot: the sub 2:04 histogram. Not many people in that one. And if you're interested, I just posted my own take on whether the sub-2-hour marathon might be possible.

    2. Hi Graydon,

      I like the sub-2:04 histogram as, although any conclusions from the data will come with a large error probability, the shape supports the observation of exponential plateaus seen in running times for many individuals, elite and otherwise. Perhaps these young and fast marathoners have a steep exponential improvement arc and then a long, shallow improvement path thereafter. I am still going about analyzing performance progressions but the data are difficult to obtain in entirety for individuals.

      Still thinking about your nice post on the sub-2h marathon.