Why your post-exercise recovery differs from the average

Imagine that you are conducting a large study on typing performance. You put thousands of people through a battery of typing tests, then you crunch the numbers. The data is clear: faster typing speed correlates with fewer typos. Therefore, you conclude, the best way to avoid making typos is to type as fast as possible.

It is easy to see that this conclusion is wrong. But sports scientists may be inadvertently making this kind of mistake all the time, according to a recent article in the International Journal of Sports Physiology and Performance by Niklas Neumann and colleagues at the University of Groningen in the Netherlands. In fact, some scientists argue that “the vast majority of social and medical science research” may be affected by the mistaken belief that group data can be applied to individuals, an issue dubbed the “ergodicity problem.”

In the typing example, the problem is that better typists are both faster and less prone to typos. Thus, at the group level, high speed and low error rate are correlated. But if you test a given individual repeatedly over time, you’ll likely see the reverse pattern: higher speed leads to more errors. The group average cannot be generalized to tell you about individual results. By contrast, rolling one die 100 times should give you (on average) the same result as rolling 100 dice once.
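To make the typing paradox concrete, here is a minimal Python sketch using entirely made-up numbers. It assumes that each simulated typist has a fixed skill level that raises speed and lowers errors, and that hurrying in any one session raises both speed and errors. The function names (`pearson`, `simulate_typists`) and every coefficient are illustrative assumptions, not values from a real typing study.

```python
import random
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def simulate_typists(n_people=200, n_sessions=50, seed=0):
    """Return one list of (speed, errors) sessions per simulated typist."""
    rng = random.Random(seed)
    people = []
    for _ in range(n_people):
        skill = rng.gauss(0, 1)  # fixed trait: skilled typists are faster AND cleaner
        sessions = []
        for _ in range(n_sessions):
            rush = rng.gauss(0, 1)  # how much this person hurries in this session
            speed = 60 + 10 * skill + 3 * rush
            errors = 5 - 2 * skill + 1 * rush + rng.gauss(0, 0.3)
            sessions.append((speed, errors))
        people.append(sessions)
    return people

people = simulate_typists()

# Group level: one averaged (speed, errors) point per person.
group_r = pearson(
    [mean(s for s, _ in p) for p in people],
    [mean(e for _, e in p) for p in people],
)

# Individual level: correlation within each person's sessions, then averaged.
within_r = mean(
    pearson([s for s, _ in p], [e for _, e in p]) for p in people
)

print(f"group-level correlation:  {group_r:+.2f}")
print(f"within-person correlation: {within_r:+.2f}")
```

Under these assumptions, the group-level correlation comes out strongly negative (faster typists make fewer errors) while the average within-person correlation comes out strongly positive (hurrying produces more errors), which is the reversal described above.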

In technical terms, the difference between the two situations is that dice data is ergodic, a term coined in the 1870s by Austrian physicist Ludwig Boltzmann, while typing data is non-ergodic. Ergodicity is a crucial concept in statistical mechanics, which (for example) infers the behavior of a large volume of gas from the movements of its countless individual molecules. In recent years, the concept has spread to other areas: ergodicity economics, for example, recognizes the difference between 100 people each making a bet with a 1% chance of going bankrupt and one person making such a bet 100 times. What appears to be a very good bet at the group level turns out to be a very bad bet for the individual.
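The arithmetic behind that bet is simple enough to check. The sketch below assumes that the 100 bets are independent and that the 1% risk is the same every time; the variable names are my own.

```python
p_ruin = 0.01   # chance of bankruptcy on a single bet
n_bets = 100

# 100 people each bet once: expected number who go bankrupt.
expected_bankrupt = n_bets * p_ruin

# One person bets 100 times: chance of going bankrupt at least once.
p_ruined_eventually = 1 - (1 - p_ruin) ** n_bets

print(f"Group: about {expected_bankrupt:.0f} of {n_bets} people goes bankrupt")
print(f"Individual: {p_ruined_eventually:.0%} chance of bankruptcy over {n_bets} bets")
```

For the group, roughly one person in 100 is wiped out and everyone else is fine. For the repeat bettor, the chance of being wiped out at some point works out to roughly 63%.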

The athletic issue that Neumann and his colleagues consider is the relationship between training load and recovery. For endurance sports, in particular, you might consider this the master key to performance. More training increases fitness, but also increases the risk of injury and burnout. Figuring out exactly how much training you can handle and how quickly you can recover from it gets you closer to the red line of maximum training. This has led to all kinds of research that attempts to quantify how different training load patterns relate to performance and injury risk.

But is the link between training load and recovery ergodic? That is, can you measure training load and subsequent recovery in a large group of people, and use those results to predict how a given individual will respond to a sequence of training and recovery sessions?

To find out, Neumann and his colleagues worked with “a major league football club in the Netherlands,” which, judging by the authors’ affiliations, is presumably FC Groningen. Over two seasons, they collected daily training and recovery data from 83 members of the club’s Under-17, Under-19, and Under-23 teams. Before each training session, players rated their perceived recovery on a scale of 6 to 20. After each session, they rated their perceived effort during the session, again on a scale of 6 to 20, and that rating was multiplied by the duration of the session in minutes to obtain the total training load.
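As a quick illustration of that training-load calculation, here is a small Python sketch. The function name `session_load` and the example numbers are hypothetical; only the formula (effort rating times minutes) and the 6-to-20 scale come from the description above.

```python
def session_load(perceived_effort, duration_min):
    """Training load = perceived effort (6 to 20 scale) x session duration in minutes."""
    if not 6 <= perceived_effort <= 20:
        raise ValueError("perceived effort must be between 6 and 20")
    if duration_min < 0:
        raise ValueError("duration must be non-negative")
    return perceived_effort * duration_min

# Hypothetical example: a 90-minute session rated 14 for effort.
print(session_load(14, 90))  # 1260
```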

The simplest version of the training load/recovery question is: Does the total training load of a workout affect how recovered you feel before training the next day? The researchers tried to answer this question in two different ways. In the group-level analysis, you calculate the average training load for all athletes on a given day and compare it to the average recovery score for all athletes on the following day. In the individual-level analysis, you instead examine each pair of training/recovery scores for a single individual across the two-season data set.

The mathematical analysis gets quite complicated, but here is the crux of the matter. The group analysis covers a single day (plus the next day’s recovery), but you can repeat it for every available training day and average the results. Similarly, the individual analysis can be repeated for each athlete and then averaged. In this way, both approaches use all the available data. If they produce identical results, then the training and recovery data are ergodic, which means that we can safely apply the results of group studies to individuals. If they do not produce identical results, all bets are off.
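The two ways of slicing the same data can be sketched in a few lines of Python. This is a heavily simplified illustration, not the study’s actual method: it uses a plain Pearson correlation, and the data are synthetic. I have assumed that fitter athletes both train a little harder and report better recovery, while a heavier team session lowers everyone’s recovery the next day, so the mismatch between the two analyses is built in by design. The function names and all coefficients are invented, and the recovery scores are not clamped to the 6-to-20 scale.

```python
import random
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def make_synthetic_data(n_athletes=20, n_days=60, seed=1):
    """Return data[athlete][day] = (training_load, next_day_recovery). All numbers invented."""
    rng = random.Random(seed)
    fitness = [rng.gauss(0, 1) for _ in range(n_athletes)]  # fixed per athlete
    day_plan = [rng.gauss(0, 1) for _ in range(n_days)]     # shared team plan per day
    data = []
    for f in fitness:
        row = []
        for plan in day_plan:
            load = 600 + 150 * plan + 40 * f + rng.gauss(0, 20)
            recovery = 14 + 1.5 * f - 1.5 * plan + rng.gauss(0, 0.5)
            row.append((load, recovery))
        data.append(row)
    return data

def group_level(data):
    """One correlation per day across athletes, then averaged over days."""
    per_day = []
    for d in range(len(data[0])):
        loads = [athlete[d][0] for athlete in data]
        recoveries = [athlete[d][1] for athlete in data]
        per_day.append(pearson(loads, recoveries))
    return mean(per_day)

def individual_level(data):
    """One correlation per athlete across days, then averaged over athletes."""
    per_athlete = []
    for athlete in data:
        loads = [load for load, _ in athlete]
        recoveries = [rec for _, rec in athlete]
        per_athlete.append(pearson(loads, recoveries))
    return mean(per_athlete)

data = make_synthetic_data()
group_r = group_level(data)
individual_r = individual_level(data)
print(f"group-level (across athletes, per day): {group_r:+.2f}")
print(f"individual-level (across days, per athlete): {individual_r:+.2f}")
```

Both functions use every data point, yet with these assumptions the group-level figure is positive and the individual-level figure is negative. If the data were ergodic, the two numbers would agree.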

Sure enough, the group and individual analyses produced different results. In particular, training loads varied much more for a given individual over time than they did between individuals on any given day. And the correlations between training load and recovery didn’t match either. How a group of people responds to a single workout doesn’t necessarily tell you how you will respond to a series of workouts.

Understanding what this means in practice is tricky. In the field of medical research, some researchers have pushed back against the idea that non-ergodicity is some kind of crisis that invalidates huge swathes of existing research. Tools such as randomized, placebo-controlled trials, they say, help eliminate some of the effects of variation between individuals. In a sense, the results simply reinforce a trend that has been growing in sports science journals for at least a decade: always reporting individual results in addition to group averages. Seeing the individual dots on a graph gives you an immediate idea of whether everyone is clustered near the average response, or whether a significant number of subjects had different or even opposite responses.

A final caveat: recognizing the shortcomings of group-level research does not mean ignoring the flaws and pitfalls of self-experimentation. My impression is that for any research finding that applies to 99 out of 100 people, at least 10 will swear they are the exception. (Make that 20 if we’re talking about stretching.) Meaningful individual-level data must be collected as rigorously as in any randomized trial, with predefined hypotheses, placebo controls, and measurable outcomes rather than just gut feelings. It may be true, as George Sheehan wrote, that we are each an experiment of one, but it’s up to us to make sure we interpret the results correctly.


For more Sweat Science, join me on Twitter and Facebook, sign up for the email newsletter, and check out my book Endure: Mind, Body, and the Curiously Elastic Limits of Human Performance.
