Thursday 3 May 2012

Holism versus the model approach

I confess my previous post about recursion left me wanting. I want to explain myself more, and possibly better. Better is not always more (R-squared correlation of about 0.5). Re-reading the gist of what I had, I began by describing the heuristic versus model approach to complex system optimization. That sentence alone is what killed off many a would-be reader. Let me phrase this idea in a more user-friendly way: 'heuristics' is a scary word with a simple aim: to define and manage only those variables that produce controllable results. In the baseball-catching analogy, this would be variable 1 (V1) = gaze-angle of the ball catcher and V2 = his or her position on the field. By contrast, the 'model' method is a harmless-sounding idea with complex aims: to assemble as many 'fundamental' variables as one can to predict results (i.e. optimize in your favour). In the baseball-catching analogy, this would be V1 = gravity, V2 = wind speed, V3 = ball mass, V4 = ball shape, V5 = launch angle, V6 = launch speed, V7 = initial player position...

Instinctively, which method would you choose?
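
To make the contrast concrete, here is a toy sketch of both methods in Python. Everything in it is my own assumption for illustration: no wind or drag, a fielder standing directly in line with the ball's flight, and one common formulation of the gaze heuristic (move so that the tangent of the gaze angle keeps rising at a steady rate). The function names are hypothetical.

```python
import math

G = 9.81  # gravity in m/s^2; used by the model and to simulate the ball


def ball_position(speed, angle_deg, t):
    """Simulated ball position (x, y) at time t, ignoring wind and drag."""
    theta = math.radians(angle_deg)
    x = speed * math.cos(theta) * t
    y = speed * math.sin(theta) * t - 0.5 * G * t * t
    return x, y


def model_landing_point(speed, angle_deg):
    """Model approach: compute the landing distance from launch variables."""
    theta = math.radians(angle_deg)
    return speed ** 2 * math.sin(2 * theta) / G


def heuristic_catch(speed, angle_deg, fielder_start, dt=0.01):
    """Heuristic approach: the fielder tracks only the gaze angle and moves
    so that tan(gaze angle) keeps rising at the rate first observed.
    (speed and angle_deg are passed in only to simulate the ball's flight.)"""
    bx, by = ball_position(speed, angle_deg, dt)
    rate = (by / (fielder_start - bx)) / dt  # first glimpse sets the rate
    fielder = fielder_start
    step = 2
    while True:
        t = step * dt
        bx, by = ball_position(speed, angle_deg, t)
        if by <= 0:  # ball has landed
            return fielder
        # Stand where the gaze angle matches the steadily rising target.
        fielder = bx + by / (rate * t)
        step += 1


predicted = model_landing_point(30, 45)  # needs speed, angle and gravity
reached = heuristic_catch(30, 45, 70)    # needs gaze angle and position only
print(round(predicted, 1), round(reached, 1))
```

In this idealized world the two agree on where the ball comes down. The difference is what each one had to know: the model needed the launch variables up front, while the fielder only had to keep watching and keep moving.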

Side note: I will stop using the phrase 'heuristic variable' because it sounds pretentious. From here onwards I dub it 'holistic measure' (HM), and 'model variable' becomes 'model measure' (MM).

Back to the thread: I outlined how complicated and impractical it would actually be to use a model approach in running. I wrote some muddled statements about how many variables could be wrung out of a 'simple' 10x400 meter workout. I listed about ten. The first few were simple, like "number of repeats", "rest time", "total mileage", "runner fatigue". My conclusion stumbled into the idea of how complex the 'real world' of running is, and that to account for all of these variables would be impossible: "Quantization of every variable using a bottom-up approach is clearly an insurmountable task." and "We have to know each athlete individually to account for these variables, and how each variable affects the others".

Something didn't sound right. Not that it felt wrong to say, but the conclusions were (as yet) unproductive. Pointing out complications without offering any solutions is a no-no, if avoidable. We already know the world is a complex place. We seek mental frameworks to help keep things simple (though oversimplification is equally bad if it leads to otherwise avoidable mistakes). That's why VO2 max and lactate thresholds are measured at all: scientists believe these quantities will simplify (but not oversimplify) athletic training. Coaches then get it in their mind to 'optimize' a runner using a mix of these measures for determining future success.

The day after writing the blog entry I was at practice doing a 10x500 meter workout. (Coincidence?) There I was jogging around Kent park when it finally struck me: I had mashed both kinds of variables (model and holistic) into one list without realizing it. The good news came quickly after that: I should be able to classify these variables into their respective categories, 'model-type' and 'holistic'. Without any further introduction, here are the model (MM) and holistic (HM) measures of 'success' filtered into their respective groupings. Here is my mental map:



Here is a legend to the chart:

Model world (MM):  

Most are self-explanatory: weight lift max is perhaps your maximum dead lift capability or your max number of push-ups. Heart rate and weight/body fat are easily obtained. 

MaxLaSS: the maximum lactate generated by an athlete at steady state. This is a sign of a good middle-distance to 10k runner, i.e. someone who does not build lactic acid throughout a race, hence can make a sprint-dash for the finish. It is less relevant as distances increase, and insignificant for sprinters (racing less than half a minute or so).

VO2 max: This is supposedly important for all 'cardio' athletes. Then again, writers like Tim Noakes find it both poorly correlated with race times and difficult to alter within a relevant time period (especially while attempting to hold all other important variables constant).

Genetics is the ultimate bottom-up measure, as you literally derive all your physical self from your DNA. Ironically, mapping the genome has been the least fruitful way yet to reveal athletic talent. An 'athlete gene' has not been found and probably never will be.

Holistic world (HM):

Race time: It's almost trivial to say, but the best predictor of future race times is a previous race time. The correlations are amazingly accurate, especially when predictions are tailored to an individual. Sadly this does not predict whether you will improve or not. That is where overall progression comes into play (which accounts for training times, etc.). Progression is hard to define as a single value, or even a group of values.
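
As a quick illustration of predicting one race time from another, here is a minimal sketch. It assumes one widely used rule of thumb, Riegel's formula T2 = T1 x (D2/D1)^1.06, which is my choice for the example and not the only method out there. The function names and the sample runner are hypothetical.

```python
def predict_race_time(known_time_s, known_dist_m, target_dist_m, exponent=1.06):
    """Scale a past race time to a new distance (Riegel-style power law).
    The 1.06 exponent is a population rule of thumb, not tailored to you."""
    return known_time_s * (target_dist_m / known_dist_m) ** exponent


def format_time(seconds):
    """Render a time in seconds as m:ss for readability."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes}:{secs:02d}"


# Hypothetical runner: a 20:00 5k, used to estimate a 10k.
ten_k = predict_race_time(20 * 60, 5000, 10000)
print(format_time(ten_k))
```

Tailoring to an individual would mean fitting the exponent to that runner's own race history instead of using the default. Note that nothing in this formula knows whether you are improving; it only translates one performance into another distance.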

Relative mileage: unlike absolute mileage, relative mileage is something only you and your coach know. Running 130 miles a week: is this too much or not enough? Are you running with a cold or in perfect health? On flat terrain or hilly? In a taper or a buildup phase? Are you a young athlete or an old one?

Your personality type, as far as running, is something an observant coach might know better than you do. Do you have a tendency to fade when others pass you? Do you let others start ahead because you don't like taking the lead? Do you need to be told to slow down or speed up in workouts? Are you too hard or easy on yourself after a defeat? No spreadsheet takes these into account but most people have a sense of what to expect.

Muscle burn: Although technically the same thing as lactic acid/proton concentration, the latter is normally measured only after the fact, in specific studies, not for each practice in real time. The moment you feel a burn in your muscles you know what it means. Science has helped straighten out why you feel a burn (and how long it takes to clear the body), but on any given training day only you will know precisely how far in and at what speed it strikes.

Mental state: this is the sum of what your nervous system tells you and what you tell your nervous system. Stress, hard mental exercise, personal life, post-race depression, etc. all play a role. Feeling low can affect your entire state of health and well-being. Feeling optimistic can pull you through some pretty horrific injuries. It's one of the most vital statistics a person can know, yet rarely appears anywhere on paper!

Hunger: No matter how much you think you ate calorie-wise, if you still feel hungry there's a pretty good chance your body needs more food. Sometimes your body can be wrong. But imagine if you thought you were eating enough (by your own calculations) but were actually short 100 calories a day. Over a year that adds up to a deficit of 36,500 calories. The result would eventually be catastrophic.

Relative age: In sport you can have old 25 year olds and young 40-somethings. It depends on their years of experience. A female 27-year-old gymnast might have been competing for over 20 years. That's a career. Meanwhile senior world-record holder Ed Whitlock began his competitive career after age 40.

------------

Discussion

I am not trying to bash sports scientists' use of MMs by separating their work from coaches' HMs. We best understand the world by using a combination of bottom-up and top-down methods. Clarity is the goal. Canova, for instance, introduces some fundamental human biology in his book Marathon Training. This part is kept separate from the actual training advice itself. It is principally written by a second author (physiologist Enrico Arcelli).

How did I choose to separate them? I define them so:

MMs: easy to quantify and mechanically separable
HMs: hard to quantify, interconnected, and detected by sensation.

Imagine rubber bands connecting each HM variable; move one and others shift too: an easy run will only feel easy until you feel tired. HMs are elastic while MMs are more free floating; changing your stride rate and/or rep # will not change your VO2 max, max heart rate, or body weight, at least not within a short time interval (I am open to suggestions/modifications to these definitions). HMs are intuitively distinct groups but hard to quantify (like recognizing a particular face in a crowd). Because MMs look better on paper than HMs, these are the preferred units of science. In theory enough MMs will explain all the HMs, but in practical terms we may never have enough MMs to properly predict performances. This is especially true for mentally-affected groupings (stress, melancholy) given our primitive understanding of the brain. Because of this incompleteness, as race day approaches HMs grow in importance while MMs diminish.

I added question marks to the model scheme because the list is incomplete. These are the as-yet undiscovered quantifiable measures of 'success'. Recent 'novel' measures include galvanic skin response (GSR) and calcaneus length (distance from ankle to Achilles tendon). No doubt there will be others.

Other categorizations might be used, such as those measures difficult to change (max/min heart rate, VO2 max, fast-twitch fibre type, genetics) and those easily altered (# of reps, calories burned, mileage). Some MMs fall in between (stride rate is a semi-conscious activity and MaxLaSS can be improved slightly with training). We might subdivide these MMs into 'external environmental' versus 'internal' variables, but the lines are blurred due to feedback cycles (to which category does an efficient stride belong?). This recursion issue is why the illustrated arrows are drawn bi-directional.

Bottom-up MMs certainly have uses: they are there to check our logic, to ensure our intuition is grounded in good science. For example, MMs aid in explaining why muscles feel sore and save us from using useless remedies (even though no one knows exactly why muscles feel sore after training). But can the cart be placed before the horse? Will sports science ever have enough MMs (i.e. variables we'll call V1, V2, V3,..., VN) to 'optimize' an athlete better than HMs? Steve Magness said this much:

"We can’t even name the exact characteristics that make up a great runner. The traditional triumvirate of VO2max, Running Economy, and Lactate Threshold do a worse job than simply increasing the speed of the treadmill until the runner falls off. There are so many different variables that go into running success that it’s laughable to think that we’ll find one great gene to explain it all. Maybe we would if our sole trigger in evolving was to win a race, but unfortunately that wasn’t the reason."

Notice he too mixes in a holistic term (running economy) with two MMs. Here we stand on either side of a gatekeeper who assembles variables like a chef introducing ingredients to a secret recipe. The chef is never seen and refuses to reveal his secrets.

HMs are finite in supply and can be the most efficient means of changing other HMs: practising runs by feel, to know when a muscle 'burn' will happen, to recognize an easy run pace, etc. will train you for an intuitive understanding of race pace. In theory anyone can run an interval, but only you know what that pace actually means. Training for that racing 'feel' can be done using many different NxM interval mixes, which is why you find so many training plans using any number of interval combinations (20x200, 10x400, 2x5k, 5x1k, ladders, fartleks ...). What works for you, works.

I also wonder whether it is possible or at least practical to speak the languages of both sides simultaneously. For instance imagine running high mileage for the sole purpose of decreasing your resting heart rate, or improving your VO2 max by running until tired (out of breath).

This is just the start of the story, and again where I will soon end. This post perhaps clarifies my mental map of what divides our methods of measurement. Eternal optimists are undaunted by the task of collecting enough MMs: they try to include as many as possible to produce the desired HMs. I'll leave you with this final idea: including more fundamental-type MMs may actually lead to worse predictions! The variable collector's plan (amassing all the MMs he/she can grab ahold of) could in fact be a losing battle. Below is a short explanation of when to discard such impressively complicated models, and what got me started on this train of thought. Watch for the part where worse predictions of London weather come from using more complex models: