Monday, October 26, 2009

Analysing Formula 1

It's well known that the greatest number of Grand Prix winners in a single season is eleven, a number attained during the epic and tragic 1982 season. Did you know, however, that the second greatest number of winners in a single season is nine, from the 1975 season?

In fact, in the history of Formula 1, only six seasons have featured eight or more winners, and all but one of those occurred between 1975 and 1985 (the subsequent exception being 2003). On the basis of this fact alone, one might argue that the years between 1975 and 1985 define the most competitive era in the sport's history. It also raises all sorts of questions about the conditions which led to such a competitive environment, and why they have so rarely prevailed since.

These facts are gleaned from Roger Smith's colourful 2008 tome on the statistics of Formula 1, which is now available in paperback. It's well worth a purchase, for this was clearly a labour of love for Smith. Of particular interest is the book's concluding chapter, where Smith expounds the results of a rating system which enables all the champion drivers to be ranked, irrespective of the eras in which they raced.

The basic performance indicator chosen by Smith is a driver's strike rate, the number of Grand Prix victories as a fraction of races contested. After ranking the drivers by strike rate, Smith then attempts to adjust the ranking to compensate for the superiority of the equipment at a driver's disposal, and the strength of the driving competition he faced. Sadly, Smith doesn't 'show his working' here, but he does explain that the superiority of a driver's equipment in any particular year can be estimated by factors such as: the absolute share of wins achieved by the driver's team; the number of 1-2s; the number of victories by the driver's team-mate; and the share of wins relative to the second most successful team that year. How Smith disentangles the strength of the driving competition from the strength of the equipment available to the competition is unclear, but the upshot is a rating system which places Fangio first, Clark second, and Schumacher third.

A first objection is that it's difficult to argue with Smith's reasoning without being able to see his detailed calculations. In addition, however, there is a serious omission which underlies Smith's rating system, and it is an error which is committed by every published attempt to rank the all-time greats. It is the failure to adjust for the size of the competitive pool, and the fact that the pool has been steadily growing in size since the inception of the Formula 1 World Championship.

In the 1950s and 1960s, only a relatively small number of people were competing in single-seater motorsport, and there were only a small number of formulae. The number competing at all levels of the sport has increased massively from the 1970s and 1980s onwards. Developing hand-in-hand with this has been the proliferation of the different junior formulae, all arranged in a pyramidal structure, filtering out the best drivers at each stage (in theory!), and feeding them towards the world of Formula 1, located at the tip of the pyramid. At the base of the pyramid is the immensely competitive world of kart racing, into which thousands of children across the world are now inducted every year at an early age, to begin learning the craft of the racing driver.

As a general principle of performance statistics, all other things being equal, the best person from a large competitive pool is likely to be better than the best person from a small competitive pool. Aphoristically, it's easy to be a big fish in a small pond, and the best drivers of the 1950s and 1960s were essentially just that. The Fangios and the Clarks were the tips of very small pyramids, whilst the Sennas, Schumachers and Hamiltons are the tips of very large pyramids.
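To put a rough number on this principle, here is a minimal sketch in Python. It rests on an assumption that is mine rather than Smith's: that driver talent is normally distributed, and that each pool is an independent random sample from that distribution. The function name `median_best_of` is purely illustrative. Under that assumption, the chance that the best of n drivers falls below a talent level x is the normal cumulative probability of x raised to the power n, so the median talent of the pool's best driver can be read straight off the inverse of that function.

```python
from statistics import NormalDist


def median_best_of(pool_size: int) -> float:
    """Median talent of the best driver in a pool of `pool_size` drivers.

    Talent is measured in standard deviations above the average driver.
    ASSUMPTION: talent is normally distributed and drivers are independent
    draws; this is an illustrative model, not something taken from the book.
    """
    # P(best <= x) = Phi(x) ** n, so the median of the best solves
    # Phi(x) = 0.5 ** (1 / n).
    return NormalDist().inv_cdf(0.5 ** (1.0 / pool_size))


if __name__ == "__main__":
    # Hypothetical pool sizes, chosen only to show the trend.
    for pool in (100, 10_000, 1_000_000):
        print(f"{pool:>9,} drivers: best is typically "
              f"{median_best_of(pool):.2f} standard deviations above average")
```

The typical level of the best driver rises with every increase in pool size, albeit slowly, which is all the big-fish-in-a-small-pond argument needs.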

As a comparison, consider American single-seater racing. This is a much smaller competitive pool than the hierarchy of single-seater formulae in the rest of the world, which feeds into Grand Prix racing. Thus, it is easy for a driver such as Al Unser Jnr or Michael Andretti to look devastating in Indycar racing, but to fail badly when they attempt the transition to Grand Prix racing. The best Formula 1 drivers of the 1950s and 1960s are comparable to the best drivers in American single-seater racing. It's quite possible that the best driver ever could have raced in American single-seater racing, and it's still quite possible that Fangio or Clark was actually the best driver the world has ever seen, but on the basis of statistics alone, adjustment for the different sizes of the competitive pools militates against this conclusion.

The best drivers in the world effectively lie in the tail-end of the distribution of driver talent, and as a general statistical rule, unless you take very large sample sizes, you're unlikely to be sampling from the tail-end of a distribution. For example, if the frequency of great drivers (those in the tail-end of the talent distribution) is 1-in-100,000, then at a time when there are only, say, 100 drivers in the world, the chance that at least one of them will be a great driver is only about 1-in-1,000. Hence, it's highly unlikely (but not impossible) that the best driver the world has ever seen was part of the small sample of Grand Prix drivers found in the 1950s and 1960s.
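The arithmetic behind that figure can be checked with a few lines of Python. This is only a sketch: it assumes each driver is an independent draw from the talent distribution, and the 1-in-100,000 frequency is the illustrative number used above, not a measured one. The function name `chance_of_at_least_one` and the larger pool sizes are my own additions for comparison.

```python
def chance_of_at_least_one(frequency: float, pool_size: int) -> float:
    """Probability that a pool contains at least one 'great' driver.

    ASSUMPTION: each driver is independently 'great' with probability
    `frequency`, so the chance that nobody in the pool is great is
    (1 - frequency) ** pool_size.
    """
    return 1.0 - (1.0 - frequency) ** pool_size


if __name__ == "__main__":
    FREQUENCY = 1 / 100_000  # illustrative figure from the text above
    # Pool sizes beyond 100 are hypothetical, included to show the trend.
    for pool in (100, 10_000, 100_000, 1_000_000):
        p = chance_of_at_least_one(FREQUENCY, pool)
        print(f"{pool:>9,} drivers: {p:.4%} chance of at least one great")
```

With 100 drivers the chance comes out at very nearly 1-in-1,000, as stated. With a pool of 100,000 it rises to roughly 63%, which is why the size of the pool matters so much to the argument.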

All of which is maybe a way of saying that statistics alone cannot be used to support or refute the subjective appreciation of drivers, made by observers in the same era to which the drivers belong.


Patrick said...

Years ago I remember getting a fair bit of flak on 'The Nostalgia Forum' for using exactly that line of reasoning to argue that Gaston Mazzacane *might* have been a better driver than Fangio.

I suppose the counter-argument might be that the cars were intrinsically more difficult to drive back then and enabled, even forced, drivers to develop their talents to a degree that modern F1 cars simply don't require. That said, I've a suspicion that the degree to which F1 cars of the 50s and 60s were more difficult to drive has been overstated.

Gordon McCabe said...

There's also a strong counter-argument that the scale of the challenge, due to the risk of death or injury, was greater in the 1950s and 1960s.

In fact, there are two competing factors at work here: whilst the size of the competitive pool has been increasing since the inception of the championship, the scale of the challenge has been decreasing.

Now, you might argue that the scale of the challenge is actually irrelevant to the issue of driver talent. The scale of the challenge makes the achievements of drivers in the 1950s and 1960s more profound, but one might argue that it doesn't change the level of talent in play. No matter how onerous the challenge posed to a small sample, that small sample is unlikely to sample the tail-end of the talent distribution.

However, this line of argument might be based upon a false understanding of what talent is. If talent is not merely something which is 'God-given' (i.e., the result of DNA), but something which can develop in response to the level of the challenge posed, then a greater level of challenge could indeed create a more talented set of drivers.

In addition, whether or not you factor the scale of the challenge into an assessment of the greatness of drivers depends upon whether you think greatness is determined by talent alone, or by a combination of talent and achievement. The greater the challenge, the greater the achievement in surmounting that challenge, hence the achievements of drivers in the 1950s and 1960s cannot be matched by drivers in the modern era.