Mark Jarvis
- Jun 16, 2023

Selection Pressure: Evolution in Roster Construction and Player Outcomes (Part 3)

Due to its length, this article has been broken into three separate posts. You can find part 1 here and part 2 here.

Table of Contents

Part 1

1. A Few Thoughts

An introduction.

2. The Most Boring Game

Using a hypothetical league’s field dimensions and player archetypes to illustrate how the environment shapes those within a system.

3. Less Boring Games

Real life examples of different leagues and the subsequent changes in the game that arise from their varying rules.

4. Changing Chess Boards

Illustrating the importance of field dimensions by showing changes in a chess game if the board were to change.

5. What is Evolution?

A brief definition of evolution and its application to the rest of the article.

6. Fitness and Environmental Pressures

Defining fitness in the evolutionary sense through beetles.

7. Beaks and Population Shifts

Showcasing types of selection through the evolution of a finch population in the Galapagos.

8. The NFL’s Archean Eon

Reflections on the league's history, both in its rules and players.

9. The Small Supply of Big People

On the implications of a scarce supply of players with certain physical attributes.

Part 2

10. Rat Islands and the Species-Area Relationship

What island biogeography tells us about diversity and niches.

11. 53 Forms Most Beautiful

On adaptive radiation and the development of new species.

12. By Generation Not Time

Viewing evolution through the lens of generations and mutations rather than by days, months, or years.

13. Replaying the Tape of Time

How small differences in the past can create vastly different outcomes in the present.

14. Potters and Pigeon Fanciers

Artificial selection and its comparability to team selection processes.

15. Marble Racing, the Matthew Effect, and Genetic Drift

How cumulative advantage and randomness influence the population of players.

Part 3

16. Correlated Traits and Spandrels

On measurements that relate to each other and their value in success.

17. Fitness Landscapes

Using the metaphor of fitness landscapes to map player populations.

18. Going the Way of the Dodo

On the extinction of certain types of players.

19. Peacocks, Oxpeckers, and the El Farol Bar

The league as a complex adaptive system and the difficulty of understanding the whole through its parts.

20. Application and Closing Thoughts

Putting these ideas into practice, along with some closing reflections.

Correlated Traits and Spandrels

Spandrels on the Thomas Jefferson Building

On many older buildings with a classical form, space is treated as a canvas to be filled by the master. Even for those who lack a true craftsman to sculpt beautiful figures or patterns, there is still an emphasis on occupying the available space with something that is aesthetically pleasing. These are spandrels, a way of bridging the sides and tops of an arch.

Are they there because the building was constructed to place a beautiful design in that location, or are they there because there was empty space and something could or had to be placed there? This is the question that struck Stephen Jay Gould in regards to viewing all evolution as strictly adaptationist (as in, everything is an adaptation).

In a pure adaptationist sense, someone could look at those beautiful designs and reason backwards towards some meaningful purpose for the spandrel. “It is a necessity of the building to have this aesthetic! The building simply wouldn’t be complete without it!”

But this type of backwards reasoning is subject to all sorts of limitations, and in the case of spandrels it would be incorrect. The building was not designed by an architect with small adornments along the arches in mind, but rather the adornments simply came to occupy a given area. Here’s a quote from Gould in his paper The Spandrels of San Marco and the Panglossian Paradigm: A Critique of the Adaptationist Programme. Bolded for emphasis.

"The design is so elaborate, harmonious, and purposeful that we are tempted to view it as the starting point of any analysis, as the cause in some sense of the surrounding architecture. But this would invert the proper path of analysis. The system begins with an architectural constraint: the necessary four spandrels and their tapering triangular form. They provide a space in which the mosaicists worked; they set the quadripartite symmetry of the dome above.

Such architectural constraints abound, and we find them easy to understand because we do not impose our biological biases upon them. Every fan-vaulted ceiling must have a series of open spaces along the midline of the vault, where the sides of the fans intersect between the pillars. Since the spaces must exist, they are often used for ingenious ornamental effect. In King's College Chapel in Cambridge, for example, the spaces contain bosses alternately embellished with the Tudor rose and portcullis. In a sense, this design represents an "adaptation," but the architectural constraint is clearly primary. The spaces arise as a necessary by-product of fan vaulting; their appropriate use is a secondary effect. Anyone who tried to argue that the structure exists because the alternation of rose and portcullis makes so much sense in a Tudor chapel would be inviting the same ridicule that Voltaire heaped on Dr. Pangloss: "Things cannot be other than they are... Everything is made for the best purpose. Our noses were made to carry spectacles, so we have spectacles. Legs were clearly intended for breeches, and we wear them." Yet evolutionary biologists, in their tendency to focus exclusively on immediate adaptation to local conditions, do tend to ignore architectural constraints and perform just such an inversion of explanation."

Stanford professor Robert Sapolsky does an excellent job of illustrating the concept as well, perhaps in a way that will make more sense than in the prose of Gould.

In the simplest terms, the idea is that selection pressures do not drive every single feature or trait to be an adaptation, but rather that some features and traits are simply carried along by other, more important traits. In a population of a finite number of players with certain structural limitations imposed by human biology, is it unreasonable to expect similar spandrels to be carried over during the selection process of players?

When I first came across this idea, the player that came to my mind was Clemson offensive tackle/guard Jordan McFadden, a recent fifth round pick who falls far below height standards at 6’2 ¼” (6022) but has surprising length for his body type with 34” (3400) arms. McFadden’s resume and measurements, height aside, all seem to pass the offensive tackle test. He was a three year starter at tackle for a top school. He has a basketball background. He ran a 4.99 in the 40-yard dash, although his other testing numbers were middle of the pack. He is on the lighter side for an offensive lineman, but the one glaring question for his projection is that his height sits in the 1st percentile for an offensive tackle.

Jordan McFadden RAS, courtesy of Kent Lee Platte’s RAS scores.

Look at almost any scouting report from internet draftniks and you’ll see the mention of potentially moving inside to guard. Perhaps the size is good reasoning to move him to the interior. Perhaps there are elements to his game that are affected by his body type (and the body types of shorter linemen in general) that make it harder to succeed in that role. It may be that shorter legs or a squatty frame make it harder to get depth and preserve the pocket.

But how closely correlated is height to length, and are we substituting the importance of one attribute (length) by looking at another one (height) that usually goes along with its success?

I took the 289 offensive linemen from the 2022 draft class and calculated the correlation coefficient between height and arm length to see how closely correlated the measurements are. If I were to end up with r = 1, it’d mean that the shortest player has the shortest arms, the tallest the longest, and everything in between is in a perfectly linear order for those two measurements. r = -1 would be the exact opposite, with the shortest players having the longest arms and the tallest having the shortest. r = 0 would mean they are not correlated at all and have no apparent relationship.

I ended up with r = 0.5835, suggesting a moderate correlation between the height and length of offensive linemen, at least for the 2022 class. That is to say that, as expected, there is some relationship between the height of an individual and their ability to reach the top shelf in the grocery store, or rather the chest of an opposing lineman.
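For anyone who wants to run the same kind of check on their own data, here is a minimal Python sketch of the calculation. The heights and arm lengths in it are made-up numbers for illustration only, not the 2022 class, and the helper name `pearson_r` is an assumption. It is a sketch of the standard Pearson formula rather than the exact spreadsheet or tool used for the number above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of the products of each pair's deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each measurement
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical measurements in inches (NOT real prospects)
heights = [74.25, 76.0, 77.5, 75.0, 78.0, 76.5]
arms = [34.0, 33.0, 34.5, 32.5, 35.0, 33.5]

r = pearson_r(heights, arms)
print(f"r = {r:.4f}, r squared = {r * r:.4f}")
```

Squaring r is a useful sanity check on what "moderate" means. With r = 0.5835, r squared is roughly 0.34, so in a simple linear sense height accounts for about a third of the variation in arm length for that group, which leaves plenty of room for a short tackle with long arms.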

Let’s sub out McFadden for a made-up player here to run a thought experiment. Suppose we were to take a player who is a prototypical left tackle in almost every way. He is a great athlete, he has 34” arms, he is 315 pounds and plays like it, and his technique is as crisp as it comes for a college offensive tackle. We make two copies of him and change one thing.

One version of him is 6’5”. The other version of him is 6’2”. Which one are you going to pick if forced to choose between them?

Why did you choose what you chose? Is there any real difference between the “fitness” of the 6’2” version and the 6’5” version when everything else stays equal? Intuitively, I leaned towards the 6’5” version when I ran this thought experiment. I don’t know why. Perhaps it is the deeply ingrained “prototype” of the tall and long offensive tackle who looks the part of the many successful tackles in the past. Even if the shorter version were able to attain better leverage or something along those lines I would still feel like I’m poking the bear by selecting a short offensive tackle when so few exist within the system.

If you were to ask me which matters more, height or length, I would answer length without any pause. It is the more functionally important of the two, in my opinion. If you were to give me a 6’8” lineman with short arms or a 6’2” lineman with long arms I would take the 6’2” lineman every time. But there’s something that seems risk seeking or dangerous about taking a short offensive tackle or another outlier of the sort, even if careful reflection doesn’t suggest an impact on their overall fitness.

Perhaps a more concrete example of where these spandrels could mislead us would be to look at quarterback height again. Let’s assume that our goal is to select quarterbacks who see the field well (because seeing the field helps to make better decisions, improves accuracy, etc). How closely correlated is seeing the field well to the height of a quarterback? Intuitively it makes sense that a taller quarterback would have a better field of vision than a shorter quarterback, but when we rely on this intuition we may be replacing a more important question (how well does this quarterback see the field?) with a different one that is correlated but not an exact match (how tall is this quarterback?).

Perhaps the solution is no more complex than simply being deliberate about what we are selecting for and why, but understanding how closely related measurements and traits are to each other could be useful for getting a better picture of the situation. If we don’t know how they relate to each other we run the risk of perceiving spandrels as core structural elements rather than selection hitchhikers.

Fitness Landscapes

In 1932, geneticist Sewall Wright had a brilliant idea for how to show the impact of two different genotypes or phenotypes on the relative fitness of any given population within an environment. He created a fitness landscape - a map with peaks and valleys indicating how relatively well adapted any individual would be to their environment based on those two different dimensions. You can visualize this as a topographical map with each axis indicating something, for instance size and speed. Perhaps in one niche speed is prioritized while another niche would lead to prioritizing size, both of which would lead to vastly different fitness landscapes.

Wright’s original fitness landscape, with the + and - indicating peaks and valleys, along with a modern fitness landscape with a heat map indicating fitness.

Here is an example of two traits mapped onto this fitness landscape, along with the fitness peak’s spot between the two of them. The less of trait 1 and trait 2, the worse any individual would perform. A strong dose of both would be better, although a fitness peak may not always be the maximum of both traits. For instance, we may think that more size and speed is always a good thing, but that may not capture the risk of injury or something along those lines. A cheetah that can run 150 MPH may be a remarkable sight, but it would break its legs and die at the first attempt to change direction.

Remember that we are only visualizing two dimensions among the many hundreds of dimensions we could choose, which creates some limitations.

A non-football example of this that may make more sense than the vague fitness landscape maps shown above would be something like the fitness of any given driving route I could take to work. I have three possible routes to work (at least ones that aren’t plain wasteful on gas), so let’s map them on the fitness landscape.

Suppose my goal (what I’m selecting for) is getting to work as quickly as possible, as being slow will lead to being late for work. I am going to leave at 6:20 AM each day and have to be at work by 7:00 AM. Let’s suppose I am going to map my fitness like this - a label of 1 as my fitness grade would mean I always get to work on time no matter what. A label of 0 would mean that I never get to work on time no matter what. Anything in between there is some variation of being habitually late or steadily on time. Let’s assume 0.9 means getting there on time 90% of the time, 0.8 is 80%, etc.

This is our fitness map. I will never be guaranteed to always be on time if I leave at 6:20 AM, as wrecks, trains, and other sorts of unexpected occurrences can ruin my perfect attendance. However, there is a clear path for how to maximize my odds of getting to work on time. A higher speed limit means that I can go faster and going faster means a better chance of getting there on time. Phrased differently, a higher speed limit means a higher fitness. On the other hand, increased traffic leads to a decrease in fitness. It doesn’t matter if the speed limit is 10 MPH or 100 MPH, if I’m stuck in a traffic standstill I’m probably going to be late for work either way.

I can then take different pathways to work and represent their fitness on this map. To achieve the absolute maximum fitness (top left - 0.9), I would need to basically have an unoccupied highway straight from my house to work. Because of the way the system is built I have no way of achieving that. Instead I will have to settle between a few different possible routes. Here they are.

Route A: This route is taking the backroads. The lower speed limits lead to being stuck in a 45 MPH cruise rather than reaching 55 or 60 MPH, but the scarce traffic helps to offset that and still lead to a decent fitness score.

Route B: This route is taking the main road, which has a few long stretches that allow for getting up to 60 MPH and covering a lot of ground in a hurry. The slightly higher traffic, particularly with the workforce of a local brewery coming through, leads to the same relative fitness as route A.

Route C: This route is taking a path downtown, which has some of the lowest speeds and highest traffic of any possible route. While there is a stretch that gets onto the highway, the majority of the time spent in town leads to consistently being late when leaving at 6:20 AM. This is clearly the least appealing of the options (and I haven’t taken this route in months).

Personally, I like taking route B because the traffic is tolerable and generally everyone is in a slight hurry, so the speed is a little bit faster than the limit most of the time. But most of the time, route A and route B would provide similar results.
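If you wanted to put actual numbers on this instead of eyeballing it, the fitness grade defined earlier is just the share of trips that arrive on time. Here is a minimal Python sketch under that assumption. The commute logs are invented to match the story (routes A and B roughly equal, route C usually late), and the names `route_fitness` and `TIME_BUDGET_MIN` are hypothetical.

```python
TIME_BUDGET_MIN = 40  # leave at 6:20 AM, due at work by 7:00 AM


def route_fitness(commute_minutes, budget=TIME_BUDGET_MIN):
    """Share of logged trips that arrived within the time budget (0.0 to 1.0)."""
    if not commute_minutes:
        raise ValueError("need at least one logged trip")
    on_time = sum(1 for minutes in commute_minutes if minutes <= budget)
    return on_time / len(commute_minutes)


# Hypothetical door-to-door times in minutes for ten trips per route (invented data)
logs = {
    "A": [36, 38, 37, 41, 35, 39, 43, 36, 37, 40],  # backroads
    "B": [33, 35, 42, 34, 36, 44, 33, 35, 37, 38],  # main road
    "C": [43, 45, 39, 47, 44, 41, 46, 42, 40, 48],  # downtown
}

fitness = {route: route_fitness(times) for route, times in logs.items()}
for route, score in fitness.items():
    print(f"Route {route}: fitness {score:.1f}")
```

With these invented logs, routes A and B both come out at 0.8 and route C at 0.2. The point is not the specific numbers but that "fitness" here is something you can measure against a stated goal, and that changing the goal (say, a 6:50 AM deadline) would change every score without any route changing at all.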

This idea of fitness landscapes can be mapped to many things, but to take them away from an obscure driving example let’s shift back to the usage of them in understanding populations. Let’s take a look at the 2022 wide receiver crop in two dimensions - speed (40-yard dash) and size (weight) to try to figure out the curves of the fitness map. Green dots indicate a player who was drafted, yellow were UDFAs, orange were tryouts, and reds did not receive an NFL opportunity.

It’s a pretty common sense result and one that might lead someone to ask “why do these fitness curves matter at all?” which is a pretty fair question. If we wanted to map the fitness of players, we’d draw a curve somewhere around the sub 4.4 range to indicate that the most fit players are incredibly fast and that weight isn’t as big of a deal. We’d probably draw another curve slightly above that range indicating the rest of the draftables, mostly between 170 and 225 pounds and sub 4.5 in the 40-yard dash. Another curve would catch our UDFAs and tryouts who share similar weights but are slightly slower. Our last curve would cover our rejects, who are too deficient in speed. Our fitness map could go something along these lines, with the red peak being the highest chance of being drafted and everything below gradually fading away from that.

There are two concepts that make understanding fitness landscapes useful beyond simply using the map metaphor for studying the distribution of a population relative to two traits. Firstly, the idea of dynamic or shifting fitness landscapes helps us to look at the challenge as one of pinning the tail on the metaphorical fitness donkey as it meanders around to new optimums. Secondly, the idea of local and global optimums helps us to ask whether or not we are aiming for a target that is truly the best thing within the system.

Let’s take a look at the dynamic or shifting landscape example. You’ve seen my preferences for taking different routes to work, but those preferences aren’t always static. When the winter weather is really bad, the backroads become too dangerous to warrant driving on. Route A goes from an equally suitable choice as route B to being a slow crawl along curvy hills that could easily lead to a ditch if I don’t take my time and drive carefully. Here is an example of what that new fitness landscape would look like in the winter during bad weather, along with an example of a dynamic fitness landscape moving from one point to another as conditions change.

Source: Randy Olson and Bjørn Østman

There are a number of things that would dramatically change the fitness landscape that I just showed you for leaving for work and trying to arrive as quickly as possible. The departure time determines the amount of traffic, so if I leave at 4:00 AM instead of 6:20 AM I would have lower traffic and higher fitness with route B (it would move left into the orange area). The decisions of other drivers also matter a lot. If every driver were to suddenly get the urge to take route A (perhaps to avoid heavy traffic), the added traffic would shift the landscape to the point where route A is no longer preferable.

These conditions also have a level of day to day variation that cannot be effectively mapped. Something that isn’t present on the fitness landscape (say a wreck or train blocking traffic) would change the fitness of each route in a one-time event that wouldn’t transfer to the next day. Traffic jams are notoriously hard to predict given the complexity of the system and each driver has unique adaptive decision rules that would lead the pure “fitness grade/score” of any route to change every day and every minute of the drive. Think of it like the shifting of live betting lines in the middle of a game.

In the application to scouting and team building, we have to remember that the environment is constantly shifting, even if not by leaps and bounds. There is also a large dependence on the interaction with all the other individuals within the system. If everyone goes towards smaller, speed-oriented wide receivers, there is a cost to being late to adapt and trying to match them with cornerbacks who are too big and too slow to cover as effectively. The change in kickoff rules to make all fair catches come out to the 25-yard line will change the fitness landscape for any player who has a special teams role or a spot on the roster that is impacted by it.

Is this new kickoff rule bad for the game? Maybe. Maybe not. The important part for us is understanding how teams will use this to adjust their special teams units and how quickly we can adapt to maximize the fitness of any player who is playing a special teams role or occupying a roster spot.

The other aspect to fitness landscapes that we can use to question our assumptions about player populations is the idea of local and global optimums.

Assume we start our population at some point on the fitness map. In every direction, we will face some possible impact on our overall fitness. Perhaps we have very low fitness and moving left, up, or down would keep it the same, but moving right would increase it. We move right. We face a new situation where moving left (returning to our start point) reduces our fitness, moving up or down keeps it the same, and moving right again increases it. We move right again.

Are we getting to the highest possible peak on our fitness map? It makes sense within the confines of our options to continue moving right, but we are strictly moving towards the highest peak around us rather than exploring the landscape as a whole and trying to understand our map in totality. Here is how this situation could look if we made a fitness map where we have pressures pushing us towards a local peak (local optimum) rather than a global one.

In this scenario, we won’t know about the existence of this global optimum far away from our current population because the “fitness valley” separating us from it makes it seem like a fool’s errand to travel that way rather than take the immediate move upwards in fitness. From our perspective the global optimum could simply be a mirage, an optical illusion of a desert oasis that would kill us if we pursued it, but we can’t know either way unless we are willing to take the pain of crossing the fitness valley. This is the same valley a team has to cross when they take a player who is abnormally far into the outlier category. The tiny defensive tackle with rare athletic ability may be on the equivalent of a distant uncharted island and all our map says is “here there be dragons”.
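The step-by-step climb described above is essentially what is called greedy hill climbing, and a short Python sketch shows how it gets stuck. To keep it readable this uses a one-dimensional landscape instead of a two-dimensional map. The fitness values are invented, and `hill_climb` is a hypothetical helper, so treat it as an illustration of the idea and not a model of any real population.

```python
def hill_climb(landscape, start):
    """Greedy search: step to the better neighbor until no neighbor improves."""
    position = start
    while True:
        neighbors = [i for i in (position - 1, position + 1)
                     if 0 <= i < len(landscape)]
        if not neighbors:
            return position
        best = max(neighbors, key=lambda i: landscape[i])
        # Stop as soon as every available move is flat or downhill
        if landscape[best] <= landscape[position]:
            return position
        position = best


# Hypothetical fitness values: a small peak at index 3, a valley at index 5,
# and a taller peak at index 8 on the far side of the valley
landscape = [0.1, 0.2, 0.4, 0.5, 0.3, 0.2, 0.4, 0.7, 0.9, 0.6]

local_peak = hill_climb(landscape, start=0)
global_peak = max(range(len(landscape)), key=lambda i: landscape[i])

print(f"Climbing from the left edge stops at index {local_peak} "
      f"(fitness {landscape[local_peak]})")
print(f"The highest point is index {global_peak} "
      f"(fitness {landscape[global_peak]})")
```

Starting from the left edge, the climber stops at index 3 with a fitness of 0.5 and never learns that 0.9 exists at index 8, because every route there starts with a step down. A climber that happened to start at index 5, in the valley, would reach the taller peak without any extra cleverness, which is another way of saying that where a population starts matters as much as how it moves.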

This isn’t much different from the idea of exploration vs. exploitation, which is the question of whether to optimize based on known information (what we know about our local fitness peak) or to explore in pursuit of alternatives that may provide a more fruitful solution (seeking a global fitness peak). There’s no right or wrong way to approach the problem, but there are two things inherent within the system that make it hard to take the exploration route rather than the exploitation one.

First, anyone in any position of real power to make decisions and influence players is putting their job on the line in going out and trying assumed “lower fitness” strategies in seeking one that may actually be higher fitness. It would be a risky affair and one that could very easily not pan out. For most people that simply isn’t something worth trying to do. The cost of exploration is potential death, while the payoff is a chance at a slightly higher peak.

Second, there is a large degree of lock-in that occurs as you climb from one level of the system to the next. A high school coach has a wide variety of athletes, both because they are subject to a wider range in quality and because they are pulling directly from the floor of the population with minimal prior selection. As you climb the ranks to college or pro levels there is previous selection changing the shape of the population. There may be a few small athletic defensive tackles in the college ranks, but earlier selection pressures have reshaped the population in a way that may make exploration harder for those further down the line.
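One common way to formalize the exploration vs. exploitation trade-off is an "epsilon-greedy" rule: most of the time you take the option you currently believe is best, and a small fraction of the time you try something else on purpose. The Python sketch below is only an illustration of that rule. The player types and their estimated success rates are invented, the function name `epsilon_greedy_choice` is hypothetical, and nothing here is meant to suggest teams actually decide this way.

```python
import random


def epsilon_greedy_choice(estimates, epsilon, rng):
    """Explore a random option with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        # Explore: any option, including ones we currently rate poorly
        return rng.choice(sorted(estimates))
    # Exploit: the option with the best current estimate
    return max(estimates, key=estimates.get)


# Hypothetical beliefs about how often each type of pick works out (invented)
estimates = {
    "prototype tackle": 0.60,
    "short tackle with long arms": 0.45,
    "undersized defensive tackle": 0.40,
}

rng = random.Random(0)  # fixed seed so the run is repeatable
picks = [epsilon_greedy_choice(estimates, epsilon=0.1, rng=rng)
         for _ in range(1000)]
share_explored = sum(p != "prototype tackle" for p in picks) / len(picks)
print(f"Share of picks away from the prototype: {share_explored:.3f}")
```

With epsilon set to 0 this rule never leaves the local peak, and the estimates for the other types never get a chance to be corrected. The two obstacles described above amount to pressure that pushes epsilon towards zero: the person making the pick pays for every failed experiment, and the population they are choosing from has already been filtered by everyone upstream.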

Going the Way of the Dodo

“Life is a copiously branching bush, continually pruned by the grim reaper of extinction, not a ladder of predictable progress.”

– Stephen Jay Gould, Wonderful Life

If we were to take the IUCN Red List of Threatened Species scale and map it to player types, where would all of our types fall? The old school kicker-lineman hybrids of the 1950s would be extinct. The slow linebacker is endangered, clinging to a role as a run stuffer that is slowly being chipped away by the continued expansion of the passing game. And perhaps one of the most glorified positions of all, the fullback, is critically endangered.

Even in the modern form, the fullback has shifted away from being the violent run game duelist who would blast open lanes towards a hybrid multi-positional player who moves around and brings unique athletic traits that old school types often lacked. We are talking about a type of player who is fundamentally different from what the position was 20 years ago.

The old fullback was perfectly optimized for his environment. He was a marvel of power, physicality, and sheer will that would be hard to find in many players in today’s game. A player like Lorenzo Neal, an imposing 5’11”, 255-pound blocker who could send linebackers backwards in a daze, embodied what it meant to be a blocking fullback during the 2000s. These players were at a peak, a high level of play that suited their niche wonderfully. What happened?

Here’s Gould discussing what can happen to even the most perfectly optimized species.

It’s worth reiterating a point made near the beginning of this article. It is the environment that determines success or failure of a type or individual. Sometimes the environment moves in an unforeseeable way, a probabilistic dip or dive caused by small historical contingencies piling up on each other. Sometimes the environment moves because of a small tip in selective pressures changing the long term path of a species. And sometimes it faces a catastrophic event, a singular change that essentially erases all species who find themselves in the unfortunate position to be erased, the equivalent of a pedestrian walking in front of nature’s car.

The extinction of an older type of the fullback is no different than building a masterpiece of a diner or motel along Route 66 just as the interstate passes you by and puts you out of business. It would be through no fault of your own that you aren’t suited to compete in the new environment, but rather simply that the change is larger and more encompassing for the entire system than anything you could have done.

One of the common misperceptions about extinction and the development of certain species is the idea of a form of progression, a “tree of life” that improves every step of the way towards some optimal state where the highest branches are the beautiful and inevitable amalgamation of all that came before, and that all the forms of the past were somehow leading to this very moment in an inexorable way, an unstoppable march towards something uniquely special. But any replay of the tape with minor alterations can change lineages, removing current species in favor of those they beat out.

The famed flightless bird who is the origin of the saying “going the way of the dodo” was no more destined to die than the old type fullback was. It did not die to give rise to “higher level” creatures that we think of when we imagine the top of a tree of life. Rather, it died due to a change in the environment that could just as easily not have happened if the course of human history took a different turn. Consider this excerpt from a National Geographic article on the demise of the bird.

“No single cause drove the dodo into extinction. Humans hunted the naive birds, of course, but the rats, cats, pigs, and other animals that we brought along with us were just as destructive. The extinction of the dodo was not simply a matter of systematic extermination. Our species created a major ecological disruption that many unique island species could not cope with.”

In Wonderful Life, Gould created a model to showcase a more accurate representation of the paths of species as they pass through evolutionary time. A large part of the process is the death of different types, many of which would go unobserved if you were to look backwards from the present with only the existing types in mind. Whether by randomness or selection, there is a big graveyard of varying species for every one that we see today. It is not a failure of a species, but rather a shared characteristic of almost any of them.

The path of any type of player surviving or branching off in the complex NFL ecosystem is tricky. On one hand, it can be assumed that any survivor or new type that has passed through a given time is particularly suited for the environment. But there is also human intervention, the stubborn veteran coach or scout who believes “this is how the game should be played” and imposes their ideals and mental models onto a system that punishes them for it.

It is my view that, as far as the game goes, extinction is no different in terms of its usefulness and fascinating qualities than the development of new types. The death of the blocking fullback is not some grand indictment of the standard of the game as it exists today. It is simply part of the inevitable cycle of evolution, the replacement of old forms with new forms, and the continued drive towards higher fitness in a constantly changing environment.

There’s no shame in an older form losing out to a newer form. The new form plays with the equivalent of a home field advantage - it has been molded by the current environment. The extinction of those old forms may initially seem confusing or disruptive to the system, but it is an inevitability in a system with finite resources and limited space. No amount of clinging to them will prevent the end of the line if the environmental dice land accordingly.

Peacocks, Oxpeckers, and the El Farol Bar

“The sight of a feather in a peacock’s tail, whenever I gaze at it, makes me sick.”

– Charles Darwin, in a letter to colleague Asa Gray

There is no experience more humbling than to recognize the size and depth of a given system, the intricate details that are unknowable, unquantifiable, and incalculable. When reduced to the micro level outcomes that occur, the role of each piece and the complexity of their interactions quickly become evident. Take, for instance, the selection of former Penn State quarterback Sean Clifford in the fifth round of this past draft. What caused Clifford to land in this spot and pack his bags for Green Bay?

He was a four year starter. He compiled 32 wins to just 14 losses as a Nittany Lion. He passed for over 10,000 yards in total over his career but only crossed the 3,000 yard mark once in a single season. He received an invite to the NFLPA Bowl but not the NFL Combine. He had a good pro day performance as far as athletic testing goes, running a 4.62 40-yard dash and posting an impressive 6.84 3-cone drill. He measured at 6020 and 218 pounds. He had a pre-draft visit with the Packers and it apparently went very well.

Which pieces of information here are important? Is it his resume as a four year starter and winner? Is it playing in the Big Ten, perhaps the second best conference in college football? What went on internally in the Packers pre-draft process to isolate Clifford as worthy of a top 150 selection? Was it a few strong voices in the room or a consensus? Did the area scout have a degree of conviction that influenced the rest of the room? Did the pre-draft visit convince the team he was the right choice? Where did all the other quarterbacks sit on their board and what would have happened if a few slid down the board? And if we extend the scope outwards, what had to happen for all 31 other teams for Clifford to not be selected in the prior 148 picks? The teams picking ahead of the Packers and their need (or lack thereof) for a quarterback influenced the path Clifford took to land where he did. Their evaluations of Clifford as a prospect certainly influenced it. Perhaps if the Packers had taken off Clifford’s card and thrown it to the other side of the room he would have slid, and slid, and slid.

Understanding a system like this is not an easy task and predicting it in granular detail is basically impossible. But a false perception of understanding is worse than recognizing and stomaching the difficulties of the system.

The Peacock

Darwin faced a similar difficulty of understanding when he looked upon the peacock, a beautiful bird by the standards of any onlooker, but a grotesque and confusing sight through the lens of natural selection. It is a bird that sacrifices adaptations of speed, strength, or camouflage in favor of a big burlesque-like veil of colors. It is a showcase bird. But there must be some underlying reason why a bird would bypass the critical traits to improve its survival in favor of a fashion statement. Even if Darwin, the baffled observer, couldn’t figure it out initially. Here is professor Michael J. Ryan on the peacock.

Darwin could have just as easily taken the approach of many observers through the history of science. He could have papered over the gaps by exclaiming that the peacock was on a path to extinction, that it was an obscure and useless animal that was one of the few to violate the laws of natural selection, or any manner of wishful thinking to explain away the phenomena of a slow and easy to kill bird existing at all. It would have been a mistake.

Failing to understand the system doesn’t mean the system is not functioning properly. We are limited observers doing the equivalent of looking at the galaxy of information through a microscope. But just because we don’t have a complete understanding of the system doesn’t take away from the system’s value in determining outcomes.

Failure to comprehend the importance of the peacock’s tail, a product of sexual selection, doesn’t make the tail any less impactful. The peacocks with the best tails are doing the equivalent of handicapping themselves to demonstrate fitness, producing boisterous displays that require a large amount of energy and resources to maintain. It’s the equivalent of pulling up in a six figure sports car. There’s not much more utility in the sports car than a regular car beyond signaling access to resources, but it’s a convincing signal particularly in nature.

The Oxpecker

Before I dive into complex adaptive systems, I want to emphasize that intuition and experience do still have a valuable role in deciphering patterns and relationships, even if they deal with some severe limitations. Take the oxpecker as an example, a bird that regularly takes rides on animals like zebras and rhinos.

An oxpecker and its host

The rhino is often covered in flies, ticks, and other parasites which provide a plentiful food resource for lingering oxpeckers. While there is certainly some element of mutualistic symbiosis between the two, the oxpecker functions much like a parasite that takes advantage of the larger animal. It waits to eat ticks until they are full of blood, essentially maximizing servings without doing much good for the rhino. It is also known to keep open wounds and feed on the blood of rhinos, another parasitic behavior that makes the oxpecker look more like a foe than a friend.

Researchers wondered about what the oxpecker brought to the rhino given this behavior. If the birds are doing little more than picking at wounds and eating engorged ticks, what is their usefulness and why wouldn’t a rhino simply shake them off? They studied the relationship between these animals closely and realized a connection. When there is an oxpecker, it's hard to sneak up on a rhino. But take away the oxpecker and the rhino becomes quite a bit easier to get close to.

What they discovered is that the oxpecker acts as somewhat of an alarm system for the rhino. The rhino has poor eyesight, but the oxpecker’s excellent vision and hissing reaction to potential threats help the rhino to know when something dangerous or worth avoiding is coming around the corner. Rhinos without oxpeckers would only spot an approaching human 23% of the time, but a rhino with an oxpecker would spot the human every time. Needless to say, the researchers discovered something deeply important about the interactions between the animals.

What does that have to do with intuition and learned experience? Researchers believed their work to provide new insights to the relationship of the animals, but it wasn’t particularly surprising to the locals who have been around the oxpecker for centuries.

They call the oxpecker “Askari wa kifaru”.

That’s Swahili for “the rhino’s guard”.

Dealing with complex systems can be incredibly humbling for received wisdom and expertise. Let the rhino’s guard serve as an example of where it can still shine.

Complex Adaptive Systems

“Complex system: a system in which large networks of components with no central control and simple rules of operation give rise to complex collective behavior, sophisticated information processing, and adaptation via learning or evolution.”

– Melanie Mitchell, Complexity: A Guided Tour

It took me about six months to start to come to grips with complexity science and its definitions. Perhaps I’m a slow learner, but even those within the field have a hard time wrapping their arms around a pure definition of what complexity is.

In her book Complexity: A Guided Tour, Melanie Mitchell discusses a panel at the Santa Fe Institute, which has come to be the heart of complexity science. A young scientist asked the panel of scientists a simple question - “How do you define complexity?” - and every panel member gave a slightly different answer. It’s hard to bridge the gaps because complexity concepts can be imported to almost every living system, all unique in the way they interact with each other.

Let’s observe the opening quote to this section from Mitchell and try to map it onto a system like the NFL that we are familiar with.

The scope we decide to zoom into matters as new behavior emerges from smaller independent parts (for instance, the emergence of a team grade from a room full of individual scout grades), but let’s go step by step.

Large networks of components. Think of this like a collection of teams, a room full of scouts, or a position group of prospects. Ignoring the scope problem, all of these components have varying levels of impact on each other and interact in a way that generates new phenomena - like a scout formulating a grade on a position group. Another common term instead of components is “agents”.

No central control and simple rules of operation. This network of components isn’t guided by a single driving force, a king shouting down commands from his throne. Each one follows its own independent rules to guide it. You may characterize one scout’s rules of operation in formulating a grade as different from another scout’s, for instance. Each creates different results when applied to other components or agents in the system.

Giving rise to complex collective behavior. Think of complex collective behavior as the formulation of consensus grades (a hypothetical “league grade” or a specific team grade emerging). A great example of this in nature would be an ant colony emerging from the individual operations of ants performing specific tasks.

Sophisticated information processing. Think along the lines of any conclusions or assessments generated from those “rules of operation”, perhaps the way two scouts would deliberate over a player and generate questions, ideas, and answers when interacting in a way that wouldn’t be possible if each worked individually.

Adaptation via learning or evolution. The system is not static. Suppose a team drafts a player of a certain type and he fails spectacularly. They will remember that when they are up again and a player of the same type enters their field of view. The components (or agents) within the system will take in new information and adapt accordingly to have a better chance of any pursued outcome in further iterations.

Here is the difference between a complex system and a complicated system, at least in the eyes of complexity science.

If I set your phone on the ground, run full speed, and kick it as hard as I can, it’s going to break. All of the little finely honed internal pieces that allow this wonderful device to function are not adaptive. They are set in place with no self-organizing properties - they have to be carefully aligned and connected. And my kick is going to make it stop working.

Alternatively, I run full speed and kick an ant colony as hard as I can. Sand, dirt, and ants go flying in the equivalent of an atomic bomb hitting the system. Some ants die. Many of their internal structures, assuming they aren’t deep in the ground, will be destroyed. The ants will have to rebound from this catastrophic event. But each little ant does its part in a reassembly of the colony towards a new form post-kick. Your phone, at least in its current form, will never reassemble. It won’t adapt.

I mentioned Sean Clifford going at 149th overall earlier and we can look at that entire situation in the light of these terms. The collection and processing of information on Clifford, the adaptation from past decisions by all teams when they decided to pass on him, the collective behavior of the Packers generating a grade that warranted that pick, the interaction of all teams to take 148 players prior to him, etc. When we think about systems like this it makes it easier to categorize and separate things rather than imposing our own beliefs and opinions like tossing a blanket over the entire thing.

The El Farol Bar

“This is a system in which learning takes place. Agents ‘learn’ which of their hypotheses work, and from time to time they may discard poorly performing hypotheses and generate new ‘ideas’ to put in their place. Agents linger with their currently most believable hypothesis or belief model but drop it when it no longer functions well, in favor of a better one. This causes a built-in hysteresis. A belief model is clung to not because it is ‘correct’–there is no way to know this–but because it has worked in the past and must cumulate a record of failure before it is worth discarding. In general, there may be a constant slow turnover of hypotheses acted upon.”

– W. Brian Arthur, Inductive Reasoning and Bounded Rationality

It’s Thursday night in Santa Fe and you’re debating on going to the El Farol Bar to enjoy a beer and some live Irish music. Here’s the catch. If the bar is too crowded you won’t enjoy yourself there. Your beer gets spilled, the conversations drown out the music, and you will wish you had just stayed home. How do you determine whether or not to go?

This is the question that economist W. Brian Arthur, a regular at the El Farol, wanted to answer. He composed a toy model that goes like this. Let’s suppose we have 100 people all deciding whether or not to go to the El Farol. If more than 60 percent of the possible 100 go to the bar, nobody will have a good time and everyone will wish they had stayed home. If 60 percent or fewer go, then everyone at the bar will have a good time.

These people cannot communicate with each other to decide who will go and who won’t. The only real information available to use for deciding whether or not to go is the past attendance.

Arthur armed the agents in his toy model with a variety of strategies. Some will go if last week’s attendance is below 60, some will use a recent average, some will use results from further into the past, etc. If a particular strategy fails tremendously, the agent will implement a new strategy that has been more favorable. For instance, if “predict the same as last week” fails four times in a row, the strategy may be discarded.

The thing about this problem is that there is no optimal solution or rule that consistently succeeds. If there were, everyone would adopt it, then everyone would either go or not go together (which means the bar ends up 100% packed or completely empty). In the words of Arthur, any “optimal solution” is self-negating. Any given decision rule is dependent upon the interactions with the other rules to be a success or failure.

This problem is fascinating to look at from the context of NFL teams developing unique decision rules based on their past experiences and their observations of others. If all teams were to adopt the same rules for how to select prospects then it would simply come down to who gets the first pick. They would all run the same scheme, pursue the same types of players, and ultimately any outcome on the field would be purely luck. A completely homogeneous system of competitors loses any differentiating value. Put differently, if you and I play poker the exact same way, the only determinant for who wins is who gets lucky with the cards. We would raise, call, and fold in the same situations.

Arthur ran simulations of these agents testing out their various strategies over thousands of “Thursday nights at El Farol”. From week to week there is a degree of variance in the number of attendees, but the long term average ends up at approximately 60% attendance. The reason for this is that any particular move away from 60% as an average would change the success or failure of other decision rules. To quote Arthur from his paper.

“To get some understanding of how this happens, suppose that 70 percent of their predictors forecasted above 60 for a longish time. Then on average only 30 people would show up; but this would validate predictors that forecasted close to 30 and invalidate the above-60 predictors, restoring “ecological” balance among predictions, so to speak. Eventually the 40-60-percent combination would assert itself.”
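The weekly loop described above can be sketched in a few lines of Python. This is only a rough illustration, not Arthur's actual code: the rule pool, the number of rules each agent holds, the decayed error score, and the random warm-up weeks are all assumptions made for the example. Each agent acts on whichever of its rules has been most accurate lately and goes only if that rule predicts a comfortable night.

```python
import random

N_AGENTS = 100      # people deciding whether to go
COMFORT_LIMIT = 60  # the bar is enjoyable if attendance is at or below this


def build_predictor_pool():
    """Hypothetical forecasting rules: each maps past attendance to a guess."""
    pool = [
        lambda h: h[-1],              # same as last week
        lambda h: 100 - h[-1],        # mirror image of last week around 50
        lambda h: h[-2],              # same as two weeks ago
        lambda h: h[-5],              # same as five weeks ago
        lambda h: 2 * h[-1] - h[-2],  # extend last week's trend
    ]
    for window in (2, 4, 8):
        # average of the last `window` weeks
        pool.append(lambda h, w=window: sum(h[-w:]) / w)
    return pool


def simulate(weeks=500, rules_per_agent=4, seed=0):
    """Return the list of weekly attendance figures for one run."""
    rng = random.Random(seed)
    pool = build_predictor_pool()
    # Each agent holds a random handful of rules and a running error for each.
    agents = [
        {"rules": rng.sample(pool, rules_per_agent),
         "errors": [0.0] * rules_per_agent}
        for _ in range(N_AGENTS)
    ]
    # Made-up warm-up weeks so every rule has history to look at.
    history = [rng.randint(0, N_AGENTS) for _ in range(8)]
    attendance = []
    for _ in range(weeks):
        all_guesses = []
        going = 0
        for agent in agents:
            guesses = [rule(history) for rule in agent["rules"]]
            # Act on the rule with the lowest running error so far.
            best = min(range(len(guesses)), key=lambda i: agent["errors"][i])
            if guesses[best] <= COMFORT_LIMIT:
                going += 1
            all_guesses.append(guesses)
        # Score every rule against what actually happened; old errors fade.
        for agent, guesses in zip(agents, all_guesses):
            agent["errors"] = [
                0.9 * err + abs(guess - going)
                for err, guess in zip(agent["errors"], guesses)
            ]
        history.append(going)
        attendance.append(going)
    return attendance


if __name__ == "__main__":
    weekly = simulate()
    print("mean attendance:", sum(weekly) / len(weekly))
```

How closely the long-run average hugs 60 in a sketch like this depends on how rich the rule pool is. The point is only to show the mechanism: no rule is judged against a fixed truth, only against attendance that the other rules helped produce.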

The adaptive ability of this system to remember, to change decision rules, and to recalculate for all the other agents within the system manages to generate a good collective result. And this happens despite every possible decision rule within the system being dependent on all others. There may be some way to beat the system by attempting to figure out the most common rules of others in the system and create a new rule, but if successful the new rule would be negated very quickly as others adapt to match it.

As Arthur tinkered with the problem he began to view it from a new angle, one of ecology. Each decision rule or strategy played a role in this ecosystem of competitors and one saw varying degrees of success or failure as they played off each other.

“The situation wasn’t so much logical, it was ecological. It’s not trying to forecast the truth. You don’t know the truth. The actual forecast that it's trying to get to, beforehand, is determined by what other people are forecasting. So I began to think that each little forecasting rule was like a species trying to do well in an ecology of other little forecasting species.”

This isn’t much different from the acquisition of talent and building of a team. It’s easy to build a mental model of what an ideal team looks like and try to acquire players of the highest talent to fit within it, but any success will be determined by how your team matches up with all the other teams trying to do the same thing. To find any advantage you would need to understand what they are bringing to the table and how to beat it, which would then lead to adaptation by your opponent if your particular strategy or approach succeeds over and over.

The environment that determines the success of a player or a team-building approach is constructed of all the other players and all the other team-building approaches. Success is not materializing a hot knife to slice through all competition like butter. It is more like the construction of a skeleton key that can open any door it comes across, an adaptive tool that is optimal regardless of the environment, regardless of which doors it comes across. It may need to be reshaped constantly, as the adaptive ability of all others in the system will lead to a constant battle for upgrading and updating.

Experience and pattern seeking allow this adaptive ability to function. The tools that allow the recognition of the oxpecker as the rhino’s guard are the same tools that allow for recognizing success of others, and they are what create a copycat league that self-negates in a constantly evolving ecosystem.

One more aspect to keep in mind about complex adaptive systems is that moving from one scale to the next, say from a single team to the league as a whole, is an incredibly difficult challenge. In his book Everything is Obvious, Duncan Watts talks about this under the name “the micro-macro problem”, which is pervasive in any field that has this combination of interacting and adaptive agents or components.

It is impossible to reflect the potential outcomes of 32 teams engaging in their own little bubble of team building decision rules and isolated ecosystems to predict what the league will do as a collective. Sean Clifford will go down as the 149th pick in the 2023 NFL Draft, but we have no clue how much this reflects anything other than him being the highest desired player on the Packers board at the time of their decision to draft him. We don’t know a true “league sentiment”, we don’t know what happens if we replay the tape with minor alterations in the first round, and we don’t know all the little interactions among minuscule parts of the system that led to that specific outcome.

How can we predict something like this or view any league-wide phenomenon as something driven by a particular force or agent, like a singular push from an invisible hand? Watts discusses this in the context of a riot model where one minor distinction in the crowd (one that would be imperceptible to any statistical model or outside observer) creates a vastly different outcome. He sums it up this way when comparing a crowd that stays passive with a crowd that erupts into complete chaos as a result of this minor distinction.

“This is the micro-macro problem arriving in full force. And the minute you try to skip over it, say by substituting a representative agent for the behavior of the collective, you will have missed the whole essence of what is happening, no matter what you assume about the agent.”
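A toy version of a riot model makes the point concrete. The sketch below assumes the model in question is a threshold model of the kind usually credited to sociologist Mark Granovetter, where each person joins once enough others already have. The crowd size of 100 and the specific threshold values are illustrative choices, not figures from Watts.

```python
def riot_size(thresholds):
    """Return how many people end up rioting.

    Each entry is one person's threshold: they join once the number of
    people already rioting is at least that value.
    """
    rioting = 0
    while True:
        joined = sum(1 for t in thresholds if t <= rioting)
        if joined == rioting:  # nobody new was tipped over, so the crowd is stable
            return rioting
        rioting = joined


# Crowd A: thresholds 0, 1, 2, ..., 99. Each person tips the next one over.
crowd_a = list(range(100))

# Crowd B: identical except the person with threshold 1 now has threshold 2.
crowd_b = [0, 2] + list(range(2, 100))

print(riot_size(crowd_a))  # 100: the whole crowd riots
print(riot_size(crowd_b))  # 1: a lone instigator and a passive crowd
```

The two crowds differ by a single point on a single person's threshold, so their averages are nearly identical, yet one ends in a full riot and the other with one person acting alone. Any "representative agent" built from the average would describe both crowds the same way.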

No matter what you assume about the league as a whole, the collective, you are missing the essence of any individual team, any individual decision-maker or information gatherer, or any individual player. The death of the fullback or the emergence of the nickel corner is not guided by some singular agent. It is the collection of all the small interactions creating a widespread phenomenon that we paint over with a broad brush that labels trends.

This is an unpredictable, contingent system that is incredibly difficult to understand, predict, or find success in over a long period of time. The best we can do is model it as effectively as possible.

It will be wrong, but it might be useful.

Application and Closing Thoughts

To break from the tone a bit, ending this article feels quite odd to me. It is the culmination of over half a year of research and approximately two and a half months of writing to completely formulate it. While not all-inclusive, I think the application of all of these ideas towards a better understanding of the complexity of scouting, team building, and the league as a whole provides a unique lens - one that might either be incredibly smart or incredibly stupid.

If I had to leave some key things that I took away from researching and writing this, it would be these. I don’t know if I’ve ever had a unified philosophy of team building or scouting, but if I did these would be some things I’d be tinkering with.

Roster construction and scouting in general should be centered around what is optimal within the environment. It may seem like a common sense thing to some degree, but your own personal feelings about what is “right” matter very little when it comes to making the optimal decision in such a complex system.

For a scout, it would be the well-known approach of trying to scout and evaluate players according to the best fit within the team’s scheme and culture. That’s pretty common knowledge.

For teams as a whole (particularly general managers and head coaches), media analysts, or consultants it is something a little bit less talked about.

It’s a given that creating the team that has the best odds of beating everyone else is the ultimate goal, but the methods of attaining that aren’t as evident. Creating peak fitness depends on understanding your opponent thoroughly and working around what they bring to the table. Creating a running attack and imposing your will only matters if it gives you the best chance of winning within the environment of the rules, all the other teams, and what your team personnel is. The same goes for a passing attack that may get overly aggressive or sacrificial of other dimensions of offensive capability. The most important thing is fitness, not the specific means used to achieve it. You can’t just force your will on a complex adaptive system. There are advantages to understanding how and why others do what they do and you will hurt yourself if you are unaware or uncaring towards them.

For media analysts, the question of what any given opinion does or represents is interesting. It’s easy to evaluate a player based on your own personal experiences and preferences then say “I like him! I’d take him at this pick!” but it is fundamentally different from trying to predict what happens when the player is inserted into the league. Opinions can be valuable and if you get paid for yours that is great. But deliberately separating an individual preference from a broader predictive goal is something that should be done more often - we are all victims of this microscope of personal experience and opinion though.

For consultants, and I think specifically of any work I’ve done with agents in this regard, understanding the system is that much more important when you have minimal impact on it. Assuming the goal of an agent (or someone working on their behalf) is to be a pure profit maximizer, then it does not matter what their individual opinions or perspectives on players are. The only thing that matters is whether or not they accurately understand the system (either through theorizing, researching, or having connected sources that can help to navigate it). It’s understandable to miss when dealing with uncertainty and time horizons like those agents deal with, but being a profit maximizer and outcompeting others requires this humility to shoot for what the league wants, not what one wants the league to want.

While markets are generally efficient, I think the ideas particularly from artificial selection and the marble racing example of the Matthew effect do a good job illustrating where some undiscovered value may be. It would be difficult to really uncover that given the aspect of selection only being able to act upon variation within the population (and especially with the limitations of those not making selections within the league), but there is probably value to be found when decision-makers are slow to adopt new ideas or kill off old ones that aren’t efficient.

There is a saying I saw while doing this research that went something along the lines of “new ideas don’t beat out old ones, the proponents of the old ones just eventually die out”. I suppose the league’s evolution year over year goes along similar lines. A coach or scout with a certain ingrained set of beliefs and ideas may need to adapt to stay in the game, but those who cannot make those adaptations will eventually be kicked out of the system after enough failure.

I think the idea of fitness in the evolutionary sense could be a particularly useful idea to apply whenever thinking about players, regardless of whether it’s fit for a team, an all-star game roster, the league as a whole, or even being represented by an agent (depending on risk tolerance, resources, etc).

I don’t know if a true fitness score could ever be determined and applied in the sense that a 0.9 means a 90% chance of success (however you would define that). It would be an interesting concept to tinker with. It’s much more impersonal and identity-extinguishing than something like a hard grade with minimal room for flexibility. Every player exists in this range of fitness. Andrew Luck may be a 0.9 (keeping room for catastrophic injury open), but a player who is small, slow, short on instincts, not tough, etc. may be a 0.001. Perhaps trying to arrange them according to something like this could be a useful exercise.

Generally being adaptive as a mechanism for improving fitness and survivability, or creating resilience, is another thing that stood out tremendously to me in this. The plasticity of a type to mold again and again to new niches is the ultimate answer to continuing to exist within a system. You may be the most optimal or the most successful at a given time, but if you become dependent on that and all the surrounding conditions change you will die. Perhaps this is why so many attempts to create the Patriot Way elsewhere have failed (plus the lack of a hall of fame quarterback). You can’t simply copy and paste what was successful in one place or time to somewhere else and expect it to work again.

The other side of that coin is that you can’t always expect what failed in one place or time to continue to fail indefinitely. A project quarterback with rare physical tools may be drafted early and fail three years in a row, but it doesn’t mean that type is forever doomed. The same goes for any type going in and out of style within the system. Each has independent probabilities of success or failure based on a wide variety of influences and to reduce them in a trend-like manner falls into Watts’s micro-macro problem trap of generalizing a bunch of independent agents as one "collective" representative agent.

The last thing I’d mention, and perhaps the most striking to me, is the role of chance in any given outcome. It’s easy to banter about the “what if” scenario of this or that happening and changing the course of league history, but the totality of cumulative advantage (as reflected by certain types prospering and becoming more prominent), contingency (like the substitution rules emerging), and the experimentation of what could happen on any given replay of the tape is something I had never deeply considered. I never wondered why offense and defense split the way they did, I just took it as a given. We all come to develop certain beliefs about what is right or wrong, what should or shouldn’t fit in a system, but our comprehension of the narrow trail it took down an infinitely broad tree of possibilities is truly abysmal.

If nothing else, this contingency based on probabilistic outcomes that were never guaranteed is humbling. It reminds us to keep an open mind about not just the “what” but also the “why” behind anything that happens across the league. We aren’t trying to grab the universe and squeeze it until it looks how we think it should, at least not in any sense that I can imagine someone involved in football would. We are trying to observe carefully, understand it, and perhaps if we do a good enough job we can outperform others within it.

I’ll leave you with an excerpt from Superforecasting in which Annie Duke discusses humility.

“You have to have tremendous humility in the face of the game because the game is extremely complex, you won’t solve it, it’s not like tic-tac-toe or checkers,” she says. “It’s very hard to master and if you’re not learning all the time, you will fail. That being said, humility in the face of the game is extremely different than humility in the face of your opponents.”

Resources

Articles

The People Who Saw Evolution (Grants/Finches) - Princeton

De Groot/Chess - Wired

Armed and Advantageous - Emlen Montanan Article

Evolution of the NFL Player (operations.nfl.com)

Evolution of the NFL Rules (operations.nfl.com)

London Underground Mosquito Evolution - Smithsonian Mag

Rat Genetic Differences in NYC - The Atlantic

The Beer Hall Putsch - History.com

Substitution Rules - Football Archaeology

Vampire Finch - NYT