Mark Jarvis
- May 8, 2023

Jenga Blocks of Belief: Diverging and Updating

Updated: May 9, 2023

General ideas and concepts in this article

Good forecasting ability requires a willingness to change beliefs with minimal friction from identity or previously stated beliefs.

It’s naturally hard, both internally and in the social systems we live within, to change our beliefs without facing negative ramifications.

The more someone invests “being an expert” into their beliefs and identity, the harder it is for them to adjust.

Confidence is often not a useful indicator of judgment quality and should be considered very lightly.

Being a contrarian can be necessary at times, but it is important to bring it with measured skepticism rather than blind faith in oneself.

Defining domain and clearly specifying the ceiling of predictability matters, as we hurt ourselves when we mask predictions in vague uncertainty hand-waving.

Stating our beliefs explicitly and seeking options with more resiliency may help us to update properly to new information and feedback.

“Beliefs are hypotheses to be tested, not treasures to be guarded.” - Philip Tetlock, Superforecasting

“Discovering you were wrong is an update, not a failure, and your worldview is a living document meant to be revised.” - Julia Galef, The Scout Mindset

First Rounder EDGE38

CBS Sports’ Emory Hunt started a firestorm when his statement of new Kansas City Chiefs pass-rusher Felix Anudike-Uzomah being the 38th best edge rusher traveled from the listening audience of the Ross Tucker Podcast to the airwaves of social media. It was a statement that infuriated many and emboldened many. On one side, detractors argued that the opinion was worth ignoring as the NFL clearly did not see things the same way. On the other side, supporters argued that Hunt’s willingness to beat back groupthink and stand on his beliefs was admirable and useful.

This article is not intended to be a dig at Hunt, who does an incredible amount of work digging into the weeds on small school prospects. However, his work is an example of where we have to ask key questions about the goals someone has, their willingness to update their beliefs, their sensitivity to outcomes, and their identity when it comes to how they evaluate talent. Others are not exempt from these questions. But usually the social pressures involved in giving opinions publicly with that type of platform provide an incentive to push down dissent.

But let’s rewind. How do we get to Felix Anudike-Uzomah as the 38th best edge rusher?

The first step in figuring out how Hunt reaches his conclusions is understanding how he views players. What parts of the profile does he prioritize or ignore? Which traits is he looking for? How hard or light is he on players when evaluating them, and does his grading reflect certain indicators about how he views the game?

Sensitivity to Size

One of the most notable aspects of Hunt’s style of grading, and one he is adamant about on social media as one of his “Emory-isms”, is that size is not a skill. What players have to do on the field requires skill, and being tall or heavy is not the same as having those skills.

This is in direct contrast to many perceptions from those within the NFL. The idea of “planet theory” carries weight when you consider the types of players who receive NFL opportunities. There are only so many people on the planet who are a certain height, weight, and speed. For instance, being a reasonably in-shape 6’7” and 320 pounds with long arms is a rarity, and as such players who fall into that bucket tend to be valued by the NFL regardless of their current skill set.

While I’m not quite sure how Hunt values speed by comparison (he had Deuce Vaughn and Keaton Mitchell as top 5 backs, which seems like an indicator of valuing athleticism), it can be safely said that he does not worry about the size of a player. Let’s compare against the size of backs who were drafted and signed as UDFAs in 2022.

RBs over 5’11” and 210 pounds

Drafted: 8/30 (26.67%)

Signed as UDFA: 11/30 (36.67%)

Total on 90: 19/30 (63.33%)

RBs under 5’11” and 210 pounds

Drafted: 4/84 (4.76%)

Signed as UDFA: 15/84 (17.86%)

Total on 90: 19/84 (22.62%)

Put one way, any back without differentiating information who is over the 5’11” and 210 pound threshold is about 2.8 times more likely to be selected in the draft or sign as an undrafted free agent compared to those under that threshold. Even though the camp roster split was about 50/50 last year for players above and below that threshold, being bigger improves your odds compared to your peers.
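The 2.8 figure can be reproduced directly from the counts above. The snippet below is a minimal sketch of that arithmetic; the variable and function names are made up for illustration and do not come from any existing tool.

```python
# Counts of 2022 running backs from the lists above:
# (drafted or signed as UDFA, total backs in the group).
# All names here are illustrative only.
over_threshold = (19, 30)    # backs over 5'11" and 210 pounds
under_threshold = (19, 84)   # backs under that threshold


def rate(made: int, total: int) -> float:
    """Share of a group that was drafted or signed as a UDFA."""
    return made / total


def relative_likelihood(group_a: tuple, group_b: tuple) -> float:
    """How many times more likely group A is to land on a 90-man roster than group B."""
    return rate(*group_a) / rate(*group_b)


ratio = relative_likelihood(over_threshold, under_threshold)
print(f"Over threshold:  {rate(*over_threshold):.2%}")
print(f"Under threshold: {rate(*under_threshold):.2%}")
print(f"Relative likelihood: {ratio:.1f}x")
```

Because both groups put the same number of backs (19) on rosters, the ratio reduces to the ratio of the group sizes, 84/30. That is why a 50/50 roster split and a 2.8x edge for bigger backs can both be true at once.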

We could judge how well Hunt’s stated beliefs about size are reflected in his rankings, but I’m going to pass on that because of time limitations. Let’s assume they hold based on what he has said outwardly and the way he talks about players.

We can, however, assume that the NFL prioritizes size in a way that he does not and that it plays a major role in the disparity between his rankings and the results of how players are valued by the league.

Sensitivity to Level of Competition

Another notable aspect of Hunt’s grading is his willingness to intermingle FCS and sub-FCS players in a way that few scouts or league decision-makers would be willing to do. A quick look through Hunt’s Twitter timeline would reveal reports that included players like Northwood’s John Hall and Tennessee-Martin’s Colton Dowell ranked among his top receivers.

Hunt’s YouTube channel, Football Gameplan, has plenty of older videos ranking positions, and it doesn’t take long to find some outlier opinions where he ranked sub-FBS players much higher than anyone else in media or presumably within the league was willing to do.

This is another area where Hunt and the league (and even most other media outlets) are diverging significantly. While the best FCS and sub-FCS players tend to make their way into the draftable ranks, they are a very rare commodity and tend to be ranked higher in Hunt’s work than is reflected in the outcomes of who the teams select in the draft or sign as undrafted free agents.

For instance, here is a comparison of the Power Five players, the Group of Five players, the FCS players, and the sub-FCS players in last year’s class, showing how many were drafted or signed as undrafted free agents.

Power Five

Drafted: 182/669 (27.20%)

Signed as UDFA: 235/669 (35.13%)

Total on 90: 417/669 (62.33%)

Group of Five

Drafted: 51/541 (9.43%)

Signed as UDFA: 113/541 (20.89%)

Total on 90: 164/541 (30.31%)

FCS

Drafted: 19/513 (3.70%)

Signed as UDFA: 46/513 (8.97%)

Total on 90: 65/513 (12.67%)

Sub-FCS (Division 2 or below)

Drafted: 5/359 (1.39%)

Signed as UDFA: 24/359 (6.69%)

Total on 90: 29/359 (8.08%)

This is an even steeper situation than size. The level of competition a player faced has one of the largest (if not the largest) impacts on their outcomes entering the NFL, with only speed really in the same ballpark. Using this data from the 2022 class, any FBS player was about 4.5 times more likely to be drafted or sign as a UDFA than any sub-FBS player.
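The FBS vs. sub-FBS comparison requires combining the four levels into two groups. A minimal sketch of that step follows. It assumes the groups are pooled by summing counts rather than averaging the listed percentages, and the names used are illustrative.

```python
# (drafted or signed as UDFA, total players) by level of competition, 2022 class,
# copied from the lists above. Grouping into FBS / sub-FBS is the only step added here.
levels = {
    "Power Five": (417, 669),
    "Group of Five": (164, 541),
    "FCS": (65, 513),
    "Sub-FCS": (29, 359),
}
FBS_LEVELS = ("Power Five", "Group of Five")


def pooled_rate(names):
    """Combine several levels by summing counts, not by averaging percentages."""
    made = sum(levels[n][0] for n in names)
    total = sum(levels[n][1] for n in names)
    return made / total


fbs_rate = pooled_rate(FBS_LEVELS)
sub_fbs_rate = pooled_rate([n for n in levels if n not in FBS_LEVELS])
fbs_vs_sub_fbs = fbs_rate / sub_fbs_rate

print(f"FBS:     {fbs_rate:.2%}")
print(f"Sub-FBS: {sub_fbs_rate:.2%}")
print(f"Ratio:   {fbs_vs_sub_fbs:.2f}x")
```

Summing counts matters because the groups are different sizes. A simple average of the four percentages would weight each level equally and give a different answer.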

That is not to say that there aren’t outliers. There certainly are. But this is another example of an area where Hunt’s approach to evaluating and ranking talent differs significantly from the league.

Sensitivity to NFL Decision-Making

Another aspect of Hunt’s decision-making process appears to be that he does not care where a player gets drafted and does not view it as an indicator of their talent. He is not focused on matching league perceptions or projecting where players will be selected, but instead focuses on how he views the player.

This is an admirable goal to strive for, but there are some notable questions that come with this line of thinking. Does where a player goes truly have no bearing on their talent? What is valuable about being wrong by yourself rather than wrong in a group? Why does the opinion of those in the league differ so much? Are independent evaluators on the media side going to be right more than teams spending millions each year to pay staff who sift through prospects and evaluate them for an NFL projection?

We also have to recognize that to some degree, talent and what the league thinks and does are inextricably linked. If there is talent, the league has every incentive to capitalize on it, barring medical or character problems that would prevent a player from being able to play professional football. If the league is not fielding that talent, will those players go to other leagues and blossom to the highest degree? If so, we’d expect to see players who were ranked highly by Hunt going to the USFL, XFL, or CFL and performing at a level that would set records.

This leaves us in a precarious position where identifying talent in a way of “absolute truth” is nearly impossible.

On one hand, we could assume that Hunt is wrong and that the leagues have correctly identified the talent of the players as being very low. The leagues are acting efficiently and accurately in their decision-making process, regardless of what Hunt believes.

On the other hand, we could assume that Hunt is right and that the leagues have incorrectly identified the talent of the players. But we can’t know whether or not it’s right, because these players do not get to see the field. If I say a quarterback is the best in the class only for him to never receive an opportunity to suit up and lead a team on Sunday, I can simply mark it down as a failure of the league to use his talents rather than a case of failure where the player made it to the stage and bombed.

NFL Oversensitivity?

Another question we could pose in Hunt’s defense is, “Is the NFL overly sensitive to information like size and level of competition?”

While I don’t have the time or resources necessary to pursue this end and attempt to answer this question, I believe it would go something like this. Say we want to figure out if the NFL is overly sensitive to the level of competition that players face - what is the best way to do this? Perhaps we could try to figure out how the average FCS NFL player performs compared to the average FBS NFL player to see whether or not the gap is gone once we reach the NFL. We could try this same approach with size or speed or any other parameter.

The problem that comes to my mind with this approach would be that there is a form of selection bias in the types of players that make it to the NFL from both of those levels of competition. Generally speaking, the FCS players in the NFL have been handpicked because they are some of the best players available from that level of competition.

We could say, “Ah yes, but only the best FCS players get a chance whereas the good but not great ones don’t,” which leads us right back into our problem of the entanglement between the NFL’s decision-making and the talent that makes it through the filter. We can never rerun history to ask, “If we replaced half the league’s FBS players with FCS players who did not initially make it, what would happen?” and that makes trying to judge any form of NFL oversensitivity to level of competition, size, speed, etc., a difficult task.

However, if we were to compare the measurable speed and overall athletic testing data of players based on which level of competition they played at, we would find a noticeable shift towards slower, less athletic individuals as we dropped from the highest levels to the lowest. If we were to reduce the quality of a player to their speed, for instance, we would be much harder on most sub-FBS players than on FBS players.

For instance, here is the number of sub-FBS backs in 2022 who ran faster than a 4.50 40-yard dash compared to the number of FBS backs (backs who did not run excluded).

FBS: 17/89 (19.10%)

Sub-FBS: 7/57 (12.28%)

If we were to repeat this test with other athletic testing data, we would continue to find similar results where players at the FBS level consistently outperform sub-FBS players on measures of athletic ability. This is an imperfect test to capture potential NFL oversensitivity, but it illustrates one of the reasons we have to doubt that the NFL is overestimating FBS quality.

Another way to look at it would be comparing the performance of FBS teams against FCS teams in terms of game outcomes, which is indisputably in favor of FBS teams. A case could be made for greater resources or better coaches, but the record of FCS teams when facing FBS teams is expectedly abysmal.

According to the FCS vs. FBS historical record listed in the resources below, FCS teams have gone 486-2842 against FBS competition, which is winning about 15% of games. We see this gap in talent play out regularly each week early in the college football season as FCS teams get plastered by double digits over and over by superior competition.

Confidence

One of the crucial questions we have to ask is why Hunt feels so strongly about his beliefs on how he evaluates talent and whether or not he should feel justified in feeling that way. This is not a knock on Hunt specifically, as this question of confidence should apply to anyone who wants to make predictions about how the world will turn out and for what reasons it will do so.

There is a dilemma of confidence’s usefulness whenever we look at someone’s gauge of their own abilities as an evaluator (I am not exempt from this). The confidence of an individual tells us little about their track record and much more about the narrative that they have formed in their mind about their decision-making processes and their acumen within the domain. Hunt can be highly confident in being anti-size and anti-level of competition, but it does not indicate whether or not he is consistently performing better (or worse) than teams making decisions.

“Subjective confidence in a judgment is not a reasoned evaluation of the probability that this judgment is correct. Confidence is a feeling, which reflects the coherence of the information and the cognitive ease of processing it. It is wise to take admissions of uncertainty seriously, but declarations of high confidence mainly tell you that an individual has constructed a coherent story in his mind, not necessarily that the story is true.” - Daniel Kahneman, Thinking, Fast and Slow

Klarman’s Calculator

Michael Mauboussin’s book, The Success Equation, is a treasure trove of ideas, and it touches on many different areas of life from sports to business to investing. One of the passages focused more heavily on investing quotes hedge fund manager Seth Klarman on effective value investing. We can view this through the lens of the contrarian (Hunt) having a divergent opinion and the calculator (the league’s past decision-making) keeping us from going against the grain constantly to bridge the gap in making good choices.

“Klarman has a wonderful line: ‘Value investing is at its core the marriage of a contrarian streak and a calculator.’ He’s saying that you have to be different from others and focus on gaps between price and value. This idea extends well beyond the world of investing.

When most people come to believe the same thing, large gaps open up between price and value. That’s what happened during the dot-com euphoria of the late 1990s and during the spring of 2009, when despondency established the low point for the market. The first part of Klarman’s line properly emphasizes the importance of being willing to go against the crowd. Most people know that it is more comfortable to be part of the crowd than to be alone. But it’s also hard to distinguish yourself if you’re doing the same thing as everyone else.

Skillful investors heed Benjamin Graham’s advice: ‘Have the courage of your knowledge and experience. If you have formed a conclusion from the facts and if you know your judgment is sound, act on it—even though others may hesitate or differ.’ However, Klarman correctly observes that it is insufficient to be a contrarian because sometimes the consensus is right. The goal is to be a contrarian when it allows you to gain an edge, and the calculator helps you ensure a margin of safety.” - Michael Mauboussin, The Success Equation

There is value in holding a different opinion when it gives you an advantage, but it is crucial to not build this contrarian streak into your identity at the cost of accuracy or profit. When you diverge significantly from the group, it is important to ask yourself this - what information do I have that the group does not, and why would that lead to me making a better or more accurate prediction than the group?

It is not enough to stand confidently and say that you would rather be wrong by yourself than wrong in a group if that stance is costing you chances to be right with the group.

Small Percentages, Big Differences

One common rebuttal I’ve seen against detractors of Hunt’s work is that the NFL isn’t quite good at this thing anyways because so many players who are drafted highly ultimately flame out. While I respect the sentiment as a means of injecting some level of humility into league opinion, it is one that I take a lot of issue with. But first, here is an example of the defense of Hunt as reflected by criticism of the league.

Let’s be clear. The fact that a significant number of players do not perform up to expectations does not mean that the NFL is somehow inept at judging players, nor that its hit rate should be in the ballpark of 90% or better.

Jarvis Jones, while a poor tester at his pro day, posted 44 tackles for a loss and 28 sacks in two years against SEC competition. Devin Bush ran low 4.4s, tested incredibly well overall, and was a 2019 PFWA All-Rookie team selection before a torn ACL in 2020 severely hindered him. Jones has a much greater case for an error in the selection process, but Bush may simply be a victim of injury derailing what was originally a promising career.

You could have put Patrick Mahomes as your top quarterback in his class only for him to tear his Achilles in practice as a rookie and never play meaningful football for the Kansas City Chiefs. Were you wrong? Of course not. That outcome would have been unknowable in advance, a cruel twist of fate for a great player. The difficulty of the domain matters when trying to assess a “hit rate” and to throw out numbers like 40% or 90% without recognition of that fact is to downplay the importance of what is even achievable if we don’t have a crystal ball.

Another example of domain specificity mattering for judging the value of a % hit rate would be to look at something like venture capital. If you were to look at purely the overall profit relative to other benchmarks, savvy venture capitalists can make huge returns. But a look at any one portfolio’s “hit rate” can be quite disappointing. Venture capital firms parse through large numbers of potential opportunities just to get small bites of potentially rising companies, and even those they do bite on play out with a low rate of success.

“Our survey found that for each deal a VC firm eventually closes, the firm considers, on average, 101 opportunities. Twenty-eight of those opportunities will lead to a meeting with management; 10 will be reviewed at a partner meeting; 4.8 will proceed to due diligence; 1.7 will move on to the negotiation of a term sheet with the start-up; and only one will actually be funded.” - Harvard Business Review, How Venture Capitalists Make Decisions
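To make the funnel in that quote easier to read, the stage-to-stage survival rates can be worked out from the quoted averages. This is only a sketch of the arithmetic; the short stage labels are paraphrases chosen for the example, not terms from the survey.

```python
# Average funnel per closed deal, using the numbers quoted above from the HBR survey.
# Stage labels are shortened paraphrases for display purposes.
funnel = [
    ("opportunities considered", 101.0),
    ("management meetings", 28.0),
    ("partner meeting reviews", 10.0),
    ("due diligence", 4.8),
    ("term sheet negotiations", 1.7),
    ("deals funded", 1.0),
]

# Share of deals surviving each step, relative to the step before it.
step_rates = [
    (later_name, later_count / earlier_count)
    for (_, earlier_count), (later_name, later_count) in zip(funnel, funnel[1:])
]
overall_rate = funnel[-1][1] / funnel[0][1]

for name, r in step_rates:
    print(f"{name:<26} {r:.1%} of previous stage")
print(f"Overall: {overall_rate:.2%} of opportunities get funded")
```

No single step looks especially harsh on its own, yet compounding five of them leaves roughly one funded deal per hundred opportunities.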

Then comes the aspect of actually making a return on the investment. Given the failure rate of businesses, it is no surprise that a small number of big winners is what drives the profits of venture capital firms. A 2012 Wall Street Journal article, citing the research of Shikhar Ghosh, a lecturer at Harvard Business School at the time, put it this way:

“There are also different definitions of failure. If failure means liquidating all assets, with investors losing all their money, an estimated 30% to 40% of high potential U.S. start-ups fail, he says. If failure is defined as failing to see the projected return on investment—say, a specific revenue growth rate or date to break even on cash flow—then more than 95% of start-ups fail, based on Mr. Ghosh's research.”

If you were to chalk it up as just 5% of venture capital investments being profitable, you may laugh at their ineptitude, ask who is funding these knuckleheads, and then carry on your way thinking about how much better of a job you would be doing if you had the keys to the car. But none of these companies are crushing the competition and running away with a perfect portfolio - why not, if all it would take is a savvy investor like yourself to hit just 30% or 40%?

Because the domain is inherently that difficult. It is not a 50/50 proposition, or even a 20/80 proposition. The way they turn a profit is not by performing at an impossible crystal ball level in a world that loves to smash crystal balls, but instead by taking their investments very carefully and having enough of an outsized return to be profitable. If you judged them purely on the low % of hits, you would miss the profitability.

But distinguishing between a venture capital firm that hits on 1% of its investments and one that hits on 5% may tell us whether that firm is running happily to the bank or going out of business. The percentage difference may be small, such as would be the case between a team that “hits” on 35% of their draft picks vs. 40% of their draft picks, but it would lead to starkly different outcomes as far as the success of the team. This is what gets missed when we throw around percentages for success without qualifying them with the context of how hard it is to be right in the given circumstances.
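A toy calculation shows how a few percentage points can separate profit from loss. Everything in this sketch is a simplifying assumption rather than real venture data: every investment is the same size, a miss returns nothing, and a hit returns a hypothetical 50 times the stake.

```python
def portfolio_multiple(hit_rate: float, payoff_multiple: float) -> float:
    """Expected money returned per dollar invested.

    Toy assumption: every investment is the same size, a "hit" returns
    payoff_multiple times the stake, and a miss returns nothing.
    """
    return hit_rate * payoff_multiple


# Hypothetical payoff for a single winner; chosen only to make the contrast visible.
PAYOFF = 50.0

low = portfolio_multiple(0.01, PAYOFF)   # firm that hits on 1% of investments
high = portfolio_multiple(0.05, PAYOFF)  # firm that hits on 5% of investments

print(f"1% hit rate: {low:.2f}x of invested capital")
print(f"5% hit rate: {high:.2f}x of invested capital")
```

Under these made-up numbers, the 1% firm gets back half of what it invested while the 5% firm gets back two and a half times its capital, even though both miss on at least 95% of their bets.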

To wave our hand vaguely at “teams miss a lot” as an excuse for our own decision-making processes leading us way off consensus or producing abnormal results is to veil ourselves in ignorance.

Jenga Blocks of Belief

The majority of this article has been spent dissecting the beliefs that Hunt seems to hold about prospects and what they tell us, but here is the most important question of all that we can ask.

Are these beliefs leading us towards a more accurate map of the world and better predictive ability - or are they simply part of our identity that we cling to regardless of what feedback the world gives us?

If you were to define success as a measured calculation of whether a player becomes what the grading scale expresses he should be, Hunt’s rankings would score much lower than those of many media pundits and the vast majority of teams. The feedback is relatively clear that smaller players from lower levels of competition do not have the same type of regular success on the NFL field that bigger players from higher levels of competition have. So why would Hunt rank these players in this manner?

I think an excerpt from Tetlock’s book is applicable here, although it is worth reiterating that we are not exempt from these same types of belief perseverance implications either. You and I may not hold them as strongly as expressed in the rankings of players, but anyone with an identity holds certain things of themselves closely and it hurts when those things are pushed against by outside feedback.

“Commitment can come in many forms, but a useful way to think of it is to visualize the children’s game Jenga, which starts with building blocks stacked one on top of another to form a little tower. Players take turns removing building blocks until someone removes the block that topples the tower. Our beliefs about ourselves and the world are built on each other in a Jenga-like fashion.

My belief that Keynes said ‘When the facts change, I change my mind’ was a block sitting at the apex. It supported nothing else, so I could easily pick it up and toss it without disturbing other blocks. But when Jean-Pierre makes a forecast in his specialty, that block is lower in the structure, sitting next to a block of self-perception, near the tower’s core. So it’s a lot harder to pull that block out without upsetting other blocks—which makes Jean-Pierre reluctant to tamper with it.” - Philip Tetlock, Superforecasting

When Hunt says to a large audience on the Ross Tucker Podcast on multiple occasions that one of his “Emory-isms” is that size is not a skill, that becomes more and more ingrained in his belief system. It goes from a tool that helps to not overweight size in the evaluation process into a rule of thumb that squishes any objections about the importance of size on a field of big people who hit each other really hard over and over.

We would be less apt to shrug our shoulders and ignore it if it were expressed as two boxers competing at a 30 pound weight differential. The state athletic commission would likely not allow it for fear of severe injury to the smaller party. But as the complex opinions of “not overweighing size in the equation” get reduced to bumper sticker mottos we start to reduce their predictive value. And as they travel further and further down our Jenga tower of beliefs, we have a harder time pulling them out without toppling the whole tower, identity as an evaluator and all.

Willingness to Update

It is not a hopeless endeavor, however. Although feedback is inconsistent and often delayed (due to domain difficulty), we can still try to question our beliefs critically in hopes of updating them accordingly.

When a player who is very small fails, we have to ask why they failed rather than paint with a broad brush like “The league just doesn’t respect small guys.” or “He was doomed from the start, small guys get broken down in the NFL.”

When the small guy succeeds, we have to do the same thing, asking why it happened rather than saying “Of course, skill triumphs over size.” or “He is an outlier, and I will always continue to treat the other small guys the same assuming they won’t be.”

What we should not do is ingrain these beliefs into our identity and try to cement a perfect mental model of what makes success or failure happen at the NFL level. Conditions change, types of players go in and out of style as the rules and balance of the league shift, and the process of building a team full of football players that can win games is always in flux.

The core of our Jenga tower should always be this - our beliefs on evaluating are whatever leads us to the best judgment of football players possible. No feedback can tumble our tower when we are resilient to any change in size, speed, positional fit, or any parameter that may shift. But to do so, we have to stop guarding our beliefs about what a good football player should be as treasures and start treating them as hypotheses.

Resources

Thinking, Fast and Slow - Daniel Kahneman

The Scout Mindset - Julia Galef

Superforecasting - Philip Tetlock

The Success Equation - Michael Mauboussin

Emory Hunt’s YouTube Channel

The Venture Capital Secret - The Wall Street Journal

How Venture Capitalists Make Decisions - Harvard Business Review

FCS vs. FBS historical record