I’m a words guy, and I respect numbers. Numbers have weight. Numbers help us make decisions. Numbers can help us distinguish signal from noise. Yet, like words, numbers can be selectively presented and carefully constructed to reinforce narratives.
I write as one who has been fooled by numbers and narratives many times, going back to the ecocatastrophe predictions of the early 1970s. I was caught up in some numbers recently, convinced and confident, only to choke down my error in personal disgust. (Too embarrassing to tell now, give me 15 years.)
I encourage you to be discerning about numbers that are presented to you. Here are some tips.
An average is less informative than a distribution and a trend. How was the number determined, and by whom? What are the subsets that make up a single number? Let’s consider an example.
Every month the US government gives a report about the number of new jobs. It’s encouraging to see that people are finding employment. Yet we need to understand the overall number by breaking it down, and understanding the process.
My understanding – correct me if I’m wrong:
• The number does not distinguish between part-time and full-time-with-benefits jobs.
• The number does not call out second (and third) jobs people are taking on to earn more income.
• The number is revised (almost always downward) within 2-3 months, because of the crude way the information is collected via multiple surveys.
• There is a breakout by industry/category. In 2023 most of the new jobs created were in government, leisure, and healthcare. Manufacturing is flat-to-down. Info-tech is down.
Another example, common in a political framework: polling numbers for a president, governor, or legislator. I don’t usually see the survey questions reported with the results (how convenient), even though we know that how a question is asked shapes the responses. I recently saw that fewer than one-third of those polled are satisfied with how the current US government leadership is responding to climate change. The two-thirds who are unsatisfied are surely a mix of people: some think it’s hogwash, and some demand that much more be done.
Be mindful that how numbers are calculated changes over time. The Consumer Price Index, Gross Domestic Product, Unemployment, Inflation Rate… all these are composite numbers. The method of determining each of them has shifted over the years. For example, the CPI used to include a specific basket of grocery items. That basket has changed, and fuel is no longer included in the same way. Sometimes these representative numbers were adjusted for sincere reasons, and occasionally for political convenience. The net effect, however, is that you must be suspicious of charts that track these numbers across many years, because the early and late points may not measure the same thing.
Charts are useful and still merit caution. Watch out for graphs that don’t have a zero on the axis, or that have no numbers at all. Check the start and stop dates on trend line graphs. (Good example: Nearly all the US temperature graphs you’ll see begin in the late 1970s, which were the coldest years on record in the northern hemisphere in the 20th century, rather than the 1930s, which were the hottest years.) Think carefully about correlations because they may not represent causation.
Don’t be shocked by unequal distributions. “80/20” is not a physical law, but unequal distributions are common (70/30, 95/5, 99/1). An unequal distribution does not automatically mean something is wrong or unfair.
Low-probability events will happen. Streaks of repeats occur in random sequences. Be wary of assigning blame or significance to these.
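If you want to see how ordinary streaks are, here is a minimal Python sketch that simulates fair coin flips and checks how often 100 flips contain a run of six or more identical outcomes. The function names, the streak length of six, the trial count, and the seed are all my own illustrative choices, not anything standard.

```python
import random


def longest_run(flips):
    """Return the length of the longest streak of identical consecutive outcomes."""
    if not flips:
        return 0
    best = current = 1
    for prev, nxt in zip(flips, flips[1:]):
        # Extend the current streak on a repeat, otherwise start over at 1.
        current = current + 1 if nxt == prev else 1
        best = max(best, current)
    return best


def share_with_streak(n_flips=100, min_streak=6, trials=2000, seed=0):
    """Fraction of simulated flip sequences containing a streak of at least min_streak.

    All defaults are illustrative assumptions; change them freely.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        flips = [rng.choice("HT") for _ in range(n_flips)]
        if longest_run(flips) >= min_streak:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(f"Share of 100-flip sequences with a streak of 6+: {share_with_streak():.2f}")
```

In runs like this, well over half of the sequences should show such a streak, even though nothing but chance is at work. A streak alone is weak evidence that something changed.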
An average from a larger sample is more likely to be close to the true value than an average from a smaller sample. But larger data sets will also contain more false positives and false negatives in absolute terms, depending on how the measurements are done. There is simply more noise in a larger data set, which means it’s easier to find “what you want” in messy data.
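One way to read the "easier to find what you want" point: the more things you measure, the more chance matches you will find. The sketch below is a rough illustration under my own assumptions. It generates columns of pure random noise, compares each one to an unrelated random target, and counts how many clear a correlation cutoff. The cutoff of 0.279 is roughly the conventional 5% threshold for 50 rows; that figure, the row count, and the function names are illustrative choices.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def count_spurious(n_columns, n_rows=50, threshold=0.279, seed=1):
    """Count noise columns whose correlation with a noise target exceeds the cutoff.

    Every column is random, so every "hit" is a false positive by construction.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    target = [rng.gauss(0, 1) for _ in range(n_rows)]
    hits = 0
    for _ in range(n_columns):
        noise = [rng.gauss(0, 1) for _ in range(n_rows)]
        if abs(pearson(noise, target)) > threshold:
            hits += 1
    return hits


if __name__ == "__main__":
    for k in (10, 100, 1000):
        print(f"{k:>5} noise columns -> {count_spurious(k)} 'significant' correlations")
```

The rate of false hits stays about the same, but the count grows with the number of columns examined. Someone who looks through enough columns and reports only the hits will always have something to report.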
Innumeracy (the numbers equivalent of illiteracy) is a significant problem. I cringe when I hear an activist say “It’s outrageous that 25% of the students are in the bottom quartile!” By definition, a quarter of any group is in the bottom quartile. Median and average are different. A change in a percentage rate and a change in the actual price are different.
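Two of these distinctions are easy to check with a few lines of Python. The figures below are made up for illustration: ten hypothetical incomes with one very large value, and a hypothetical price that rises 8% one year and 3% the next.

```python
import statistics

# Hypothetical incomes in thousands; one outlier pulls the average up.
incomes = [30, 32, 35, 38, 40, 42, 45, 48, 50, 1000]

mean_income = statistics.mean(incomes)      # 136
median_income = statistics.median(incomes)  # 41.0
below_mean = sum(1 for x in incomes if x < mean_income)  # 9 of the 10


def price_path(start, yearly_rates):
    """Apply each yearly percentage change in turn and return every price level."""
    prices = [start]
    for rate in yearly_rates:
        prices.append(prices[-1] * (1 + rate))
    return prices


# The inflation rate falls from 8% to 3%, yet the price keeps rising.
prices = price_path(100.0, [0.08, 0.03])

if __name__ == "__main__":
    print(f"mean={mean_income}, median={median_income}, below the mean: {below_mean}/10")
    print("price levels:", [round(p, 2) for p in prices])
```

Nine of the ten hypothetical earners sit below the "average" income, and the price is higher at the end of the second year even though the rate of increase dropped. A headline that says "inflation is down" is a statement about the rate, not about what you pay at the register.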
Our general psychology also makes us vulnerable, even when we’re well-educated. Horoscopes and fortune-telling are a perpetual business because we’re susceptible to cleverness. We all tend to relax our guard when the source is comfortable and familiar. We’re Captain Skeptical when “those” people give a number, and Lieutenant Lax when someone on “our” side presents numbers. We assign conspiratorial intent to a decision that was simply based on tradeoffs. We never question some statistics and automatically dismiss others. We should be equally careful with all of them.
We over-weight two correct predictions made by a psychic or an economist and ignore the 98 times they were wrong. Stock market bears will eventually be right. Occasionally some fragment of a dream appears like a prediction in retrospect. Critics and doomsayers sound smart, and so do market bulls. Someone pointed out that 85% of economists expected a serious general recession in the US in 2023. (I think some industries and sub-markets did have a recession, but not uniformly; the company I worked for did very well in 2008-2009, despite the subprime mortgage crisis.) Many expect a recession in 2024. Eventually some will be right!
Numbers can be great friends and tools, or weapons. Use them well.