How to Combine Experts' Predictions? It Depends on Whether It's Numbers or Words


What do you do when you have two expert opinions, one saying the probability is 60% and the other saying it's 80%? Even more perplexing, what if one expert calls the outcome likely and the other very likely? A new study reports that we handle these two forms of information differently.

Opinions expressed in numbers seem more concrete and precise, and they provide a sense of direction – after all, 70, 80, or 90% is a lot closer to 100 than it is to zero. Nate Silver and his FiveThirtyEight website came to prominence in the 2008 election by aggregating polling data into a better predictor than any individual poll. For the most part, studies have shown that when we are given numeric information, we tend to take the average – which makes arithmetical sense, at least.

But what about the verbal description? For many of us, challenged by math, a probability expressed in words, such as likely or almost certain, seems to work better. But words are more ambiguous than numbers. The classic tale is that of newly elected President Kennedy weighing approval of the Bay of Pigs invasion. His intelligence community indicated that the chance of success was likely, yet numerically that translated into only a 30% likelihood of success. I find those verbal and numeric expressions significantly different. [1] Word expressions of probability are more ambiguous and fail to convey the same directionality as a number; how close to certain is likely?

The study made use of online participants on Amazon’s Mechanical Turk, presenting them with various scenarios along with two expert opinions. In one case, the opinions were expressed as percentages, 60 to 69%; in the other, as likely. The participants were shown that likely occupied that same 60-69% region on a scale of certainty – more positive than a coin flip. They were then asked for their own opinion of the likelihood of the event.

As had been previously reported, numeric predictions were averaged; most participants’ predictions stayed close to the experts’, at 60 to 69%. But when each expert offered a prediction of likely, the participants’ predictions became more extreme – the verbal forecasts were not averaged but added up, counted.

“60% and 60% is 60%, while likely and likely is very likely.”

It made no difference if the experts’ predictions were below the midpoint; at 40%, people still averaged. But given a comparable word expression of 40%, participants again became more extreme, lowering their belief in the prediction below that of the experts.

“People average numbers and count verbal [predictions] creating more extreme forecasts than experts whether for probabilities above and below midpoint, for hypothetical and real, and when presented simultaneously or sequentially.”

The researchers thought this might reflect some type of intuitive Bayesian analysis, in which we update our prediction based upon new information. But indicating to the participants that the information came from a new or different source had no effect: numbers were averaged, words were counted. We are not natural Bayesian analysts.
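The difference between the two habits is easy to see with a little arithmetic. A minimal sketch (my illustration, not code from the study): averaging two identical 60% forecasts returns 60%, while a naive Bayesian combination – treating each expert's forecast as independent evidence against a 50/50 prior – pushes the result toward certainty, which is what the "counting" of verbal forecasts resembles.

```python
def average(forecasts):
    """Averaging: the typical response to numeric forecasts."""
    return sum(forecasts) / len(forecasts)

def bayesian_combine(forecasts, prior=0.5):
    """Naive 'counting': multiply each expert's likelihood ratio into the
    prior odds, so agreeing forecasts reinforce one another."""
    odds = prior / (1 - prior)
    for p in forecasts:
        odds *= p / (1 - p)
    return odds / (1 + odds)

experts = [0.6, 0.6]
print(average(experts))                      # 0.6: 60% and 60% is 60%
print(round(bayesian_combine(experts), 3))   # 0.692: likely and likely is very likely

# The same rule works below the midpoint, matching the 40% result above:
print(round(bayesian_combine([0.4, 0.4]), 3))  # 0.308: more extreme than either expert
```

The Bayesian rule moves above 60% when forecasts agree above the midpoint and below 40% when they agree below it – the direction of the participants' verbal-forecast behavior, even though, as the study found, people do not apply it to numbers.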

“…people are more likely to average numeric probability forecasts and more likely to count verbal probability forecasts. As a result, participants’ own forecasts become more extreme (i.e., closer to certainty) as they see additional verbal forecasts from advisors, but closer to average of the advisor’s forecasts as they see additional numeric forecasts. [2]”

This cognitive hiccup has consequences in today’s media world. In the echo chambers on either side of the aisle, predictions of future events are endlessly repeated, and for the most part they are expressed in words. This study suggests that with repetition, and over time, we move further and further toward believing in the certainty of those predictions even though we have no greater evidence. It is not the sole cause, but it does seem to be part of why contentious scientific debate becomes more strident and extreme.


[1] Evidently, I am not alone in this conundrum. “Within the past decade, both the Intergovernmental Panel on Climate Change … and the United States Director of National Intelligence issued official guidance on how verbal probabilities must be used in their reports.”

[2] The researchers had some additional data suggesting that the effect increased with the number of advisors up to about five, at which point it plateaued.


Source: Combining Probability Forecasts: 60% and 60% is 60%, but Likely and Likely is Very Likely. Social Science Research Network. DOI: 10.2139/ssrn.3454796