I keep promising to stop writing about lessons from the election that are applicable to markets, and then I keep finding more examples. So rather than make any promises I cannot keep, let’s just jump right into this.
Since Donald Trump’s surprise victory — though it wasn’t a surprise to those of you with the power of hindsight — there have been numerous after-the-fact explanations for why Trump beat Hillary Clinton. Many appear to be delightful exercises in data mining, the finding of “historical patterns that are driven by random, not real, relationships.” Add to this the assumption that these explanations are durable and will repeat in the future, and you have the makings of a terrible investment process.
Consider the various claims as to what the key to the election was:
Local health outcomes predict Trumpward swings (the Economist)
Education, not income, predicted who would vote for Trump (FiveThirtyEight)
Two economic variables perfectly predict election results (Statistical Ideas)
Clinton won 64 percent of America’s economic activity versus Trump’s 36 percent (Washington Post)
Clinton won the cities, Trump won the suburbs (New York Times)
None of these elements “predicted” anything. Each was the result of an analysis of what had already occurred. After the election, the data were sifted, a cutoff was located in each data set at which Trump voters outnumbered Clinton voters, and a conclusion was reached.
This is classic data mining, and it should never be relied upon to make future forecasts.
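To see how easily a sift like this turns up a "key" variable, here is a minimal sketch in Python. Everything in it is hypothetical: the 50 counties, the 200 candidate variables and the vote swings are all random noise, so no variable has any real relationship to the outcome. The script picks whichever variable correlates best with the swing after the fact, then checks how that same variable fares against 20 freshly drawn "future" elections.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

# All sizes and the seed are illustrative assumptions, not real election data.
rng = random.Random(2016)
N_COUNTIES = 50     # hypothetical counties
N_VARIABLES = 200   # hypothetical candidate explanations
N_FUTURE = 20       # hypothetical future elections

def noise():
    """One column of pure random noise, one value per county."""
    return [rng.gauss(0, 1) for _ in range(N_COUNTIES)]

swing = noise()                                   # the "observed" vote swing
candidates = [noise() for _ in range(N_VARIABLES)]

# Data mining: after the fact, keep whichever variable fits best.
in_sample = [abs(pearson(v, swing)) for v in candidates]
best_index = max(range(N_VARIABLES), key=lambda i: in_sample[i])
best_in_sample = in_sample[best_index]

# Out of sample: the same "winning" variable against new, unrelated outcomes.
future = [abs(pearson(candidates[best_index], noise())) for _ in range(N_FUTURE)]
avg_out_of_sample = sum(future) / N_FUTURE

print(f"best in-sample |correlation|:     {best_in_sample:.2f}")
print(f"average out-of-sample |correlation|: {avg_out_of_sample:.2f}")
```

With this many candidates to choose from, the best after-the-fact fit looks respectable even though it is noise by construction, and it fades once the same variable is asked to explain outcomes it was not selected on. That gap is the difference between a mined pattern and a forecast.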
Salil Mehta, former TARP director of analytics and author of “Statistics Topics,” has been critical of pollsters’ election forecasts. He spent much of the time before the election lecturing them that their models were underestimating the possibility of a Trump victory. In an email exchange, he observed:
There is an increased craving to slice and dice the recent election data, particularly given that the major pollsters have been shamed as they all immensely errored in projecting this year’s election’s victor. All gave President-elect Trump <15% a faux probability of winning. The risk of now retorting with data-mining this single election result is that they often miss an analysis of the predictive errors in this unique match-up (e.g., record high undecideds on Election eve), don’t take into account budding geospatial patterns to validate evidence, and in most case none of this should deceptively be promoted as an election forecasting model.
Correlations are very different from what is required to create a reliable model that correctly forecasts a future election or an investing outcome. Rather than mine data, Mehta suggests that we engage in hypothesis testing.