More on Clinton v. Obama
I received an interesting email yesterday from a reader named George, who wrote in response to my column on Gallup's analysis of the Democratic primary race. George had obviously read my column with great care and attention - and so I was struck when he wrote the following: "In my opinion there are some strong arguments for the notion that you appear to be underestimating Clinton's current advantage, that indeed Gallup is closer to the true picture on this."
I have been dissatisfied for a while with the end result of my writings on Clinton v. Obama. I have been concerned that I might be giving the wrong impression. It seems that I am. Smart readers like George seem to be inferring something about my position that I do not intend.
I think that one of my problems has been that I have not approached the issue with the correct vocabulary. For instance, yesterday I led off my column with the following statement: "Regular readers of mine know that I am not at all inclined to write off Barack Obama. This is not to say that I think he is the likely nominee of the Democratic Party. My point has simply been that people are underestimating his chances." This is a true statement. However, it is not worded nearly as precisely as it should be. In particular, I meant the word "estimate" in a way that differed from the way I think that George meant it. This is, of course, my fault - and I intend to clarify my position today.
Here is how I would characterize my feeling about the Democratic primary. I am not in disagreement with pundits or analysts who argue that Clinton is expected to be the nominee. I would estimate that as well. But estimates such as these have two relevant features. There is the expected value. Namely, exactly what do you think will happen? That is your expected value. But there is also the variance - which is not really discussed much by pundits. The variance is a measure of how far the real value might stray from your expected value - and so it reflects how confident you are that the two will match.
The higher the variance, the less confident you are that the real value will match your expected value. Take a simple example. A person who says that the Bengals will score just six points against the Steelers next Sunday is expecting, obviously, just six points to be scored. But he also sees no potential for variation around this prediction. In other words, he sees no variance. On the other hand, a person who predicts that the Bengals will score between three and nine points against the Steelers is also predicting that six points will be scored. However, he sees some potential variation around this estimate. The expected values are the same. The difference is in the variances.
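To put rough numbers on the Bengals example, here is a minimal sketch. It assumes the second forecaster treats three, six, and nine points as equally likely outcomes, which is my simplification of "between three and nine" and not something either forecaster said. The names `certain_forecast`, `hedged_forecast`, and `summarize` are made up for illustration.

```python
from statistics import mean, pvariance

# Hypothetical forecasts for the Bengals' score.
# The first forecaster sees exactly six points, every time.
certain_forecast = [6, 6, 6]
# The second sees three, six, or nine points, treated here as
# equally likely (an assumption made only for this illustration).
hedged_forecast = [3, 6, 9]


def summarize(outcomes):
    """Return (expected value, variance) for equally likely outcomes."""
    return mean(outcomes), pvariance(outcomes)


for label, outcomes in [("certain", certain_forecast),
                        ("hedged", hedged_forecast)]:
    expected, variance = summarize(outcomes)
    print(f"{label}: expected value = {expected}, variance = {variance}")
```

Both forecasts come out with the same expected value. Only the variance differs, and it is zero for the forecaster who allows no variation at all.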
So - my point is that I agree with pundits on the expected value of the Democratic primary. I expect Clinton to win the nomination. Where I disagree is on the variance. My argument is that it should be higher than pundits have been making it out to be. That was my intended point in yesterday's column. Gallup's arbitrary cutoff point obscured a problematic data point, 2003/04. It therefore made the variance seem lower than it actually is. Gallup was therefore committing a Type I error. That is, they were identifying something as being true that might not be true.
Hopefully, you'll appreciate the conundrum that I have been facing in working through this issue in my own head. I have not seen anybody explicitly discuss the variance of their estimate. This is not something that pundits do. The variance does get mentioned - but usually in a sideways fashion. You'll see it come through in the word choices that pundits make - specifically in the adverbs or adjectives. Is Barack Obama "trailing" or is he "trailing badly?" The latter is a statement that inclines one to lower the variance - i.e. not only is Obama trailing now, but this trailing makes it extremely unlikely that he will be able to catch up. Is Hillary Clinton "leading" or is she "unstoppable?" Again, the choice of word implies a different variance - the former allows for some variance, the latter obviously does not. This is what has sparked a kind of visceral reaction from me over the last weeks and months. I have been searching for the right way to articulate it - and until today, I have been left unsatisfied with my various attempts.
Let me just briefly review why I am inclined to a higher variance, even though I expect the same result:
(1) Public opinion is often susceptible to instability. On the national level, voter preferences for candidates have not really been formed by any kind of electoral campaign. Instead, they have been formed by the media dialogue on the campaign - of which average voters are only marginally aware. That is, most voters don't watch Meet the Press, the debates, read the Horse Race Blog ;-), etc. They only pick up the dialogue in dribs-and-drabs. They are capable of "regurgitating" this dialogue back to the press - thus giving the latter the impression that these positions are more well-formed and stable than they actually are. So, as events change, we might expect public opinion to change as well. Generally, we need to be wary of putting too much stock in the stability and foundation of public opinion. You can call this my "John Zaller Hang Up."
(2) We saw something like this happen in 2004. A candidate who was at about 10% for many months suddenly and dramatically jumped to 50%. He went from fourth to first overnight. This is a sign that public opinion before the first primaries can indeed be susceptible to change.
(3) Even if public opinion is less susceptible to modification this time around - it seems sufficiently susceptible to alter the dynamics of the race. This is what yesterday's thought experiment was intended to demonstrate. That is, a win in Iowa and/or New Hampshire would probably not give Obama a 40% boost. However, it could give him a 20% boost, some of which would come at Clinton's expense.
(4) Obama is a good candidate and could very well win Iowa. He has lots of money, a strong organization, and a message that I think could sell. It will be interesting to see whether his media blitzes in New Hampshire have any effect - because I think he could play there, too.
(5) We have very few previous observations from which to draw inferences about this year. It is a mathematical fact that, all else being equal, as the number of observations decreases, the variance of an estimate based on them increases. The fact that we only have seven data points - all of which differ greatly from one another - makes it more difficult for us to infer what will happen in January based upon what the October polls are saying.
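Point (5) can be made concrete with the textbook formula for the standard error of a sample mean, the sample standard deviation divided by the square root of the number of observations. This is only a sketch: the spread `ASSUMED_SD` is a made-up number, not something computed from the seven past primaries, and the formula assumes independent observations, which past primary seasons may not be.

```python
import math


def standard_error(sample_sd, n):
    """Standard error of a sample mean: sample_sd / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive number of observations")
    return sample_sd / math.sqrt(n)


# Hypothetical spread among past observations, in percentage points.
# This value is invented for illustration only.
ASSUMED_SD = 10.0

# Seven observations, as in the text, compared with larger samples.
for n in (7, 28, 112):
    print(f"n = {n:3d}: standard error = {standard_error(ASSUMED_SD, n):.2f}")
```

Each fourfold increase in the number of observations cuts the standard error in half, so with only seven data points the uncertainty around any estimate stays comparatively wide.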
So, the bottom line: there is a difference between what we believe will happen (expected value) and how confident we are that we will find what we believe will happen (variance). My disagreement is over the variance, not the expected value. Put another way, I do not disagree with the conclusion of pundits and analysts. My disagreement is more with the confidence with which they offer these conclusions.


