Unpredictable Iowa

Two items crossed my path yesterday that underlined the unpredictability of the Iowa Democratic caucus. The first comes from Ana Marie Cox of Time:

The Edwards campaign is alive and well in Iowa. Privately, rival campaigns concede that Edwards would probably win if the caucuses were held, say, tonight. Says one organizer, "His supporters are largely previous caucus-goers; you don't have to convince them very hard to go again. Everyone else is going to need all the convincing we can manage in the next month and a half."

The second comes from Richard Wolfe of Newsweek:

Is John Edwards in trouble in Iowa? Peg Dunbar thinks so. She signed up as a county chair for Edwards in the northeastern town of Waverly earlier this year, after backing the former senator's campaign in 2004. Now she has changed her mind and switched to Hillary Clinton. "John Edwards has been in Iowa for four and a half years and he's in third place," she says. "He should be in first place. Granted, it's very, very close. But I don't see him going anywhere and I don't go with a loser."

Dunbar is one of four county chairs--essential figures in any Iowa campaign--who have backed out since being identified as Edwards chairs in a June press release. Ernie Schiller of Lee County says he's now undecided, Frank Best of Louisa County has switched to Obama and Jody Ewing is supporting Bill Richardson.

So, what is going on in Iowa? How can rival campaigns see Edwards winning as of today, but Richard Wolfe see him losing?

I think the problem is that we do not have a reliable metric to measure the state of the race. Polls are of limited utility for gauging Iowa Democrats. This is a subject I discussed earlier in the year. There are two problems. The first is devising a sample of voters. Turnout in the Iowa caucuses is difficult to predict because it takes a good degree of devotion to participate. Chris Cillizza discussed this last week, writing:

Figuring out who is going to vote is always the most basic challenge for any pollster. Past results provide a guide but can never be taken as foolproof as turnout dynamics change from election to election.

This is especially true in Iowa's caucuses where an extremely small number of registered voters turn out to participate, voters can register the day of the caucus and turnout patterns fluctuate widely from caucus to caucus.

The challenge that anyone polling Iowa must face then is how to select an accurate sample of voters. Do you use the list of registered voters as your baseline? Or do you use the far narrower caucus list, which lists those that have participated in the most recent caucuses, to create your sample?

There is a second problem that is not discussed as much. A poll of Iowa Democratic caucus goers does not really mimic the process in which they participate. In a general election, you go into a voting booth, select your first choice, leave the booth, and drop the ballot in the box. A poll that asks you for your first choice and then moves on to other questions therefore does a reasonably good job of mimicking the act of voting.

However, this is not the experience of Democratic caucus goers. Iowa Democrats begin by standing in an area designated for their first choice candidate. Then, for thirty minutes, they either persuade or are persuaded by others to switch their choices. At the end of the half hour, electioneering is halted and caucus officials count the number of supporters that each candidate has. Candidates who fall short of the viability threshold (15% of attendees in most precincts, 25% in some smaller ones) are deemed not to be viable. Another thirty minutes of electioneering is then granted. The supporters of nonviable candidates must find new candidates to support, team up with supporters of other nonviable candidates to make their candidate viable, or abstain.
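To make the mechanics concrete, here is a minimal sketch of one realignment round in a single precinct. Everything in it is an assumption for illustration: the precinct of 100 attendees, the head counts, the second-choice splits, and the function name realign are all hypothetical. It also ignores two things the real process allows, namely persuasion among supporters of viable candidates and nonviable groups combining to reach viability.

```python
def realign(first_round, second_choices, threshold_pct=15):
    """Apply one realignment round to a single (hypothetical) precinct.

    first_round: candidate -> head count after the first thirty minutes.
    second_choices: nonviable candidate -> {target candidate: head count}.
    Supporters with no listed viable second choice are counted as abstaining.
    """
    total = sum(first_round.values())
    # Integer arithmetic avoids floating-point trouble at the threshold.
    viable = {c for c, n in first_round.items() if n * 100 >= threshold_pct * total}
    final = {c: n for c, n in first_round.items() if c in viable}
    abstained = 0
    for cand, n in first_round.items():
        if cand in viable:
            continue
        moved = 0
        for target, k in second_choices.get(cand, {}).items():
            if target in viable:
                final[target] += k
                moved += k
        abstained += n - moved
    return final, abstained


# Hypothetical precinct: 100 attendees, so 15 supporters are needed for viability.
first_round = {"Clinton": 30, "Obama": 26, "Edwards": 24, "Richardson": 12, "Biden": 8}
second_choices = {
    "Richardson": {"Clinton": 4, "Obama": 3, "Edwards": 5},
    "Biden": {"Clinton": 1, "Obama": 5, "Edwards": 2},
}

final, abstained = realign(first_round, second_choices)
print(final, abstained)
```

In this made-up precinct a first-round gap of four supporters between Clinton and Obama shrinks to one after realignment, which is exactly the kind of movement a first-choice poll cannot see.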

As far as I know, there is no poll that fully reflects this process, which speaks to two distinctive elements of voter psychology:

(a) How strong is your preference for your candidate? And we are not just talking about claims of strength, which some polls measure. We are talking about whether you can withstand being cajoled for half an hour. It is one thing to claim that your support is strong. It is another thing to endure thirty minutes of the Iowa caucus. This is also where organization is extremely important. To pry wavering voters from another candidate, a candidate needs organizers who are better at politicking than the other guy's organizers.

(b) If your candidate is not viable, who's your second choice among the viable candidates? For voters who support a viable candidate, second choices only become relevant if they change their primary candidate (either before or at the caucus). For voters who support nonviable candidates, second choices are quite important. Those whose candidates are deemed nonviable might add up to a substantial number. In the University of Iowa Hawkeye Poll taken last month, about 16% of Democratic respondents claimed support for a candidate who would be deemed nonviable at a caucus meeting that mimicked the statewide numbers. What is tricky here is that most polls I have seen do not tell us who the supporters of nonviable candidates support secondarily. Zogby does ask for the second choices of respondents supporting nonviable candidates - but, importantly, we are dealing with a very small subsample of such voters. The supporters of nonviable candidates would only number about 75 people per sample. This makes it very hard to estimate with any statistical confidence whether any candidate is receiving a boost because of them. [Also, Zogby might run afoul of the ecological fallacy. Just because Richardson, for instance, does not reach 15% on a statewide level does not mean that all of his supporters have to be reallocated. He could be above 15% in a given precinct - and therefore those supporters would not have to be reallocated.]
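On the subsample point, a quick calculation shows how little 75 respondents can tell us. This is only a sketch under textbook assumptions: simple random sampling, a 95% confidence level, and the worst case of a 50/50 split. The full-sample size of 500 is a hypothetical figure chosen for comparison, not a number reported by any of the pollsters mentioned here.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion p estimated from a
    simple random sample of size n (normal approximation)."""
    return z * math.sqrt(p * (1 - p) / n)


subsample_moe = margin_of_error(75)   # supporters of nonviable candidates
full_moe = margin_of_error(500)       # hypothetical full sample, for comparison

print(f"n=75:  +/- {subsample_moe * 100:.1f} points")
print(f"n=500: +/- {full_moe * 100:.1f} points")
```

Under these assumptions the subsample carries a margin of error of roughly eleven points, more than double that of the hypothetical full sample. A second-choice estimate from that group would have to show a very lopsided split before it meant much.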

Iowa is one of the reasons I was arguing that Hillary Clinton is not inevitable (before it was popular to do so!). Polling methods do such a poor job of mimicking the caucus process that I think we have to allow a wide band of uncertainty around each candidate's number. The RCP Average, as of today, has Clinton at 30%, Obama at 23.6%, and Edwards at 19.6%. But the bias that the caucus system might induce in the polling is such that I do not think any statements can be made about who is actually in the lead. At best, they just give us some basic purchase on who might win.

Note that this is not a matter of the margin of error. Adding +/-3% to each candidate's total is not necessarily going to solve this problem. The margin of error is a matter of efficiency. What we are discussing here is bias, or the failure of our polling data to approach the level of support that a candidate actually enjoys. Assume no margin of error, and we might still have to "adjust" the poll results by some unknown factor. That last phrase is the trickiest part. Bias is very difficult to measure beforehand. The unknown factor may be very large. It may be zero. We just do not know.

The word "bias" actually has a technical connotation that I am relying on here. It implies that the polls are systematically overestimating or underestimating a candidate's level of support. In such a situation, it is not that our polling data reflects the random variation we see any time we try to measure a large population via a small sample. That's what the margin of error accounts for. That is a matter of efficiency, not bias. Those sorts of variations are expected to cancel each other out - and therefore, on average, our samples will correctly measure the population. With bias, the variations do not cancel each other out. Instead, they reinforce one another - and so, on average, we are left with a difference between our sample and the population. Thus, instead of being inefficient - we are just plain wrong.
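The distinction can be seen in a small simulation. This is a sketch with invented numbers: a candidate whose true support among actual caucus goers is assumed to be 25%, and a polling method assumed to reach a pool of respondents in which that support is only 20%. Neither figure comes from any real poll, and the function names are hypothetical.

```python
import random

TRUE_SUPPORT = 0.25    # assumed support among people who actually caucus
POLLED_SUPPORT = 0.20  # assumed support among the people the poll reaches


def simulate_poll(support, n, rng):
    """Share of n randomly drawn respondents who back the candidate."""
    return sum(rng.random() < support for _ in range(n)) / n


def average_of_polls(support, n=400, polls=1000, seed=0):
    """Average result across many independent polls of the same pool."""
    rng = random.Random(seed)
    return sum(simulate_poll(support, n, rng) for _ in range(polls)) / polls


unbiased_avg = average_of_polls(TRUE_SUPPORT)    # sampling error only
biased_avg = average_of_polls(POLLED_SUPPORT)    # sampling error plus bias

print(f"truth: {TRUE_SUPPORT:.3f}")
print(f"average of unbiased polls: {unbiased_avg:.3f}")
print(f"average of biased polls:   {biased_avg:.3f}")
```

Any single poll in either series bounces around by a few points. Averaging washes that noise out in the first series, so the average lands near the truth. In the second series the average settles near the wrong number, and taking more polls does nothing to close the gap.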

Each of the factors I mentioned above could bias the polls.

First, it might be the case that the respondents who will go to the caucus have systematically different preferences than the respondents who will not go to the caucus but who are not filtered out by the pollsters' likely voter screening processes. On the flip side, it might be the case that the voters who are being filtered out by the pollsters' screening process are indeed going to attend the caucus, and their preferences diverge systematically from the voters not filtered out. Either scenario is intuitively plausible - as it is plausible that there is a correlation between candidate support and the enthusiasm measured by likely voter screens. Cox's essay was hinting at the former scenario: Edwards' supporters are more certain to go to the caucus. Therefore, the actual set of caucus goers is more pro-Edwards than the polls indicate. If this were the case, the polls would systematically underestimate the strength of Edwards' support.

Second, it might be the case that there are systematic differences in strength of support per candidate - and therefore systematic differences in levels of support before and after the time allotted for politicking. For instance, Obama's supporters are younger than the average caucus goer. And younger supporters may be more susceptible to persuasion at the caucus. If this is the case, then the polls are systematically overestimating Obama's level of support because they fail to mimic the process those supporters will undergo on caucus night. On the flip side, perhaps Obama's relatively strong organization means that it is better able to peel away supporters of other candidates, even the viable ones. In this case, the polls would systematically underestimate Obama's level of support.

Third, it might be the case that there is a systematic difference between the first choices of all and the second choices of those primarily supporting nonviable candidates. Perhaps Biden's supporters tend to support Obama secondarily at a rate that is greater than the general population's level of primary support. If 24% of the whole public supports Obama first, maybe 40% of Biden supporters support Obama second. If Biden is deemed nonviable, those supporters have to pick another candidate. The former Biden supporters go more to Obama than the general public did (40% instead of 24%) - and thus Obama's position relative to Clinton and Edwards will be systematically better than what the polls are telling us.
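The size of this third effect depends on how many supporters are being reallocated, which a little arithmetic makes plain. The 24% and 40% figures are the illustrative ones from the paragraph above. The 5% share for Biden is a hypothetical number, and applying the same 40% rate to the full 16% of nonviable support from the Hawkeye Poll is an assumption made purely to show an upper range.

```python
def extra_points(nonviable_share, second_choice_rate, baseline_rate):
    """Percentage points a candidate gains beyond what he would get if the
    reallocated supporters split the way the general public does."""
    return nonviable_share * (second_choice_rate - baseline_rate)


# Hypothetical: Biden holds 5% of caucus goers.
biden_only = extra_points(5, 0.40, 0.24)

# Assumption: every nonviable supporter (16%) breaks to Obama at the 40% rate.
all_nonviable = extra_points(16, 0.40, 0.24)

print(f"Biden only:    {biden_only:.2f} extra points for Obama")
print(f"All nonviable: {all_nonviable:.2f} extra points for Obama")
```

Under these assumptions the boost runs from under one point to about two and a half. That is small in isolation, but it is not trivial when the top three candidates are separated by only a few points each.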

Will any of these biases come about? Not necessarily. That is the troublesome feature of bias. We have to know the population values before we can know whether a polling sample is biased. We do know that there are differences between the ways the polls are conducted and the way the Iowa caucus is conducted. Maybe those differences systematically favor one candidate over another. Maybe not. We will not know until after the caucus.

And so, we are left with what I think is a relatively unpredictable event. We can be sure that Clinton, Obama, or Edwards will win the Iowa caucus. And maybe if the polls come to show a large and consistent break toward one candidate or another - we might be able to say more than this. But, at this point, with the differences between these three candidates in the RCP average being only about 10 percentage points, I do not think we can say more than that the victor in Iowa will be one of these three.

-Jay Cost