RealClearPolitics HorseRaceBlog

By Jay Cost

On the ARG Poll

Anybody who checked Drudge today will have seen that there is a "shock poll" that puts Hillary Clinton 15 points in front of Barack Obama in Iowa. The polling company that produced the poll is ARG, and this is what it had to say about its results:

Hillary Clinton leads Barack Obama among women 38% to 21%, which is unchanged from a week ago (Clinton 36%, Obama 23% among women). Obama has lost ground among men to John Edwards and Clinton. Among men, Clinton is at 28%, Edwards is at 27%, Obama is at 16%, and Joe Biden is at 11%. A week ago, Obama was at 27% among men, followed by 21% for Clinton and 19% for Edwards.

This poll might indeed signal a trend - the first sign of a swing back to Clinton among Iowa Democrats. Unfortunately, we will not know for a few days, as polling companies presumably suspended operations over Christmas. In the meantime, I have a few caveats about this poll - just some basic warnings about why we should not over-interpret these results.


1. ARG polled the weekend before Christmas, from 12/20 to 12/23. This might not be the best time to construct a sample of likely Iowa voters. No other poll I know of has come out with a sample taken from those days. This is a sign that most other pollsters were wary about Christmas weekend.

2. The ARG poll has Clinton up and Obama down by statistically significant amounts relative to its last poll (12/16 to 12/20). On the Republican side, it has Mike Huckabee down and Ron Paul up by statistically significant amounts. This is a lot of movement - four candidates made statistically significant moves in the course of three days. Recall the last point, and note that these are three days when respondents probably were not thinking much about politics. December 20th to the 23rd are days usually filled to the gills with last-minute holiday preparations. They are not great days for reflecting on the state of the presidential campaign. Thus, this movement might be due to the sampling effects mentioned in Point 1.

3. There are other elements of the poll that just don't scan with me. For instance, it shows Fred Thompson at 3% and Alan Keyes and Duncan Hunter both at 2%. ARG has shown Thompson low over the last few weeks - so this would not be a consequence of ARG's internal sampling method thrown off by the holiday weekend. But its last two samples estimated Thompson's support well below the rest of the Iowa polls. And 3% just does not pass the "smell test."

4. Mark Blumenthal has noted several interesting facts about ARG. First, they sample first-time Iowa caucus goers more heavily than any other poll (on the Democratic side). This is probably why they usually have Edwards below where he is in the RCP Iowa Democratic average. Edwards is doing relatively well among previous caucus goers, but ARG is "diluting" their influence with the first-timers. Now, ARG's intuitions about first-time caucus goers may be correct, but they sit at the extreme among pollsters on this issue. Second, they did an extremely poor job of reporting their sampling methodology when Blumenthal requested it. They would not provide any information about respondent demographics, and they would not provide information about the number of long-time caucus goers in their Republican samples.

5. Just because differences between polls are statistically significant does not mean that they are necessarily caused by changes in the population. Clinton, Obama, Huckabee, and Paul have all made statistically significant moves - but some of these could still be statistical blips induced by the sample. This is as good a time as any to review exactly what statistical significance is.

The technical language that describes the margin of error usually reads something like this: "We are 95% confident that the true values are within +/- 3% of the poll's results."

This refers to Type I error, or the error of the false positive. It means that 95% of the time, when you take a poll and get 17%, the real world value will be between 14% and 20%. This also means that 5% of the time (or one time out of 20), it will be outside this range. This is the poll's tolerance of Type I error. The chances are 5% that you will have a false positive - you will believe that the real value is between 14% and 20% when in fact it is not.
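For readers who want to see where a figure like +/- 3% comes from, here is a minimal sketch using the textbook formula for the margin of error of a proportion from a simple random sample. The sample size of 1,067 is a hypothetical chosen because it produces roughly +/- 3%; it is not ARG's sample size, which is not given here, and real pollsters may adjust for weighting and design effects that this sketch ignores.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion.

    p: observed proportion (0.5 is the worst case, usually what is reported)
    n: sample size (hypothetical here, not taken from any actual poll)
    z: critical value; 1.96 corresponds to 95% confidence
    Assumes a simple random sample with no weighting or design effect.
    """
    return z * math.sqrt(p * (1.0 - p) / n)

# Hypothetical poll of 1,067 respondents, worst-case proportion of 50%
moe = margin_of_error(0.5, 1067)
print(f"Margin of error: +/- {moe * 100:.1f} points")
```

Under these assumptions the result comes out at about 3 points, which is why so many national polls report that number.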

But suppose you have 20 different statistics you are looking at. What are the chances that the real world value of at least one of them will be outside the margin of error simply due to sample effects? If the 20 statistics are independent of one another, it is about 64%!
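The arithmetic behind that 64% is short enough to check directly. The sketch below assumes each statistic has a 5% chance of falling outside its margin of error and that the statistics are independent of one another. That independence is a simplifying assumption: numbers drawn from the same poll are correlated, so the real figure will differ somewhat. The function name is just for illustration.

```python
def familywise_error_rate(alpha, k):
    """Chance that at least one of k independent statistics lands
    outside its margin of error, when each does so with probability alpha."""
    # Probability that all k stay inside is (1 - alpha) ** k;
    # the complement is the chance that at least one strays.
    return 1.0 - (1.0 - alpha) ** k

# How the risk grows as you look at more numbers (alpha = 5% each)
for k in (1, 5, 20):
    rate = familywise_error_rate(0.05, k)
    print(f"{k:2d} statistics: {rate * 100:.1f}% chance of at least one blip")
```

With one statistic the risk is the familiar 5%; with five it is already above 20%; with twenty it is roughly 64%.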

This is something that is rarely noted when looking at poll trends - it is called the experiment-wise Type I error rate. When you look at the polls to divine trends, you are implicitly doing some form of statistical hypothesis testing. You are trying to determine whether changes are due to sampling error, or whether they are due to shifts in the population. To do this, you have to assume that sampling error will only explain so much variation in the polls. Usually (95% of the time), this assumption holds up. Occasionally (5% of the time), it does not. When it does not hold, you have committed a Type I error. And the more poll numbers you look at, the more likely it is that you have committed at least one.


Casual readers, please take note: I am not predicting that this is a blip. Contrary to what some have assumed, I do not make predictions about the ways the polls will move. That is a fool's errand. My point here is simply that it is possible that this movement is induced by sampling effects - and we should be careful not to over-interpret these results.

-Jay Cost