RealClearPolitics HorseRaceBlog

By Jay Cost

On Penn's Latest Strategy Memo

Hillary Clinton's chief pollster, Mark Penn, offered another strategy memo this week. This time, he discussed the polls - and suggested a way to interpret them. Today, I would like to respond to this memo.

Let's take it point-for-point.

What's happening in the Democratic primary for president?

A lot less than the headlines would suggest.

Iowa continues to be a competitive race while Hillary is maintaining meaningful leads in all the other states and in the national polls that are representative of her Feb 5th strength.

But with the plethora of polls it is becoming increasingly difficult to follow what is a trend, what is a poll without a trend, what is a screened phone poll and what is a computer driven poll. The natural tendency is for those polls that show it closer to get more attention. They are "news."

The big paragraph at the end was what got my attention, and actually induced me to pen a response. Here, Penn mixes a controversial statement with a non-controversial statement - suggesting that the latter implies the former.

I think we would all agree that the media focuses on the polls that show a closer race. However, the first sentence of the paragraph - when we combine it with the penultimate paragraph - is actually quite controversial. It seems to me that Penn is suggesting that if we select our polls judiciously, rather than doing what the headline-hungry press does, we will see that Clinton has a "meaningful" lead in "all the other states" beyond Iowa.

Ironically, the word "meaningful" is, at least for the moment, a meaningless term. Does he mean that her lead is statistically significant? Does he mean that the lead is not just significant, but durable? We cannot yet tell.

Let's see where he wants to take this.

There's yet a new case-in-point of poll confusion today with the release of a slew of Mason-Dixon polls - but a look at their past polls paints a very different story than at first glance. For example, they have Hillary ahead by 3 points today in SC and pundits suggest that this shows how the race has closed. But while other polls showed a strong lead in June, the Mason-Dixon poll had Hillary losing by 9 points in June, so this actually shows Hillary's margin up by 12 points from their last poll and surging. When you look at the facts by tracking results over time from the same poll, she is up, not down. Other polls give her a much wider lead than Mason-Dixon: the latest Pew poll has Hillary ahead by 14 points in South Carolina and the latest ARG poll has her 24 points ahead.

Because predicting primaries is extremely difficult and everyone has their own methodologies, you have to look at polls from the same pollsters to see if there have been changes.

Similarly, the Mason Dixon poll in NH shows a close race with a 3 point lead for Hillary - but their last poll in June gave her a 5 point lead - and a WMUR/CNN poll around the same time had Hillary leading in New Hampshire by 15 points. So Mason-Dixon was low in June and they actually show no statistically significant change in her margin now.

Again, Penn mixes controversial with uncontroversial statements - hoping that the latter validate the former. Here, he suggests that we examine the trend in a given poll to get a sense of the race. This is a valid idea - and it is why Tom, Blake, and Reid are always careful to mention trends when they report new polls on the RCP Blog. However, if we do what he suggests - "look at polls from the same pollsters to see if there have been changes" - we will not find what he asserts - "she is up, not down." At least not if we do it in the correct way.

Penn wants to select June as his baseline. This is not a viable selection - for two reasons. First, our interest is, as he claims, to see if there have been "changes." If we want to identify changes, we should actually look at the polls that immediately preceded the most recent batch. Second, the race was much tighter in June than it has been up until recently. The RCP average of national polls showed Clinton with a lead of less than 10 points over Obama as of June 10. I think we would all agree that Team Obama would love it if the race right now resembled the race in June!

So - what we need to do is compare the latest polls to their immediate predecessors. Let's do New Hampshire first.

Zogby had Clinton up by 15 in September. Now it has her up by 11.

Marist had Clinton up by 22 in November and 22 in October. Now it has her up by 13.

Rasmussen had Clinton up by 10 in November and 16 on October 23. Now it has her up by 7.

ARG had Clinton up by 18 in October and 19 in September. It had her up by 11 in November.

CNN had Clinton up by 23 in September. It had her up by 14 in November.
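The comparison above can be checked with a few lines of Python. This is only a minimal sketch: the names `nh_leads` and `lead_changes` are hypothetical, chosen for illustration, and where a pollster has three readings only the two most recent are used (the latest poll and its immediate predecessor).

```python
# Clinton's lead (in points) in each pollster's previous and latest
# New Hampshire poll, taken from the figures quoted above.
# Format: pollster -> (previous lead, latest lead)
nh_leads = {
    "Zogby": (15, 11),
    "Marist": (22, 13),
    "Rasmussen": (10, 7),
    "ARG": (18, 11),
    "CNN": (23, 14),
}


def lead_changes(leads):
    """Return the latest-minus-previous lead for each pollster."""
    return {name: latest - previous for name, (previous, latest) in leads.items()}


changes = lead_changes(nh_leads)
for name, delta in changes.items():
    # A negative number means the lead shrank since the previous poll.
    print(f"{name}: {delta:+d} points")
```

Measured this way, every one of the five pollsters shows Clinton's New Hampshire lead smaller than in its own previous survey.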

What about South Carolina? It is more difficult to track trends there because fewer polls have been taken - and polling companies are surveying with less frequency. Nevertheless, there are three companies that have polled multiple times in the past few months: Insider Advantage, Rasmussen, and ARG. Insider Advantage and Rasmussen show Clinton's lead shrinking. ARG shows it increasing.

Penn only mentions the ARG poll.

Let's continue with the memo.

2nd case in point: Last week three polls of New Hampshire Democratic primary voters were released. One showed Hillary with a 14 point lead (Marist), one showed Hillary with an 11 point lead (Zogby) and one gave Hillary a 6 point lead (ABC/Washington Post). Which poll got the most attention? The one that showed the closest race - ABC/Washington Post. And poll junkies should also note that New Hampshire polling is particularly difficult because it is often unclear to the last minute which independent voters are coming to which primary - and Hillary has a strong and energized lead with Democrats.

I think Penn gives us a clue here as to what he means by the word "meaningful." We learned that Hillary has a "meaningful" lead. Now we learn that Hillary has a "strong and energized" lead. It seems to me that he is not hinting at simple statistical significance. He is hinting at durability. That is - Clinton's lead now can be expected to endure to Election Day.

I would ask how he could possibly know that from the methodology he offers. The only suggestion that he has made so far is that we watch the changes in a given poll. But we will only wind up with the conclusion that her lead is "meaningful," "strong," and "energized" if we look at certain polls (ARG) and ignore others (Mason-Dixon, Rasmussen).

How do we evaluate his claim?

Penn noted above that each poll has its own particular methodology. This makes it difficult to evaluate Penn's argument - or any assertion about the state of the race. Pollsters are not like academics, who often spend pages and pages describing and defending their methodologies before they actually get to their results. Polling methodologies are treated as proprietary interests, and kept close to the vest. ARG, for instance, gives us almost no hint of how it goes about conducting its polls. Thus, all that we have are these bare numbers, with no indication of how they were created.

Which poll should we select?

Here's what I do. Let's take New Hampshire as an explicative case.

Suppose that we deem five polls to be current: Mason-Dixon, Zogby, ABC News/WaPo, Marist, and Rasmussen. Let's also suppose that we have no a priori idea which poll's methodology is the correct indicator of the preferences of the January 8 electorate (but we assume that one is indeed reasonably unbiased).

So:

If we guessed that Mason-Dixon has the correct methodology, we would expect Clinton to be at 30%.

If we guessed that Zogby has it, we would expect her to be at 32%.

If we guessed that ABC News/WaPo has it, we would expect her to be at 35%.

If we guessed that Marist has it, we would expect her to be at 37%.

If we guessed that Rasmussen has it, we would expect her to be at 33%.

We have five numbers, each of which comes from a different methodology. We do not know which comes from the proper method - but we do know that each has a 1 in 5, or 20%, chance of being derived from the right method.

What should we do?

We should work to minimize our expected error. What number could we select that minimizes how far off we expect to be, given that we do not know which poll is right? Penn wants you to take the 37% and ignore the 30%. That is not the way to minimize your expected error. One way to minimize your expected error (strictly speaking, your expected squared error) is by calculating the statistical average, in which each poll's number is weighted by its 20% chance of being the right one:

(30 × .20) + (32 × .20) + (35 × .20) + (37 × .20) + (33 × .20) = 33.4

[Note: We would get the same estimate regardless of how many polls we believe have accurate methods - so long as we assume that there is at least one.]

This number also happens to be the current RCP average of Clinton's position.
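The averaging argument can be illustrated with a short Python sketch. It rests on an assumption that the post does not state: "expected error" is read here as expected squared error, the criterion under which the average is the minimizing guess. The function name `expected_squared_error` and the variable names are hypothetical.

```python
# Clinton's share (%) in each of the five current New Hampshire polls,
# as listed above.
polls = {
    "Mason-Dixon": 30,
    "Zogby": 32,
    "ABC News/WaPo": 35,
    "Marist": 37,
    "Rasmussen": 33,
}

# Each poll is assumed equally likely (1 in 5) to have the right method.
prob = 1 / len(polls)


def expected_squared_error(guess, values, p):
    """Expected squared miss if each value is the truth with probability p."""
    return sum(p * (guess - v) ** 2 for v in values)


# Probability-weighted average, mirroring the formula in the text.
average = sum(prob * v for v in polls.values())
print(f"average: {average:.1f}")

# Compare the average with the strategy of trusting a single poll.
candidates = {"average": average, **polls}
for name, guess in candidates.items():
    err = expected_squared_error(guess, polls.values(), prob)
    print(f"{name}: guess {guess:.1f}, expected squared error {err:.2f}")
```

Under this criterion the average does better than betting on any single poll, and the two extremes (30% and 37%) do worst. A different error measure could favor a different summary; under absolute error, for instance, the median of the five numbers (33) would be the minimizing guess.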

A-ha! Now we have a way to get not just a sense of the trends within a poll, but we also get a sense of what all the polls are pointing to. It is not necessarily an unbiased estimate. After all, there may be a poll in there with an absolutely lousy methodology. If there is, our estimate is indeed biased. However, our average minimizes this bias in light of the fact that we have no a priori idea which poll (if any) has lousy methodology.

I cannot emphasize enough that this solution is not ideal. It is, rather, the most practical response to the choice of polling companies not to publish their methodologies in any great detail, and therefore to thwart a robust debate among experts as to which is superior. If pollsters had the same reporting protocols as, say, the American Political Science Review, we could improve upon this average by evaluating each poll's methodology - and selecting those whose methodologies are most sound. But, because we are not professional pollsters and cannot go "behind the veil," this is the best we can do.

So, we compute the statistical average. This is the best response to the polling environment. We can compute this average over time, graph it, and get a quick visual idea of where the race stands from this "minimize expected error" perspective.

You probably have seen such graphs before:

[Figure: RCP average of New Hampshire Democratic primary polls over time (rcp-dem-nh.gif)]

So, contrary to Penn's claims - we see that the race has tightened in New Hampshire. If we graph the other contests, we would see that most (but not all) show tightening of various degrees.

And therefore we see that Penn's initial uncontroversial statement about the media's reporting of polls does not validate his controversial statement that Hillary's lead is meaningful, strong, and energized. The latter conclusion comes only when one selects the polls that show Hillary at her strongest. This is not the correct way to get a sense of the race.

-Jay Cost