NYT Sparks Controversy with Embrace of Non-Probability Surveys

If you’ve ever tried to pitch a non-probability PR survey to The New York Times, they’ve turned you down on the grounds that it is not representative. (Pro tip: Pitch op-ed columnists and bloggers, as Google Consumer Surveys did successfully with Paul Krugman.) Given this, many of us were excited to see the recent announcement that the reporters of the Grey Lady would use non-probability online surveys:

As the young voters who are less likely to respond to telephone surveys become an ever-greater share of the population over time, it is probably more important for analysts to have an ensemble of surveys using diverse sampling and weighting practices.

YouGov [an online research firm] has emerged as a part of that ensemble. It has tracked many of its respondents over months, if not years, which gives it additional variables, such as a panelist’s voting history, to try to correct for non-response. After the first 2012 debate, YouGov showed less of a swing than many other polls, and its final pre-election polls were as good as or better than many other surveys in forecasting the results.

There are still questions about the effectiveness of web panels, which can reach only the 81 percent of Americans who use the Internet [Pew says 89% are on the Internet]. That’s worse than the 98 percent of households that can be reached by a live interview telephone survey, although it’s better than the 63.5 percent of Americans who have a landline telephone and can therefore be contacted by automated polling firms, which are prohibited by federal regulations from calling people on their cellphones….

All of this is controversial among survey methodologists, who are vigorously debating whether a non-probability web panel should be used for survey research. [emphasis added] At the same time, they’re also debating whether the sharp rise in non-response is undermining the advantages of probability sampling. Only 9 percent of sampled households responded to traditional telephone polls in 2012, down from 21 percent in 2006 and 36 percent in 1997, according to the Pew Research Center.

While the methodology debate rages, it’s probably best to have an eye on a diverse suite of surveys employing diverse methodologies, with the knowledge that none are perfect in an increasingly challenging era for public-opinion research.

As a guardian of research standards, AAPOR, the American Association for Public Opinion Research, was not happy:

Recent actions involving the removal of meaningful standards of publication for polling data by at least one leading media outlet and the publication of polling stories using opt-in Internet survey data have raised concerns among many in the field that we are witnessing some of the potential dangers of rushing to embrace new approaches without an adequate understanding of the limits of these nascent methodologies. The American Association for Public Opinion Research (AAPOR), the leading association of public opinion and survey researchers, recognizes that the ways in which public opinion is formed, expressed, conceptualized and measured continue to grow and change. We embrace rigorous empirical testing of these new approaches and methodologies and encourage assessments of their viability for measurement and insight. It is essential, however, that the use of any new methods be conducted within a strong framework of transparency, full disclosure and explicit standards. The absence of these foundational necessities removes the ability for any of us to understand the quality and validity of the information reported.

Pew Research was only slightly less cautious, in a Q&A that Drew DeSilver conducted with Scott Keeter, director of survey research:

If this proves to be a successful endeavor for the Times and CBS News, does that mean other pollsters will embrace non-probability sampling?

Not necessarily. It’s important to keep in mind that online non-probability panels vary in quality, just as probability-based surveys do. One of the most important points in the AAPOR Task Force report is that there’s no single consensus method for conducting “non-probability sampling.” There are many different approaches, and most of them don’t have the public record of performance that YouGov has. YouGov has been conducting public polls in elections for many years. As a result, they have a track record that can be compared with probability-based polls. Until we have more organizations conducting polls in advance of elections and explaining their methods in detail, I believe that adoption of non-probability sampling for political polling will proceed slowly.

Isn’t Pew Research using an online panel right now?

We do have a panel – it’s called “The American Trends Panel” – but it’s very different from the one that the Times and CBS are using. It’s based on a probability sample, and while most of the interviews are conducted online, we also have panelists who don’t use the internet. We interview those individuals by mail or phone. Here’s a link to more detail.

Is Pew Research ever going to use the kind of online non-probability panel that the Times and CBS are using?

Yes, we will – but the real question is what we will use it for. Our current standards permit the use of non-probability samples for certain purposes, such as conducting experiments or doing in-depth interviews. In addition, we have embarked on a program of research to help us better understand the conditions under which non-probability samples can provide scientifically valid data. We also are exploring how to utilize non-survey data sources, which by their very nature tend to come from “samples” that are not random. But until we understand the pros and cons of those methods a lot better, we’re going to be very cautious about incorporating them into our research.

On the other side, Andrew Gelman and David Rothschild were extremely unhappy with AAPOR. Writing in The Washington Post, they said:

We work with YouGov on multiple projects, so we hardly claim to be disinterested observers, but our experiences in this area make us realize how ridiculous the AAPOR claims are. Many in the polling community, including ourselves, have the exact opposite concern that, with some notable exceptions such as The New York Times, leading organizations are maintaining rigid faith in technology and theories or “standards” determined in the 1930s. We worry that this traditionalism is holding back our understanding of public opinion, not only endangering our ability to innovate but putting the industry and research at risk of being unprepared for the end of landline phones and other changes to existing “standards.”…

In practice, the probability pollster needs to make massive and changing assumptions about the method of reaching people, as the breakdown of landline-only, cellphone/landline and cellphone-only households switches. There is no known ground truth to how people can be reached and the quantity of people at each phone. As the response rates fall below 10 percent, polls need to make more decisions about how to adjust for systematic differences between respondents and the general population. And pollsters continue to make decisions about their models for the likelihood that a respondent will turn out to vote.

In short, probability pollsters need to make many assumption selections into their polls, just as YouGov does! An important difference is that while YouGov examines their selection issues aggressively and publicly, probability pollsters sometimes ignore the growing lists of selection issues they face. While academics and practitioners alike have studied the issue, traditional probability polling still reports a margin of error that is based on the assumption of 100 percent response rates for a random and representative sample of the population. AAPOR writes of non-probability polling: “In general, these methods have little grounding in theory and the results can vary widely based on the particular method used.” In fact, the theory used by YouGov and in other non-probability polling contexts is well-founded and publicly disclosed, based on the general principles of adjusting for known differences between sample and population.
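To make "adjusting for known differences between sample and population" concrete, here is a minimal sketch of post-stratification weighting, one of the simplest adjustments of this kind. It is an illustration only and not YouGov's actual sample-matching method; the function names, the two age groups, the population shares, and the ten made-up respondents are all assumptions invented for the example.

```python
from collections import Counter


def poststratification_weights(sample_groups, population_shares):
    """Return one weight per respondent so that the weighted share of
    each group in the sample matches its known share of the population."""
    n = len(sample_groups)
    counts = Counter(sample_groups)
    # weight = (population share of group) / (sample share of group)
    return [population_shares[g] / (counts[g] / n) for g in sample_groups]


def weighted_mean(values, weights):
    """Weighted average of the survey answers."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical data: young respondents are 20% of the sample
# but (by assumption) 40% of the population.
groups = ["young"] * 2 + ["old"] * 8
answers = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]  # 1 = supports the candidate
population_shares = {"young": 0.4, "old": 0.6}

weights = poststratification_weights(groups, population_shares)
raw = sum(answers) / len(answers)
adjusted = weighted_mean(answers, weights)

print(f"raw estimate:      {raw:.3f}")
print(f"adjusted estimate: {adjusted:.3f}")
```

In this toy sample the unweighted estimate is 30 percent support, while the weighted estimate rises to about 47.5 percent because the under-represented young respondents count for more. The adjustment is only as good as the assumption that respondents within each group resemble the non-respondents in that group, which is the point in dispute for probability and non-probability polls alike.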

Reg Baker also believes that AAPOR gets it wrong:

I don’t deny there is an alarming amount of online research that is just plain bad (sampling being only part of the problem) and should never be published or taken seriously. But, as the AAPOR Task Force on Non-Probability Sampling (which I co-chaired) points out, there are a variety of sampling methods being used, some are much better than others, and those that rely on complex sample matching algorithms (such as that used by YouGov) are especially promising. The details of YouGov’s methodology have been widely shared, including at AAPOR conferences and in peer-reviewed journals. This is not a black box.

On the issue of transparency, AAPOR’s critique of the Times is both justified and ironic. The Times surely must have realized just how big a stir their decision would create. Yet they have done an exceptionally poor job of describing it and disclosing the details of the methodologies they are now willing to accept and the specific information they will routinely publish about them. Shame on them.

But there also is an irony in AAPOR taking them to task on this. Despite the Association’s longstanding adherence to transparency as a core value, they have yet to articulate a full set of standards for reporting on results from online research, a methodology that is now almost two decades old and increasingly the first choice of researchers worldwide.

…I have long felt that AAPOR, which includes among its members many of the finest survey methodologists in the world, would take a leadership role here and do what it can do better than any other association on the planet: focus on adding much needed rigor to online research. But, to use a political metaphor, AAPOR has positioned itself on the wrong side of history.

The gold standard is probability surveys. But when it is time to leave the gold standard, sample matching from firms like YouGov and Toluna floats to the top as currently the most representative non-probability method of understanding general populations. Expect Google Consumer Surveys, RIWI, and others to continue improving their own innovative approaches to non-probability representativeness.

Yes, AAPOR is on the wrong side of history.

Jeffrey Henning, PRC, is president of Researchscape International, which provides “Do It For You” custom surveys at Do It Yourself prices. He is a Director at Large on the Marketing Research Association’s Board of Directors. You can follow him on Twitter @jhenning.

