Aarti Iyer
Science Student Council, Social Psychology Representative, UC–Santa Cruz
Thomas K. Burdenski, Jr.
Science Student Council, Chair and Quantitative Psychology Representative, Texas A&M
(This article was first published in the Winter 2002 issue
of the APAGS Newsletter.)
In the wake of the September 11th
terrorist attacks, public opinion polls abound proclaiming the percentage of
Americans who support military action in Afghanistan and the percentage of
Americans who do not. Do these polls accurately reflect the way Americans think
about the war? Are the results generalizable to the larger population? Often,
the sampling strategies used by pollsters suggest that the answer to these
questions is "no."
The results of these opinion polls can play an important role in influencing
government policy and action. It is thus very important for social
scientists, including psychologists, to pay close attention to how these polls are
conducted. By definition, all polling agencies (e.g., Gallup Poll) use sampling
procedures to gather responses from a subset of the population and assume that
these answers reflect the larger population’s views.
Often, sampling makes the results of a poll more accurate:
collecting and handling smaller volumes of responses decreases the chances of
human error and data mismanagement. Sampling strategies also save time at the
data collection and data entry stages of polling. This means that the responses
can be analyzed and interpreted faster, which makes results of the survey more
timely and relevant. In this way, sampling strategies allow survey results to
inform and influence policy in close to "real-time."
The advantages of sampling are only realized, however, when every effort is
made to represent all of the relevant population subgroups within the sample. In
other words, the important characteristics (e.g., age, gender, race/ethnicity,
income, educational level) are distributed similarly in both the sample and the
population. This is particularly important when demographic characteristics
might influence respondents’ answers.
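One way to make this check concrete is to compare each subgroup's share of the sample with its share of the population. The sketch below is only an illustration: the function name `share_gaps`, the age categories, and every number in it are hypothetical and do not come from any real poll.

```python
def share_gaps(sample_counts, population_shares):
    """Return (sample share - population share) for each category."""
    total = sum(sample_counts.values())
    return {
        category: sample_counts.get(category, 0) / total - population_shares[category]
        for category in population_shares
    }


# Hypothetical figures, invented purely for illustration.
population_shares = {"18-34": 0.30, "35-54": 0.40, "55+": 0.30}
sample_counts = {"18-34": 500, "35-54": 400, "55+": 100}

gaps = share_gaps(sample_counts, population_shares)
for category, gap in gaps.items():
    # Positive values mean the group is over-represented in the sample.
    print(f"{category}: {gap:+.2f}")
```

Gaps near zero on every relevant characteristic suggest that the sample resembles the population on those characteristics; large gaps, as in this invented case, signal that some groups are over- or under-represented.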
The problems of non-representative sampling become clearly evident in an
example. CNN’s Web site features an on-line, real-time poll called QuickVote.
It features a single question, with various response alternatives to choose
from. Once an answer has been given, the response is added to the others, and
the results (to date) are provided to the user. On October 28th, the question
was "Should the U.S. fight in the Afghan winter?" At about 4 p.m. PST,
a total of 44,431 votes had been cast, with 38,107 voting "yes" (86%)
and 6,324 voting "no" (14%).
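The percentages follow directly from the vote counts reported above. A minimal sketch of the arithmetic (the variable names are our own):

```python
# Vote counts reported for the October 28th QuickVote.
yes_votes = 38_107
no_votes = 6_324
total_votes = yes_votes + no_votes

yes_share = 100 * yes_votes / total_votes
no_share = 100 * no_votes / total_votes

print(f"total votes: {total_votes}")                   # total votes: 44431
print(f"yes: {yes_share:.0f}%  no: {no_share:.0f}%")   # yes: 86%  no: 14%
```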
This result certainly tells us that the vast majority of respondents were in
favor of more fighting. But does it tell us that the vast majority of the
American population was in favor of more fighting? Not necessarily. Consider who
responded to the survey: 44,431 people who have access to the Internet and who
logged onto CNN’s Web site (assuming that each person only voted once).
This sample cannot be assumed to be representative of Internet users in
general, as those who visit CNN’s Web site may have different views about the
war than those who do not visit the site. In addition, Internet users cannot be
assumed to represent the general public. People with access to the Internet are
disproportionately from higher income groups. Socioeconomic status might, in
turn, have an important influence on people’s opinions about the fighting in
Afghanistan.
The CNN.com QuickVote makes use of a group that already exists in the world,
i.e., Internet users who visited CNN’s Web site. This type of convenience
sample is self-selected, and thus is not likely to be representative of the
larger population of Americans across a variety of demographic characteristics,
especially as regards income, educational level, and social class.
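A small calculation can show how self-selection distorts a result. The sketch below assumes two income groups that differ both in their opinions and in how likely they are to vote in an on-line poll. Every number is hypothetical and chosen only for illustration. In particular, the direction of the bias shown here is an artifact of the invented figures, not a claim about the actual QuickVote respondents.

```python
# Hypothetical illustration only: none of these numbers come from the article
# or from any real survey.
groups = {
    # name: (share of population, share answering "yes", chance of voting on-line)
    "higher_income": (0.30, 0.80, 0.10),
    "lower_income": (0.70, 0.50, 0.01),
}


def population_support(groups):
    """Share of the whole population that would answer "yes"."""
    return sum(share * yes for share, yes, _ in groups.values())


def poll_support(groups):
    """Share of "yes" answers among those who choose to respond."""
    respondents = sum(share * p for share, _, p in groups.values())
    yes_respondents = sum(share * yes * p for share, yes, p in groups.values())
    return yes_respondents / respondents


print(f"population: {population_support(groups):.0%}")
print(f"on-line poll: {poll_support(groups):.0%}")
```

With these invented figures, the poll reports roughly 74 percent support while the population figure is 59 percent, because the group that responds more often also happens to hold the more favorable view. Collecting more votes from the same self-selected group would not close that gap.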
Does this mean that the results of on-line public opinion polling are
meaningless? Again, not necessarily. It does mean, however, that the sample is
likely to be biased, and the direction of the bias (as reflected in responses) is
very difficult to discern. Internet sampling has a place, as does telephone
sampling, but truly representative sampling requires that the sample represent
the population as a whole much more accurately.