How to Be A Wise Consumer of Psychological Research

Evaluate research-based claims to become a better consumer of products and services that shape your daily life.

It is difficult to turn the pages of a newspaper without coming across a story that makes an important claim about human nature: what causes divorce, how men and women differ psychologically, how work-related stress influences physical illness. Surveys analyze how people will vote in an upcoming election or what proportion of Americans routinely wear seat belts. Advertisements promise everything from improving your memory by listening to subliminal tapes, to helping you become more popular. Being able to evaluate research claims objectively is an important skill. Separating the scientific wheat from the chaff can influence how you vote, whether you adopt a new diet, or whether you decide to get professional help for a child with a learning disorder. As consumers of both products and ideas, we all need to know the difference between carefully and poorly conducted research.

Show Me the Data! Looking at Evidence

The most important lesson about being a wise consumer of psychological research is that, from a scientific perspective, all claims require evidence, not just opinions. Scientists who evaluate research claims behave like ideal jury members who are asked to evaluate claims made by prosecuting attorneys. They begin with the skeptical assumption that all claims are false (the defendant is innocent until proven guilty; the diet plan is ineffective; testosterone plays no role in aggression). Only after considering the strengths and weaknesses of the evidence relevant to a claim do jurors and scientists decide whether to accept the claims of those doing the claiming (prosecuting attorneys, advertisers, scientists). This decision to accept or reject a claim is best made by paying careful attention to the methods that served as the basis for a specific claim. 

In short, sound research methods lead to more valid research conclusions. 

Behavioral scientists have hundreds of tools in their methodological toolboxes; two of these tools turn out to be much more important than any others. Understanding the nature and purpose of these two tools is thus the first step to becoming an educated consumer of psychological research. The two tools that lie at the heart of sound research methods are random sampling and experimental manipulation based on random assignment.

Says Who? Random Sampling

When behavioral scientists want to assess the attitudes or preferences of very large groups of people (e.g., American voters, Asian-American college students, human beings), they face a seemingly insurmountable problem. It is usually impossible to ask every member of a very large group what he or she thinks, feels, or does. However, behavioral scientists have solved this tricky problem by developing a technique called random sampling. When survey researchers use random sampling, they select a very small proportion of the people from within a very large population (e.g., 1,000 out of 50 million registered voters). They then estimate what the entire population is like on the basis of the responses of those sampled. The key to getting an accurate estimate is the use of random sampling. Random sampling refers to selecting people from a population so that everyone in the entire population (e.g., all registered voters in the U.S.) has an equal chance of being selected. This turns out to be an incredibly powerful technique. If every person in a group of 50 million voters really does have an equal chance of being selected into a national survey, then the results of the survey based on 1,000 people will almost always prove to resemble the results for the total population.
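The surprising power of a small random sample can be seen in a short simulation. The sketch below uses invented numbers (a hypothetical electorate in which 52 percent favor one candidate); because every simulated "respondent" is drawn with equal probability, a sample of only 1,000 almost always lands within a few percentage points of the true figure.

```python
import random

random.seed(0)

# Hypothetical population: millions of voters, 52% of whom favor candidate A.
# (These numbers are invented for illustration, not real polling data.)
TRUE_SUPPORT = 0.52

def poll(sample_size):
    """Simulate a simple random sample and return the estimated support.

    Because each person is equally likely to be selected, drawing one
    respondent at random is equivalent to a Bernoulli trial with the
    true population proportion -- so we need not build the full list.
    """
    votes_for_a = sum(random.random() < TRUE_SUPPORT for _ in range(sample_size))
    return votes_for_a / sample_size

estimate = poll(1_000)
print(f"True support: {TRUE_SUPPORT:.1%}; estimate from 1,000 people: {estimate:.1%}")
```

Running the poll repeatedly would show the estimates clustering tightly around 52 percent, which is why professional polls of roughly 1,000 people can describe tens of millions of voters.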

An excellent example of the importance of random sampling can be found in the 1936 U.S. presidential election. Prior to that election, the Literary Digest sent postcards to more than 10 million Americans, asking them to report who they planned to vote for in the upcoming election. Among the 2 million Americans who returned the postcards, Alf Landon was the overwhelming favorite. In contrast, a much smaller survey conducted by the recently formed Gallup group yielded very different results. Based on the responses of only a few thousand likely voters, the Gallup poll suggested that Franklin D. Roosevelt would be the winner. If you pull a dime out of your pocket, and look to see whose face is there, you'll see that the Gallup pollsters were correct. FDR won in a landslide, and Alf Landon faded into obscurity. How did the Gallup poll, based on many fewer people, outperform the enormous Literary Digest poll? The Gallup pollsters came very close to performing a true random sample of likely voters. In contrast, the Literary Digest sampled people by taking names from automobile registrations and telephone listings. In 1936, people who owned cars and phones were usually pretty wealthy — and wealthy people overwhelmingly preferred Alf Landon.

The lesson of the Literary Digest error is that whenever you hear the results of any survey, you should ask yourself how the surveyed people were sampled. Were those sampled really like the pool of people (e.g., American voters, African-American children) whose attitudes and behavior the researcher would like to describe?

Even when a researcher makes careful use of random sampling, it is useful to pay attention to a different form of sampling bias, known as non-response bias. If only a small percentage of randomly sampled people agree to respond to a survey, it is quite likely that those who did respond will be different from those who refused. Modern pollsters have long mastered the science of random sampling. These days, most of the error in most scientific polls stems from the fact that it can be hard to get very high response rates (or hard to know who to sample in the first place). For example, if you randomly sampled all those eligible to vote in a state gubernatorial race and you got only a 30 percent response rate, you would have to worry about whether those who refused to be surveyed would vote the same way as the eager 30 percent who agreed. Moreover, even if everyone agreed to be surveyed, you'd have to worry about whether the subsample of all eligible voters who actually showed up at the polls on election day had the same preferences as those who either didn't bother to vote or were unable to do so.
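Non-response bias can be simulated with a toy model. In the sketch below (all numbers are invented), supporters of the challenger are assumed to be twice as likely to answer the pollster as supporters of the incumbent; even though the initial sample is perfectly random, the respondents paint a badly distorted picture.

```python
import random

random.seed(1)

# Hypothetical electorate: 50% favor the incumbent, 50% the challenger.
population = ["incumbent" if random.random() < 0.5 else "challenger"
              for _ in range(100_000)]

def responds(preference):
    """Assumed response rates: challenger supporters answer twice as often."""
    rate = 0.2 if preference == "incumbent" else 0.4
    return random.random() < rate

responses = [p for p in population if responds(p)]
observed = responses.count("incumbent") / len(responses)
print(f"True incumbent support: ~50%; among respondents: {observed:.1%}")
```

In this model the respondents suggest the incumbent has only about one-third support, even though the electorate is evenly split — purely because willingness to respond was related to the opinion being measured.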

It is also important to note that random sampling helps you describe only the population of people from whom you sampled (and not other populations). For example, if researchers randomly sampled registered voters, but only did so in North Carolina, they might get a great idea of what North Carolinians believe. It would be very risky to generalize these results to other Americans. This is why people sometimes criticize the results of surveys taken of college students, who differ markedly from older adults. On the other hand, if surveyors wanted to know the opinions of college students, it would make little sense to sample anyone else. The key issue might be exactly which college students. A random sample of 1,000 American college students would tell us much more than a random sample of 1,000 students at Vassar College. Of course, if we cared only about Vassar College students, we would want to sample Vassarians at random. The key issue in sampling is to pay careful attention to who was sampled and to make certain that those sampled are the same kind of people about whom a researcher has made a claim (a claim about what the evidence shows).

How to Ask Why: Random Assignment and Experimental Manipulations

When a researcher moves from descriptive research to experimental research, random sampling is still important, but it begins to take a back seat to a second major technique. This second technique is random assignment, and it is the cornerstone of the experimental method. Unlike random sampling, which is a technique for deciding who to study, random assignment can take place only after people have already been selected into a study. 

Random Assignment

Random assignment is a technique for assigning people to different specific conditions in an experiment; it occurs only when everyone in the study has an equal chance of serving in any specific condition. In the same way that random sampling guarantees that the people sampled in a study will be as similar as possible to those who were not sampled, random assignment guarantees that those assigned to one experimental condition will be as similar as possible to those assigned to a different condition. This is crucial because the whole idea of an experiment is to identify two identical groups of people and then to manipulate something. One group gets an experimental treatment, and one does not. If the group that gets the treatment (e.g., a drug, exposure to a violent video game) behaves differently than the control group that did not get the treatment, we can attribute the difference to the treatment — but only if we can rest assured that the two groups were similar prior to the treatment.
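A minimal sketch shows random assignment at work. Using invented data (200 hypothetical volunteers of widely varying ages), shuffling the pool and splitting it in half gives every person an equal chance of either condition, and the resulting groups end up nearly identical on average — including on variables the researcher never measured.

```python
import random
import statistics

random.seed(2)

# Hypothetical pool of 200 study volunteers with widely varying ages.
ages = [random.randint(18, 75) for _ in range(200)]

# Random assignment: shuffle the pool, then split it in half, so every
# volunteer has an equal chance of landing in either condition.
random.shuffle(ages)
treatment, control = ages[:100], ages[100:]

print(f"Mean age, treatment group: {statistics.mean(treatment):.1f}")
print(f"Mean age, control group:   {statistics.mean(control):.1f}")
# The two means differ only by chance -- and the same holds for any other
# trait, measured or not, which is the whole point of random assignment.
```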

In other words, if we wish to identify the causes of human behavior, we must usually perform experiments in which we manipulate one thing, or a few factors, at a time. We can only do this by making use of random assignment. 

Suppose a researcher at Cornell University developed a new technique for teaching a foreign language. If the researcher could do so, he might persuade all of his colleagues in the Spanish department to start using this new technique. After a year of instruction using the new technique, suppose that the professor documented that the average student who completed one year of Spanish at Cornell performed well above the national average in a test of Spanish fluency (relative to students at other universities who had also completed a year of Spanish). Can we attribute this performance advantage to the new instruction technique? Given how difficult it is to get admitted to Cornell in the first place, it is likely that students at Cornell would have performed well above the national norm even if they had been taught using a traditional technique. If the researcher really wanted to know if his teaching technique was superior, he would have needed to randomly assign some Cornell students to receive the new form of instruction while randomly assigning others to receive a traditional form of instruction (this would be hard to do, but that is a detail).

Consider a more important question. Do seatbelts save lives? One way to find out would be to obtain records of thousands of serious automobile accidents. To simplify things, suppose a researcher focused exclusively on drivers (rather than passengers) and found an accurate way to determine whether drivers were wearing their seatbelts at the time of each crash. The researcher then obtained accurate records of whether the driver in each crash survived. Imagine that drivers wearing seatbelts were much more likely to have survived. Can we safely assume that seatbelts are the reason? Not on the basis of this study alone. The problem is that, for ethical reasons, the people in this hypothetical study were not randomly assigned to different seatbelt conditions. As it turns out, those who do and do not routinely wear seatbelts differ in many important ways. Compared with habitual non-users of seatbelts, habitual users are older, more educated, and less likely to speed or drink and drive. These additional factors are also likely to influence survival in a serious accident, and they are all confounded with seatbelt use. On the basis of this study and this study alone, we cannot tell whether it is seatbelts or other safe driving practices that are responsible for the greater survival rates among seatbelt users.
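The confounding in the seatbelt example can be made concrete with a toy simulation. In the model below (all probabilities are invented), cautious drivers are both more likely to buckle up and more likely to survive for reasons unrelated to the belt; the naive comparison of survival rates then overstates the belt's true effect.

```python
import random
import statistics

random.seed(3)

def simulate_crash():
    """One hypothetical crash record: (wore_belt, survived).

    'Caution' stands in for unmeasured traits -- lower speeds, sobriety,
    and so on -- that affect BOTH seatbelt use and survival.
    """
    cautious = random.random() < 0.5
    belted = random.random() < (0.9 if cautious else 0.3)
    p_survive = 0.50 + (0.20 if belted else 0.0) + (0.20 if cautious else 0.0)
    return belted, random.random() < p_survive

crashes = [simulate_crash() for _ in range(50_000)]
belted_rate = statistics.mean(s for b, s in crashes if b)
unbelted_rate = statistics.mean(s for b, s in crashes if not b)
print(f"Naive survival gap: {belted_rate - unbelted_rate:.3f} "
      f"(true causal effect of the belt in this model: 0.200)")
```

In this model the belt truly adds 20 percentage points to survival, but the observed gap between belted and unbelted drivers is roughly 32 points, because the belted group is also disproportionately cautious. Random assignment would break exactly that link.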

If we were to conduct a large-scale experiment on seatbelt use (by determining habitual seatbelt use on the basis of coin flips), we could completely eliminate all of these confounds in one simple step. Random assignment would create two identical groups of people, exactly half of whom were forced to use seatbelts at all times, and exactly half of whom were forbidden from doing so during the experimental period. Of course, this hypothetical experiment would be unethical. Thus, researchers interested in seatbelt use have had to do a lot of other things to document the important role that seatbelts play in saving people's lives (including laboratory crash tests and studies that used sophisticated statistical techniques to separate the effects of seatbelt use from other effects). The point is not that seatbelts don't save lives. They clearly do. The point is that it has taken a lot of time and effort to document this fact because of the impossibility of conducting an experiment on this topic. If you want to conduct a single study to figure out what causes something, you will almost always need to conduct an experiment in which you make use of random assignment. As a consumer of psychological research, you must thus ask yourself whether a research claim was based on the results of a careful experiment, or whether a researcher may have compared two groups of people who differed in more than one way at the beginning of the study.

Longitudinal Research

Sometimes a researcher can bypass the use of random assignment by comparing people with themselves — conducting a longitudinal study with a pretest and a post-test. Although such studies can be very informative, they often come with their own special kinds of confounds. Many of these confounds boil down to the fact that people can and do change over time, for many reasons. 

For example, consider GRE prep courses. When a student who scores poorly on the GRE takes a preparation course and then takes the GRE again, he or she will often do better the second time around. This would seem to show that the prep course is effective. However, students who simply retake the test without any intervention also tend to improve on the second test. (The reason why low scorers tend to improve in the absence of training is known as "regression toward the mean," but its details are beyond the scope of this short essay.) The key issue is that it is always important to have a control group if you want to assess the impact of a treatment.
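Regression toward the mean is easy to demonstrate with a toy model. In the sketch below (all figures invented), each test score is a stable "true ability" plus day-to-day luck; students selected for a low first score improve on the retake with no course at all, simply because their bad luck is unlikely to repeat.

```python
import random
import statistics

random.seed(4)

# Hypothetical test-takers: stable true ability plus independent
# luck on each sitting. (Invented numbers, not real GRE scores.)
abilities = [random.gauss(150, 8) for _ in range(10_000)]

def score(ability):
    """One sitting of the test: true ability plus day-to-day noise."""
    return ability + random.gauss(0, 6)

first = [(a, score(a)) for a in abilities]

# Select only the students who scored poorly the first time...
low_scorers = [a for a, s in first if s < 140]
mean_first = statistics.mean(s for a, s in first if s < 140)

# ...and have them retake the test with NO intervention whatsoever.
mean_retake = statistics.mean(score(a) for a in low_scorers)

print(f"Low scorers, first sitting: {mean_first:.1f}")
print(f"Same students, retake:      {mean_retake:.1f}")
```

The retake mean rises by several points even though nothing was done between sittings — exactly the improvement an uncontrolled before-and-after study would credit to a prep course.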

Ask the Right Questions

There are many other ways in which research can go astray. 

  • Did the researcher word his survey questions fairly? 
  • Were participants reporting their attitudes honestly? 
  • Did the researcher bias answers by subtly communicating to participants what they hoped to find? 
  • Was the size of the sample large enough to draw meaningful comparisons? (If you read that 4 in 5 doctors use Brand X, were only five doctors surveyed?) 
  • Were those who conducted the research strongly motivated to produce a specific result? (If those studying the effects of a drug were paid by a pharmaceutical company to do the research, could this conflict of interest distort the way they collect or interpret their data?) 

The list continues. Specific issues such as these aside, the two concerns that should come to mind first when evaluating any research claim have to do with proper sampling and proper experimental control: 

  • Were those studied truly representative of the people about whom we would like to draw conclusions? 
  • Did the researchers isolate the variables they studied by disentangling them from other confounded variables? 

It is not always easy to get answers to these questions, but if you get in the habit of asking them you will gradually become a better shopper for psychological truths.