Executive Director's Column

What is Evidence and What is the Problem?

One of the catch phrases around policy-making circles is "evidence-based," applied to a host of contexts including education, policy, practice, medicine, even architecture.

By Merry Bullock, PhD

These days, you can hear the terms "good science," "evidence," and "data" a lot in Washington. One of the catch phrases around policy-making circles is "evidence-based," applied to a host of contexts including education, policy, practice, medicine, even architecture. You would think that this would make us all quite happy - at least those of us who advocate that decisions about policy, social interventions, and future directions be based on data. But, ironically, the new emphasis on evidence-based this and that has been welcomed and, at the same time, greeted with raised anxiety and red flags of concern.

Why Might This Be?

One reason is that at times the definition of the "good" science that is to inform policy seems tinged with political overtones. So, for example, some scientists have complained that although Congress and the Administration regularly call for reliance on the best science, they manipulate that science - they choose the science they like, represent it in a way that no scientist would recognize, or set the bar so high that no scientific study can meet it. Probably the best examples are climate change, evolution, and environmental issues - although sound science (one definition of "good science") has reached consensus on data and policy implications, the existence of a few who argue otherwise gives rise to policies that seem to say the facts are still in doubt.

Let me turn now to the behavioral and social sciences. Although there are certainly political overtones to some of the issues dealt with by the behavioral and social sciences (witness the recent slate of queries into the science of sexual behavior or health disparities), there are other concerns with the "evidence-based" movement outside the political arena. The issues range from concerns about the ways in which evidence is defined to concerns that experimental designs are inappropriately reified as the methodology that automatically yields the "best" evidence.

Some uneasiness with the current evidence-based movement may arise from an understandable resettling as changes in the funding and policy landscape become clearer (one good example is the new research portfolio of the Institute of Education Sciences). But much of the uneasiness appears to rest on more fundamental issues about what we understand research, the world, and science to be.

Let Me Address Just a Couple of These Basic Issues

In some discussions of what it means to be "evidence based," random assignment and experimental control (a.k.a. randomized controlled trials) are held up as the gold standard. This raises red flags for many who do research that is not of this ilk. Card-carrying scientists who do qualitative, quasi-experimental, or historical research are understandably troubled by the suggestion that only experiments qualify as real science. One common argument against the reification of experiments is that much of the evidence we take as incontrovertible is not experimental - evidence from disciplines such as epidemiology or astronomy, for example. And much sound policy is based on correlational, not experimental, data, such as data on the relation between tobacco use and cancer. Although the science to which these arguments refer is sound, I believe they nonetheless miss the point. My understanding has always been that when experimental design (including random assignment) is held as a gold standard, it is not for all science but for intervention studies - when the goal, in the simplest case, is to "hold everything constant" except one variable, to enable clear causal inferences. For many behavioral and social science questions, that variable may be far more complex - a program, a social intervention, and so on. That this gold standard allows clear causal inference (and is the only standard for unequivocal causal inference) does not mean that other methods cannot also provide important knowledge, especially systematic description, categorization, or correlation.

Another area of concern is that, even if one wanted to apply such a standard, experimental designs may be impossible, impractical, or unethical in many of the complex, multidimensional contexts in which one needs answers. In many settings, for example, random assignment of individuals to programs, classrooms, neighborhoods, families, or treatments is not possible, and random assignment of programs to groups such as schools, teams, or treatment settings may not be feasible. Is this a reason for concern? It is, of course, an instance of the classic difference between efficacy and effectiveness - between finding out whether something works in the laboratory or under well-controlled conditions and whether it works in practice in the messy, everyday world. In healthcare, one arena in which the evidence-base issues have been most thoroughly discussed, the conclusion is that both are necessary and that one must be diligent in matching conclusion to design. And in medicine, as in psychology, applying knowledge to practice must always be a dance of best available information and expert judgment.

If one moves outside of psychology, there are broader concerns - the standard methods of sister social science disciplines are not usually experimental. Take anthropology, economics, or survey research. The data gathered by economists, anthropologists, or sociologists often inform policy decisions, yet these data are rarely experimental. The lesson from looking across disciplines, questions, and contexts is that different designs may be appropriate for different questions, behaviors, or situations. What is important, of course, is that we aspire to use the most rigorous design appropriate and possible for the issues at hand, and that we convey the importance of that rigor to policy makers.

Because the evidence-based issues are so hot and so important for all psychologists to address, from researchers to practitioners, it is especially gratifying to see that the National Academy of Sciences is beginning an initiative to help define evidentiary standards across the behavioral and social sciences, to help ask how to match evidence to question and context, and to help improve the translation of research into policy. This initiative will begin this month with a "Workshop on Policy Making: How Behavioral and Cognitive Scientists can Contribute…" and will continue with questions that look at the evidentiary bases of the behavioral and social sciences and the degree to which discussions of evidence in other disciplines (e.g., medicine or physics) provide informative models.

It is clear that discussions of definitions of evidence, distinctions among kinds of evidence (including scientific data, expert judgment, observation, and theory), and consensus on when to use what will occupy us for some time. Psychology needs to be an active participant in the discussion. It needs to contribute its unique insights as a discipline that has built its basic science on solid experimental methods, that continually grapples with the transition from basic laboratory science to applied science, that attempts the translation from science to application and practice, and that promotes the importance of a basic science base that is relevant to application.