Cover Story

Standardized intelligence testing has been called one of psychology's greatest successes. It is certainly one of the field's most persistent and widely used inventions.

Since Alfred Binet first used a standardized test to identify learning-impaired Parisian children in the early 1900s, intelligence testing has become one of the primary tools for identifying children with mental retardation and learning disabilities. It has helped the U.S. military place its new recruits in positions that suit their skills and abilities. And, since the administration of the original Scholastic Aptitude Test (SAT)--adapted in 1926 from an intelligence test developed for the U.S. Army during World War I--it has spawned a variety of aptitude and achievement tests that shape the educational choices of millions of students each year.

But intelligence testing has also been accused of unfairly stratifying test-takers by race, gender, class and culture; of minimizing the importance of creativity, character and practical know-how; and of propagating the idea that people are born with an unchangeable endowment of intellectual potential that determines their success in life.

Since the 1970s, intelligence researchers have been trying to preserve the usefulness of intelligence tests while addressing those concerns. They have done so in a number of ways, including updating the Wechsler Intelligence Scale for Children (WISC) and the Stanford-Binet Intelligence Scale so they better reflect the abilities of test-takers from diverse cultural and linguistic backgrounds. They have developed new, more sophisticated ways of creating, administering and interpreting those tests. And they have produced new theories and tests that broaden the concept of intelligence beyond its traditional boundaries.

As a result, many of the biases identified by critics of intelligence testing have been reduced, and new tests are available that, unlike traditional intelligence tests, are based on modern theories of brain function, says Alan Kaufman, PhD, a clinical professor of psychology at the Yale School of Medicine.

For example, in the early 1980s, Kaufman and his wife, Nadeen Kaufman, EdD, a lecturer at the Yale School of Medicine, published the Kaufman Assessment Battery for Children (K-ABC), then one of the few alternatives to the WISC and the Stanford-Binet. Together with the Woodcock-Johnson Tests of Cognitive Ability, first published in the late 1970s, and later tests, such as the Differential Ability Scales and the Cognitive Assessment System (CAS), the K-ABC helped expand the field of intelligence testing beyond the traditional tests.

Nonetheless, says Kaufman, there remains a major gap between the theories and tests that have been developed in the past 20 years and the way intelligence tests are actually used. Narrowing that gap remains a major challenge for intelligence researchers as the field approaches its 100th anniversary.

King of the hill

Among intelligence tests for children, one test currently dominates the field: the WISC-III, the third revision of psychologist David Wechsler's classic 1949 test for children, which was modeled after Army intelligence tests developed during World War I.

Since the 1970s, says Kaufman, "the field has advanced in terms of incorporating new, more sophisticated methods of interpretation, and it has very much advanced in terms of statistics and methodological sophistication in development and construction of tests. But the field of practice has lagged woefully behind."

Nonetheless, people are itching for change, says Jack Naglieri, PhD, a psychologist at George Mason University who has spent the past two decades developing the CAS in collaboration with University of Alberta psychologist J.P. Das, PhD. Practitioners want tests that can help them design interventions that will actually improve children's learning; that can distinguish between children with different conditions, such as a learning disability or attention deficit disorder; and that will accurately measure the abilities of children from different linguistic and cultural backgrounds.

Naglieri's own test, the CAS, is based on the theories of Soviet neuropsychologist A.R. Luria, as is Kaufman's K-ABC. Unlike traditional intelligence tests, says Naglieri, the CAS helps teachers choose interventions for children with learning problems, identifies children with learning disabilities and attention deficit disorder and fairly assesses children from diverse backgrounds. Now, he says, the challenge is to convince people to give up the traditional scales, such as the WISC, with which they are most comfortable.

According to Nadeen Kaufman, that might not be easy to do. She believes that the practice of intelligence testing is divided between those with a neuropsychological bent, who have little interest in the subtleties of new quantitative tests, and those with an educational bent, who are increasingly shifting their interest away from intelligence and toward achievement. Neither group, in her opinion, is eager to adopt new intelligence tests.

For Naglieri, however, it is clear that there is still a great demand for intelligence tests that can help teachers better instruct children with learning problems. The challenge is convincing people that tests such as the CAS--which do not correlate highly with traditional tests--still measure something worth knowing. In fact, Naglieri believes that they measure something even more worth knowing than what the traditional tests measure. "I think we're at a really good point in our profession, where change can occur," he says, "and I think that what it's going to take is good data."

Pushing the envelope

The Kaufmans and Naglieri have worked within the testing community to effect change; their main concern is with the way tests are used, not with the basic philosophy of testing. But other reformers have launched more fundamental criticisms, ranging from "Emotional Intelligence" (Bantam Books, 1995), by Daniel Goleman, PhD, which suggested that "EI" can matter more than IQ, to the multiple intelligences theory of Harvard University psychologist Howard Gardner, PhD, and the triarchic theory of successful intelligence of APA President Robert J. Sternberg, PhD, of Yale University. These very different theories have one thing in common: the assumption that traditional theories and tests fail to capture essential aspects of intelligence.

But would-be reformers face significant challenges in convincing the testing community that theories that sound great on paper--and may even work well in the laboratory--will fly in the classroom, says Nadeen Kaufman. "A lot of these scientists have not been able to operationalize their contributions in a meaningful way for practice," she explains.

In the early 1980s, for example, Gardner attacked the idea that there was a single, immutable intelligence, instead suggesting that there were at least seven distinct intelligences: linguistic, logical-mathematical, musical, bodily-kinesthetic, spatial, interpersonal and intrapersonal. (He has since added existential and naturalist intelligences.) But that formulation has had little impact on testing, in part because the kinds of quantitative factor-analytic studies that might validate the theory in the eyes of the testing community have never been conducted.

Sternberg, in contrast, has taken a more direct approach to changing the practice of testing. His Sternberg Triarchic Abilities Test (STAT) is a battery of multiple-choice questions that tap into the three independent aspects of intelligence--analytic, practical and creative--proposed in his triarchic theory.

Recently, Sternberg and his collaborators from around the United States completed the first phase of a College Board-sponsored Rainbow Project to put the triarchic theory into practice. The goal of the project was to enhance prediction of college success and increase equity among ethnic groups in college admissions. About 800 college students took the STAT along with performance-based measures of creativity and practical intelligence.

Sternberg and his collaborators found that triarchic measures predicted a significant portion of the variance in college grade point average (GPA), even after SAT scores and high school GPA had been accounted for. The test also produced smaller differences between ethnic groups than did the SAT. In the next phase of the project, the researchers will fine-tune the test and administer it to a much larger sample of students, with the ultimate goal of producing a test that could serve as a supplement to the SAT.
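A finding of this kind, prediction "even after SAT scores and high school GPA had been accounted for," is commonly expressed as incremental variance explained: the gain in R-squared when a new predictor is added to a baseline regression. The sketch below is a minimal illustration of that general idea, not the Rainbow Project's actual analysis. The function names are hypothetical and the toy numbers are invented purely for demonstration.

```python
def _solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(predictors, outcome):
    """R-squared of an ordinary least-squares fit with an intercept.

    `predictors` is a list of columns, one list of values per predictor.
    """
    n = len(outcome)
    cols = [[1.0] * n] + [[float(v) for v in p] for p in predictors]
    k = len(cols)
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(cols[i][t] * cols[j][t] for t in range(n)) for j in range(k)]
           for i in range(k)]
    xty = [sum(cols[i][t] * outcome[t] for t in range(n)) for i in range(k)]
    beta = _solve(xtx, xty)
    fitted = [sum(beta[i] * cols[i][t] for i in range(k)) for t in range(n)]
    mean_y = sum(outcome) / n
    ss_res = sum((y - f) ** 2 for y, f in zip(outcome, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in outcome)
    return 1.0 - ss_res / ss_tot


def incremental_r_squared(baseline, added, outcome):
    """Gain in R-squared when `added` predictors join the `baseline` predictors."""
    return r_squared(baseline + added, outcome) - r_squared(baseline, outcome)


if __name__ == "__main__":
    # Toy numbers invented for illustration only (not real test data).
    baseline_scores = [[0, 1, 2, 3]]   # stands in for an existing predictor
    new_measure = [[1, -1, -1, 1]]     # stands in for an added measure
    college_gpa = [1, 0, 1, 4]         # stands in for the outcome
    gain = incremental_r_squared(baseline_scores, new_measure, college_gpa)
    print(round(gain, 3))  # about 0.444 for this toy data
```

A gain well above zero means the added measure carries information about the outcome that the baseline predictors do not.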

Questioning the test

Beyond the task of developing better theories and tests of intelligence lies a more fundamental question: Should we even be using intelligence tests in the first place?

In certain situations where intelligence tests are currently being used, the consensus answer appears to be "no." A recent report of the President's Commission on Excellence in Special Education (PCESE), for example, suggests that the use of intelligence tests to diagnose learning disabilities should be discontinued.

For decades, learning disabilities have been diagnosed using the "IQ-achievement discrepancy model," according to which children whose achievement scores are a standard deviation or more below their IQ scores are identified as learning disabled.
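The rule described above amounts to a single comparison, sketched below. This is a minimal illustration under stated assumptions, not an official diagnostic procedure: it assumes both scores are standard scores on the same scale (mean 100, standard deviation 15), and the function name and cutoff parameter are hypothetical.

```python
# Assumption: IQ and achievement are both reported as standard scores
# on a scale with mean 100 and standard deviation 15.
SCALE_SD = 15.0


def meets_discrepancy_criterion(iq_score, achievement_score, cutoff_sd=1.0):
    """Return True if achievement is at least `cutoff_sd` SDs below IQ."""
    gap_in_sd = (iq_score - achievement_score) / SCALE_SD
    return gap_in_sd >= cutoff_sd


if __name__ == "__main__":
    print(meets_discrepancy_criterion(105, 88))  # 17-point gap, more than 1 SD
    print(meets_discrepancy_criterion(105, 95))  # 10-point gap, less than 1 SD
```

The output is only a yes-or-no flag computed from two summary numbers.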

The problem with that model, says Patti Harrison, PhD, a professor of school psychology at the University of Alabama, is that the discrepancy doesn't tell you anything about what kind of intervention might help the child learn. Furthermore, a child's actual behavior in the classroom and at home is often a better indicator of ability than an abstract intelligence test, so children might get educational services that are more appropriate to their needs if the use of IQ tests were discouraged, she says.

Even staunch supporters of intelligence testing, such as Naglieri and the Kaufmans, believe that the IQ-achievement discrepancy model is flawed. But, unlike the PCESE, they don't see that as a reason for getting rid of intelligence tests altogether.

For them, the problem with the discrepancy model is that it is based on a fundamental misunderstanding of the Wechsler scores, which were never intended to be used as a single, summed number. So the criticism of the discrepancy model is correct, says Alan Kaufman, but it misses the real issue: whether or not intelligence tests, when properly administered and interpreted, can be useful.

"The movement that's trying to get rid of IQ tests is failing to understand that these tests are valid in the hands of a competent practitioner who can go beyond the numbers--or at least use the numbers to understand what makes the person tick, to integrate those test scores with the kind of child you're looking at, and to blend those behaviors with the scores to make useful recommendations," he says.

Intelligence tests help psychologists make recommendations about the kind of teaching that will benefit a child most, according to Ron Palomares, PhD, assistant executive director in the APA Practice Directorate's Office of Policy and Advocacy in the Schools. Psychologists are taught to assess patterns of performance on intelligence tests and to obtain clinical observations of the child during the testing session. That, he says, removes the focus from a single IQ score and allows for an assessment of the child as a whole, which can then be used to develop individualized teaching strategies.

Critics of intelligence testing often fail to consider that most of the alternatives are even more prone to problems of fairness and validity than the measures that are currently used, says APA President-elect Diane F. Halpern, PhD, of Claremont McKenna College.

"We will always need some way of making intelligent decisions about people," says Halpern. "We're not all the same; we have different skills and abilities. What's wrong is thinking of intelligence as a fixed, innate ability, instead of something that develops in a context."