...

The Reliability and Validity of Intelligence Tests

by taratuta

Chapter 7 Thought, Language, and Intelligence
on a test of reasoning, but when you take the same test the next day, you get a very
low score. Your reasoning ability probably didn’t change much overnight, so the test is
probably unreliable. The higher the reliability of a test, the less likely it is that scores
have been affected by temperature, hunger, or other random and irrelevant changes in
the environment or the test taker.
Validity
Most scales reliably measure your weight, giving you about the same reading day after day. But what if you use these readings as a measure of your height? This
far-fetched example illustrates that a reliable scale reading can be incorrect, or invalid,
if it is misinterpreted. The same is true of tests. Even the most reliable test might not
provide a correct, or valid, measure of intelligence, of anxiety, of typing skill, or of anything else if those are not the things the test really measures. In other words, we can’t
say that a test itself is “valid” or “invalid.” Instead, validity refers to the degree to which
test scores are interpreted appropriately and used properly (American Educational
Research Association, American Psychological Association, & National Council on
Measurement in Education, 1999; Messick, 1989). As in our scale example, a test can
be valid for one purpose but invalid for another.
Researchers evaluate the reliability of a test by obtaining two sets of scores on the same
test from the same people. They then calculate a correlation coefficient between the two
sets of scores (see the introductory chapter). When the correlation is high and positive
(usually above .80), the test is considered reliable. Evaluating a test’s validity usually
means calculating a correlation coefficient between test scores and something else. What
that “something else” is depends on what the test is designed to measure. Suppose, for
example, you wanted to know if a creativity test is valid for identifying creative people.
You could do so by computing the correlation between people’s scores on the creativity
test and experts’ judgments about the quality of those same people’s artistic creations. If
the correlation is high, the test has high validity as a measure of creativity.
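The correlation procedure described above can be sketched in Python. This is only an illustration under stated assumptions, not a method taken from the text: the function name pearson_r, the variable names, and every score below are hypothetical and made up for the example. The sketch computes a Pearson correlation coefficient twice, once between two testings of the same people (reliability) and once between test scores and an outside criterion (validity).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from each mean
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each list
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical data: the same six people take the same test twice.
first_testing = [98, 105, 112, 87, 120, 101]
second_testing = [101, 103, 115, 90, 118, 99]

# Reliability: correlation between the two sets of scores.
reliability = pearson_r(first_testing, second_testing)
# The text's rule of thumb: high and positive, usually above .80.
is_reliable = reliability > 0.80

# Hypothetical data: creativity test scores and experts' ratings
# of the same six people's artistic creations.
creativity_scores = [12, 18, 9, 22, 15, 20]
expert_ratings = [3, 4, 2, 5, 3, 5]

# Validity for this purpose: correlation between scores and the criterion.
validity = pearson_r(creativity_scores, expert_ratings)

print(f"reliability r = {reliability:.2f} (above .80: {is_reliable})")
print(f"validity r = {validity:.2f}")
```

With these made-up numbers the retest correlation comes out above the .80 guideline. The validity figure applies only to the criterion it was computed against, which matches the point that a test can be valid for one purpose but invalid for another.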
[Cartoon caption: If only measuring intelligence were this easy! © The New Yorker Collection 1998 J.B. Handelsman from Cartoonbank.com. All Rights Reserved.]
The Reliability and Validity of Intelligence Tests
The reliability of intelligence tests is generally evaluated on the basis of their stability,
or consistency. The validity of intelligence tests is usually based on their accuracy in
guiding statements and predictions about people’s cognitive abilities.
Reliability
IQ scores obtained before the age of seven are only moderately correlated with scores on intelligence tests given later. There are two key reasons. First, the
test items used with very young children are different from those used with older children. Second, cognitive abilities change rapidly in the early years (see the chapter on
human development). Still, during the school years, IQ scores tend to remain stable
(Allen & Thorndike, 1995; Mayer & Sutton, 1996). For teenagers and adults, the reliability of intelligence tests is high, generally between .85 and .95.
Of course, a person’s score may vary from one occasion to another if there are significant changes in motivation, anxiety, health, or other factors. Overall, though, modern IQ tests usually provide exceptionally consistent results, especially compared with
most other kinds of mental tests.
Validity
If everyone agreed on exactly what intelligence is (having a good memory,
for example), we could evaluate the validity of IQ tests simply by correlating people’s
IQ scores with their performance on various tasks (in this case, memory tasks). IQ tests
whose scores correlated most highly with scores on memory tests would be the most
valid measures of intelligence. But because psychologists do not fully agree on a single
definition of intelligence, they don’t have a single standard against which to compare
intelligence tests. Therefore, they cannot say whether these tests are valid measures of
intelligence. Because intelligence is always displayed in the course of specific tasks and
specific social situations, psychologists can only assess the validity of intelligence tests
for specific purposes.
validity: The degree to which test scores are interpreted appropriately and used properly.