The Reliability and Validity of Intelligence Tests
by taratuta
[...] on a test of reasoning, but when you take the same test the next day, you get a very low score. Your reasoning ability probably didn’t change much overnight, so the test is probably unreliable. The higher the reliability of a test, the less likely it is that scores have been affected by temperature, hunger, or other random and irrelevant changes in the environment or the test taker.

Validity

Most scales reliably measure your weight, giving you about the same reading day after day. But what if you use these readings as a measure of your height? This far-fetched example illustrates that a reliable scale reading can be incorrect, or invalid, if it is misinterpreted. The same is true of tests. Even the most reliable test might not provide a correct, or valid, measure of intelligence, of anxiety, of typing skill, or of anything else if those are not the things the test really measures. In other words, we can’t say that a test itself is “valid” or “invalid.” Instead, validity refers to the degree to which test scores are interpreted appropriately and used properly (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 1999; Messick, 1989). As in our scale example, a test can be valid for one purpose but invalid for another.

Researchers evaluate the reliability of a test by obtaining two sets of scores on the same test from the same people. They then calculate a correlation coefficient between the two sets of scores (see the introductory chapter). When the correlation is high and positive (usually above .80), the test is considered reliable.

Evaluating a test’s validity usually means calculating a correlation coefficient between test scores and something else. What that “something else” is depends on what the test is designed to measure. Suppose, for example, you wanted to know if a creativity test is valid for identifying creative people. You could do so by computing the correlation between people’s scores on the creativity test and experts’ judgments about the quality of those same people’s artistic creations. If the correlation is high, the test has high validity as a measure of creativity.

[Cartoon: “If only measuring intelligence were this easy!” © The New Yorker Collection 1998 J.B. Handelsman from Cartoonbank.com. All Rights Reserved.]

The Reliability and Validity of Intelligence Tests

The reliability of intelligence tests is generally evaluated on the basis of their stability, or consistency. The validity of intelligence tests is usually based on their accuracy in guiding statements and predictions about people’s cognitive abilities.

Reliability

IQ scores obtained before the age of seven are only moderately correlated with scores on intelligence tests given later. There are two key reasons. First, the test items used with very young children are different from those used with older children. Second, cognitive abilities change rapidly in the early years (see the chapter on human development). Still, during the school years, IQ scores tend to remain stable (Allen & Thorndike, 1995; Mayer & Sutton, 1996). For teenagers and adults, the reliability of intelligence tests is high, generally between .85 and .95. Of course, a person’s score may vary from one occasion to another if there are significant changes in motivation, anxiety, health, or other factors. Overall, though, modern IQ tests usually provide exceptionally consistent results, especially compared with most other kinds of mental tests.

Validity

If everyone agreed on exactly what intelligence is (having a good memory, for example), we could evaluate the validity of IQ tests simply by correlating people’s IQ scores with their performance on various tasks (in this case, memory tasks). IQ tests whose scores correlated most highly with scores on memory tests would be the most valid measures of intelligence. But because psychologists do not fully agree on a single definition of intelligence, they don’t have a single standard against which to compare intelligence tests. Therefore, they cannot say whether these tests are valid measures of intelligence. Because intelligence is always displayed in the course of specific tasks and specific social situations, psychologists can only assess the validity of intelligence tests for specific purposes.

validity: The degree to which test scores are interpreted appropriately and used properly.
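
The two correlation checks described in this section can be illustrated with a short calculation. The Python sketch below is only an illustration, not a procedure taken from the text: the function name pearson_r, the variable names, and all of the scores are hypothetical, and it assumes that the correlation coefficient in question is the ordinary Pearson coefficient. The first comparison correlates two testings of the same people (reliability); the second correlates creativity test scores with experts' ratings of the same people's work (validity for that one purpose).

```python
from math import sqrt


def pearson_r(x, y):
    """Return the Pearson correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from each mean.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each set of scores.
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: the same six people take the same test on two occasions.
first_testing = [98, 105, 110, 121, 89, 132]
second_testing = [101, 103, 112, 118, 92, 129]

# Hypothetical data: creativity test scores and experts' ratings (1 to 10)
# of the same six people's artistic creations.
creativity_scores = [12, 18, 25, 30, 22, 15]
expert_ratings = [3, 4, 6, 8, 7, 2]

# The text gives "usually above .80" as the rule of thumb for reliability.
RELIABILITY_CUTOFF = 0.80

reliability = pearson_r(first_testing, second_testing)
validity = pearson_r(creativity_scores, expert_ratings)

print(f"test-retest correlation: {reliability:.2f}")
print(f"considered reliable: {reliability > RELIABILITY_CUTOFF}")
print(f"correlation with expert ratings: {validity:.2f}")
```

Note that the second correlation speaks only to using the scores to identify creative people as judged by these experts. Correlating the same scores with a different criterion could give a very different answer, which is why validity is assessed for specific purposes.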