Assessment: Alex

Reliability and Validity of Standardized Tests
Chong Ho Yu, Ph.D.s

Reliability of Standardized Tests

An acceptable standardized test should have reliability coefficients of about:

.95 for internal consistency

.90 for test-retest (stability)

.85 for alternate-forms (equivalency)
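
Internal consistency is commonly summarized with Cronbach's alpha, although the thresholds above do not say which coefficient is meant. The following is a minimal sketch of that statistic under this assumption. The function name `cronbach_alpha` and the small score matrix are hypothetical illustrations, not data from any real test.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix.

    scores: list of rows, one row per examinee, one column per item.
    (Hypothetical helper; the layout of `scores` is an assumption.)
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 examinees")
    # Variance of each item (column) across examinees.
    item_var_sum = sum(variance(col) for col in zip(*scores))
    # Variance of each examinee's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Made-up responses from four examinees on three items.
example_scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [5, 4, 5],
]
alpha = cronbach_alpha(example_scores)
print(f"alpha = {alpha:.2f}")
```

The resulting value can then be compared with the .95 benchmark listed above for internal consistency.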

Validity of Standardized Tests

Concurrent validity, which is a form of criterion-related validity, is often reported. This is a numerical estimate of the extent to which a test correlates with another established test or tests. A valid achievement test may have concurrent validity coefficients ranging from .6 to .9.
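
A concurrent validity coefficient of this kind is usually a Pearson correlation between scores on the two tests. The sketch below assumes that is the coefficient intended. The names `pearson_r` and `in_typical_range`, and the two score lists, are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def in_typical_range(r, low=0.6, high=0.9):
    """True if r falls in the .6 to .9 range quoted in the text."""
    return low <= r <= high


# Made-up scores for the same students on a new test and an established test.
new_test = [52, 61, 70, 48, 85, 66]
established_test = [55, 58, 75, 50, 80, 72]
r = pearson_r(new_test, established_test)
print(f"r = {r:.2f}, within .6 to .9: {in_typical_range(r)}")
```

The same calculation applies to test-retest and alternate-forms reliability, where the two lists would hold scores from the two administrations or the two forms.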

It is most important to evaluate the content validity of standardized achievement tests: Do the test items match my instructional objectives?

High reliability

Internal consistency is generally high. Items are written by specialists, pretested, and selected on the basis of the results of a quantitative item analysis.

High degree of validity in terms of prediction

Students in an Ohio community took the Iowa Test of Basic Skills in the spring of their eighth-grade year and the Ohio Ninth Grade Proficiency Test (ONGPT) in the following fall of their ninth-grade year.
The scores from both tests, for a period of 3 consecutive years, were correlated to determine the predictability of passing or failing the ONGPT, based on the standardized test scores. The correlations were found to be significant for reading, mathematics, writing, and citizenship.

It was also shown that the percentages of students failing the ONGPT who scored below the third stanine were high in three areas (reading--60 percent, mathematics--93 percent, and citizenship--90 percent). Therefore, stanine scores can be helpful predictors of the need for intervention programs.
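
Stanines divide a normal distribution into nine bands holding roughly 4, 7, 12, 17, 20, 17, 12, 7, and 4 percent of scores. The sketch below converts a percentile rank to a stanine and flags students below the third stanine, as in the study described above. The function names and the treatment of exact band boundaries are assumptions, and the study's own scoring procedure is not given here.

```python
from bisect import bisect_right

# Cumulative percentages that close stanines 1 through 8; stanine 9 is the rest.
STANINE_UPPER_BOUNDS = [4, 11, 23, 40, 60, 77, 89, 96]


def stanine(percentile_rank):
    """Map a percentile rank (0-100) to a stanine (1-9).

    Assumption: a rank exactly on a boundary falls in the higher band.
    """
    if not 0 <= percentile_rank <= 100:
        raise ValueError("percentile rank must be between 0 and 100")
    return bisect_right(STANINE_UPPER_BOUNDS, percentile_rank) + 1


def may_need_intervention(percentile_rank, cutoff=3):
    """Hypothetical flag: True when the stanine is below the cutoff."""
    return stanine(percentile_rank) < cutoff


for pr in (3, 10, 11, 50, 97):
    print(pr, stanine(pr), may_need_intervention(pr))
```

Under this convention, "below the third stanine" corresponds to roughly the lowest 11 percent of the norm group.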

Standardized tests are impersonal

Informal evaluation is subject to the Hawthorne effect and the halo effect. Standardized tests, however, are robust against human attitudes.

A large-scale study of teacher preparation for standardized tests in two low-stakes school districts examined whether teachers' attitudes toward standardized tests are a factor in how they administer mandated standardized tests in the classroom.

Third-, fourth-, and sixth-grade teachers (n=178) completed a survey of perceptions of standardized tests that also asked how much time teachers spent preparing their classes for the tests.

Students in one school district took the Comprehensive Test of Basic Skills in grades 3 and 6, while fourth graders in the other district took the Iowa Test of Basic Skills.

Teachers' perceptions of standardized tests varied, but were not consistently related either to the effort that teachers put into preparing for the tests or to their students' test performance. It was concluded that teachers administered the tests in ways largely uninfluenced by their personal feelings as reflected in the Likert items of the survey.