Abstract:
This is a theoretical research paper on technical and practical issues in test equating. In this paper the researcher addresses the issue of test equating in detail. The paper is divided into six major parts. The first part deals with the definition of test equating, the purposes of test equating, and the necessary conditions for equating. The second part tackles the main data collection designs that are commonly used for test equating, such as the single-group design, the counterbalanced random-groups design, the equivalent-groups design, the anchor-test random-groups design, and the anchor-test nonequivalent-groups design. The third part is devoted to various methods of test equating. Some methods are based on classical test theory (CTT), whereas others are based on item response theory (IRT). The classical methods of test equating include equipercentile equating and linear equating. The IRT methods include ability-score equating, true-score equating, and observed-score equating, utilizing the three IRT logistic models. The fourth part summarizes a number of research studies on test equating to identify the dominant equating method(s) and determine which method might be preferred, with the least equating error serving as the main criterion for judging the quality of any method. The fifth part discusses some technical and practical issues in test equating, such as equating nonparallel tests, obtaining equivalent scores on tests, the effects of small sample sizes on test equating, and the importance of communication among psychometricians to avoid misinterpretations of equating results. The last part presents conclusions and implications for educational research regarding test equating, concerning sample size, the equating methods used, the types of tests used for equating, the problem of multidimensionality, and the equating of multimedia-based tests.