Ability Estimation with Polytomous Items in Computerized Multistage Tests

Hasibe Yahsi Sarı; Hülya Kelecioğlu

doi:10.21031/epod.1056079

Research Article

Year 2023, Volume: 14 Issue: 3, 171 - 184, 30.09.2023

Abstract

The aim of this study is to examine how individuals' ability estimates change under different conditions in tests consisting of polytomous items in a computerized multistage test environment. The research is a simulation study. A total of 108 conditions (3 x 3 x 6 x 2 = 108) were examined, formed by crossing three numbers of response categories (3, 4 and 5), three test lengths (10, 20 and 30), six panel designs (1-2, 1-2-2, 1-3, 1-3-3, 1-4 and 1-4-4) and two routing methods (Maximum Fisher Information (MFI) and random). Simulations and analyses were carried out with the mstR package in R, using a pool of 200 items, 1000 examinees and 100 replications (i.e., iterations). Mean absolute bias, RMSE and correlation values were calculated as outcome measures. As the number of categories and the test length increased, the mean absolute bias and RMSE values decreased, while the correlation values increased. The MFI and random routing methods showed similar trends, but MFI gave better results. The panel designs produced similar results.
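
The fully crossed design and the three outcome measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' code (the study used the mstR package in R); the function names are hypothetical, and the formulas are assumptions: mean absolute bias is taken here as the mean of |estimated ability - true ability|, RMSE as the root of the mean squared difference, and the correlation as the Pearson correlation between true and estimated abilities. The article may define these measures differently.

```python
import itertools
import math

# Design factors exactly as listed in the abstract.
CATEGORIES = (3, 4, 5)
TEST_LENGTHS = (10, 20, 30)
PANEL_DESIGNS = ("1-2", "1-2-2", "1-3", "1-3-3", "1-4", "1-4-4")
ROUTING_METHODS = ("MFI", "Random")


def build_conditions():
    """Return every combination of the four design factors (fully crossed)."""
    return list(itertools.product(
        CATEGORIES, TEST_LENGTHS, PANEL_DESIGNS, ROUTING_METHODS))


def mean_absolute_bias(true_theta, est_theta):
    """Assumed definition: mean of |estimate - true| over examinees."""
    n = len(true_theta)
    return sum(abs(e - t) for t, e in zip(true_theta, est_theta)) / n


def rmse(true_theta, est_theta):
    """Assumed definition: square root of the mean squared difference."""
    n = len(true_theta)
    return math.sqrt(
        sum((e - t) ** 2 for t, e in zip(true_theta, est_theta)) / n)


def pearson_correlation(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    conditions = build_conditions()
    print(len(conditions), "conditions; first:", conditions[0])

    # Toy values for illustration only; these are not results from the study.
    true_theta = [0.0, 1.0, 2.0]
    est_theta = [0.5, 1.0, 1.5]
    print(mean_absolute_bias(true_theta, est_theta))
    print(rmse(true_theta, est_theta))
    print(pearson_correlation(true_theta, est_theta))
```

In a simulation of this kind, such measures would typically be computed within each condition and replication and then averaged over replications; the exact aggregation used in the article is not stated in the abstract.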

Keywords

Computerized multistage tests, polytomous items, routing methods

References

Chen, L-Y. (2010). An investigation of the optimal test design for multi-stage test using the generalized partial credit model. [Doctoral dissertation, The University of Texas]. UT Electronic Theses and Dissertations. Retrieved from https://repositories.lib.utexas.edu/handle/2152/ETD-UT-2010-12-344
Donoghue, J. R. (1994). An empirical examination of the IRT information of polytomously scored reading items under the generalized partial credit model. Journal of Educational Measurement, 31(4), 295-311. https://doi.org/10.1111/j.1745-3984.1994.tb00448.x
Dodd, B. G., De Ayala, R. J., & Koch, W. R. (1995). Computerized Adaptive Testing with Polytomous Items. Applied Psychological Measurement, 19(1), 5-22. https://doi.org/10.1177/014662169501900103
Embretson, S. E., & Reise, S. P. (2013). Item response theory. Psychology Press.
Han, K. T. (2007). WinGen: Windows software that generates item response theory parameters and item responses. Applied Psychological Measurement, 31(5), 457-459. https://doi.org/10.1177/0146621607299271
Hendrickson, A. (2007). An NCME instructional module on multistage testing. Educational Measurement: Issues and Practice, 26(2), 44-52. https://doi.org/10.1111/j.1745-3992.2007.00093.x
ILOG. (2006). ILOG CPLEX 10.0 [User’s manual]. Paris, France: ILOG S.A. Retrieved from https://www.lix.polytechnique.fr/~liberti/teaching/xct/cplex/usrcplex.pdf
Kim, J., Chung, H., & Dodd, B. G. (2010, May). Comparing routing methods in the multistage test based on the partial credit model [Conference presentation]. In AERA, Denver, CO.
Kim, J., Chung, H., Park, R., & Dodd, B. G. (2013). A comparison of panel designs with routing methods in the multistage test with the partial credit model. Behavior Research Methods, 45, 1087–1098. https://doi.org/10.3758/s13428-013-0316-3
Luecht, R. M., & Nungester, R. J. (1998). Some practical examples of computer‐adaptive sequential testing. Journal of Educational Measurement, 35(3), 229-249. https://doi.org/10.1111/j.1745-3984.1998.tb00537.x
Luecht, R. M. (2000, April). Implementing the computer-adaptive sequential testing (CAST) framework to mass produce high quality computer-adaptive and mastery tests. [Conference presentation]. In NCME, New Orleans, LA. Retrieved from https://eric.ed.gov/?id=ED442823
Macken-Ruiz, C. L. (2008). A comparison of multi-stage and computerized adaptive tests based on the generalized partial credit model (Publication No. 3328282) [Doctoral dissertation, The University of Texas]. ProQuest Dissertations Publishing. Retrieved from https://www.proquest.com/docview/304482829?pq-origsite=gscholar&fromopenview=true
Magis, D., Yan, D., von Davier, A., & Magis, M. D. (2018). Package ‘mstR’. Retrieved from https://cran.r-project.org/web/packages/mstR/mstR.pdf
Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47(2), 149-174. https://doi.org/10.1007/BF02296272
Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm. ETS Research Report Series, 1992(1), i–30. https://doi.org/10.1002/j.2333-8504.1992.tb01436.x
Öztürk, N. B. (2019). How the Length and Characteristics of Routing Module Affect Ability Estimation in ca-MST? Universal Journal of Educational Research, 7(1), 164-170. https://doi.org/10.13189/ujer.2019.070121
R Core Team. (2018). R: A language and environment for statistical computing: R foundation for statistical computing.
Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometrika Monograph Supplement, 34 (17). Retrieved from https://psycnet.apa.org/record/1972-04809-001
Sari, H. I., & Raborn, A. (2018). What Information Works Best?: A Comparison of Routing Methods. Applied Psychological Measurement, 42(6), 499-515. https://doi.org/10.1177/0146621617752990
Sari, H.I., Yahsi Sari, H., & Huggins Manley, A.C. (2016). Computer adaptive multistage testing: Practical issues, challenges and principles. Journal of Measurement and Evaluation in Education and Psychology, 7(2), 388-406. https://doi.org/10.21031/epod.280183
Weiss, D. J. (1982). Improving Measurement Quality and Efficiency with Adaptive Testing. Applied Psychological Measurement, 6(4), 473–492. https://doi.org/10.1177/014662168200600408
Weiss, D. J. (1983). Latent trait theory and adaptive testing. In D. J. Weiss (Ed.), New horizons in testing (pp. 5-7). Academic Press.
Zenisky, A. L. (2004). Evaluating the effects of several multi-stage testing design variables on selected psychometric outcomes for certification and licensure assessment (Publication No. 5710) [Doctoral dissertation, University of Massachusetts Amherst]. UMass Amherst Libraries. https://scholarworks.umass.edu/dissertations_1/5710
Zenisky, A., Hambleton, R. K., & Luecht, R. M. (2009). Multistage Testing: Issues, Designs, and Research. In W. van der Linden & C. Glas (Eds.), Elements of Adaptive Testing. Springer.
Zurovac, J., Cook, T. D., Deke, J., Finucane, M. M., Chaplin, D., Coopersmith, J. S., ... & Forrow, L. V. (2021). Absolute and Relative Bias in Eight Common Observational Study Designs: Evidence from a Meta-analysis. https://arxiv.org/ftp/arxiv/papers/2111/2111.06941.pdf

There are 25 citations in total.

Details

Primary Language	English
Subjects	Testing, Assessment and Psychometrics (Other)
Journal Section	Articles
Authors	Hasibe Yahsi Sarı 0000-0002-0451-6034 Hülya Kelecioğlu 0000-0002-0741-9934
Publication Date	September 30, 2023
Acceptance Date	December 21, 2022
Published in Issue	Year 2023 Volume: 14 Issue: 3

Cite

APA	Yahsi Sarı, H., & Kelecioğlu, H. (2023). Ability Estimation with Polytomous Items in Computerized Multistage Tests. Journal of Measurement and Evaluation in Education and Psychology, 14(3), 171-184. https://doi.org/10.21031/epod.1056079
