Research Article

Investigation of Item Properties Affecting the Difficulty Index of PISA 2015 Reading Literacy Items

Year 2023, Volume: 11 Issue: 3, 567 - 579, 23.07.2023
https://doi.org/10.16916/aded.1212049

Abstract

The aim of this research is to investigate the item properties that affect the difficulty index of reading literacy items. For this purpose, the effects of item format, cognitive domain, and the interaction of these two variables on item difficulty are examined. The study group consists of 2418 students who responded to the reading subtest in the PISA 2015 Turkey sample. The analyses were carried out with Explanatory IRT models, a multilevel method. According to the results, open-ended items are significantly more difficult than multiple-choice items, and items in the integrate cognitive domain are significantly more difficult than items in the access and evaluate domains. In addition, constructed-response items are more suitable for measuring higher-order cognitive domains, while selected-response items are better for measuring achievement in lower-order cognitive domains.
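For readers unfamiliar with the approach, the item-explanatory model family the abstract refers to (Fischer, 1973; De Boeck & Wilson, 2004) can be sketched as follows. This is the generic LLTM-style form, not necessarily the authors' exact specification:

  \[ \operatorname{logit} P(y_{pi} = 1) = \theta_p - \beta_i, \qquad \theta_p \sim N(0, \sigma_\theta^2), \qquad \beta_i = \sum_k \eta_k q_{ik} \]

where y_{pi} is person p's scored response to item i, \theta_p is the person ability, and the item difficulty \beta_i is decomposed into weights \eta_k on coded item properties q_{ik}; here the properties are item format, cognitive domain, and their interaction.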

References

  • Ackerman, T. A., & Smith, P. L. (1988). A comparison of the information provided by essay, multiple-choice, and free-response writing tests. Applied Psychological Measurement, 12(2), 117-128.
  • Anderson, L., Krathwohl, D., Airasian, P., Cruikshank, K., Mayer, R., Pintrich, P., et al. (2000). A taxonomy for learning, teaching, and assessing: A revision of Bloom's taxonomy of educational objectives (Abridged ed.). New York: Allyn & Bacon.
  • Bacon, D. R. (2003). Assessing learning outcomes: A comparison of multiple-choice and short-answer questions in a marketing context. Journal of Marketing Education, 25(1), 31-36.
  • Badger, E., & Thomas, B. (1991). Open-ended questions in reading. Practical Assessment, Research, and Evaluation, 3(1), 4.
  • Bates, D., Maechler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1-48.
  • Becker, W. E., & Johnston, C. (1999). The relationship between multiple choice and essay response questions in assessing economics understanding. Economic Record, 75(4), 348-357.
  • Beller, M., & Gafni, N. (2000). Can item format (multiple choice vs. open-ended) account for gender differences in mathematics achievement? Sex Roles, 42(1), 1-21.
  • Bennett, R. E., Rock, D. A., Braun, H. I., Frye, D., Spohrer, J. C., & Soloway, E. (1990). The relationship of expert-system scored constrained free-response items to multiple-choice and open-ended items. Applied Psychological Measurement, 14(2), 151-162.
  • Bennett, R. E., Ward, W. C., Rock, D. A., & LaHart, C. (1990). Toward a framework for constructed-response items (ETS Research Report). Princeton, NJ: Educational Testing Service.
  • Ben-Simon, A., Budescu, D. V., & Nevo, B. (1997). A comparative study of measures of partial knowledge in multiple-choice tests. Applied Psychological Measurement, 21(1), 65-88.
  • Bible, L., Simkin, M. G., & Kuechler, W. L. (2008). Using multiple-choice tests to evaluate students' understanding of accounting. Accounting Education: An International Journal, 17(S1), S55-S68.
  • Birgili, B. (2014). Open ended questions as an alternative to multiple choice: Dilemma in Turkish examination system (Master's thesis). Middle East Technical University Institute of Social Sciences, Ankara.
  • Bloom, B. S., Krathwohl, D. R., & Masia, B. B. (1956). Taxonomy of educational objectives: The classification of educational goals. New York: McKay.
  • Brown, G. A., Bull, J., & Pendlebury, M. (2013). Assessing student learning in higher education. Routledge.
  • Bush, M. (2001). A multiple choice test that rewards partial knowledge. Journal of Further and Higher Education, 25(2), 157-163.
  • Coe, R., Waring, M., Hedges, L., & Day Ashley, L. (Eds.) (2021). Research methods and methodologies in education. SAGE Publications. https://us.sagepub.com/en-us/nam/research-methods-and-methodologies-in-education/book271175#description
  • Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory. Toronto: Holt, Rinehart and Winston.
  • Cruickshank, D. L., Bainer, D. L., & Metcalf, K. K (1995). The act of teaching. New York: McGraw-Hill.
  • De Boeck, P., & Wilson, M. (2004). Explanatory item response models: A generalized linear and nonlinear approach. New York, NY: Springer.
  • De Boeck, P. (2008). Random item IRT models. Psychometrika, 73(4), 533-559.
  • Demir, E. (2010). Uluslararası öğrenci değerlendirme programı (PISA) bilişsel alan testlerinde yer alan soru tiplerine göre Türkiye’de öğrenci başarıları [Student achievement in Turkey by question type in the PISA cognitive domain tests] (Unpublished master's thesis). Hacettepe Üniversitesi Sosyal Bilimler Enstitüsü, Ankara.
  • Dufresne, R. J., Leonard, W. J., & Gerace, W. J. (2002). Making sense of students' answers to multiple-choice questions. The Physics Teacher, 40(3), 174-180.
  • Fischer, G. H. (1973). The linear logistic test model as an instrument in educational research. Acta Psychologica, 37(6), 359–374.
  • Fulcher, G., & Davidson, F. (2007). Language testing and assessment. London and New York: Routledge.
  • Gardner, R. C., Tremblay, P. F., & Masgoret, A.-M. (1997). Towards a full model of second language learning: An empirical investigation. The Modern Language Journal, 81(3), 344-362.
  • Geer, J. G. (1988). What do open-ended questions measure? Public Opinion Quarterly, 52(3), 365–367.
  • Hancock, G. R. (1994). Cognitive complexity and the comparability of multiple-choice and constructed-response test formats. The Journal of Experimental Education, 62(2), 143-157.
  • Haynie, W. (1994). Effects of multiple-choice and short-answer tests on delayed retention learning. Journal of Technology Education, 6(1), 32-44.
  • Hmelo-Silver, C. E. (2004). Problem-based learning: What and how do students learn? Educational Psychology Review, 16(3), 235-266.
  • Hurd, A. W. (1932). Comparisons of short answer and multiple choice tests covering identical subject content. The Journal of Educational Research, 26(1), 28-30.
  • Jennings, S., & Bush, M. (2006). A comparison of conventional and liberal (free-choice) multiple-choice tests. Practical Assessment, Research, and Evaluation, 11(1), 8.
  • Kufahi, T. (2003). Measurement & evaluation in special education. Amman: Dar Almasira.
  • Lee, H.-S., Liu, O. L., & Linn, M. C. (2011). Validating measurement of knowledge integration in science using multiple-choice and explanation items. Applied Measurement in Education, 24(2), 115-136.
  • Lord, F. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Erlbaum.
  • Lukhele, R., Thissen, D., & Wainer, H. (1994). On the relative value of multiple-choice, constructed response, and examinee-selected items on two achievement tests. Journal of Educational Measurement, 31(3), 234–250.
  • Martinez, M. E. (1999). Cognition and the question of test item format. Educational Psychologist, 34(4), 207-218.
  • Melovitz Vasan, C. A., DeFouw, D. O., Holland, B. K., & Vasan, N. S. (2018). Analysis of testing with multiple choice versus open‐ended questions: Outcome‐based observations in an anatomy course. Anatomical Sciences Education, 11(3), 254-261.
  • Organisation for Economic Co-operation and Development [OECD]. (2017a). PISA 2015 technical report. Paris, France: OECD. Retrieved from https://www.oecd.org/pisa/data/2015-technical-report/.
  • Organisation for Economic Co-operation and Development [OECD]. (2017b). PISA 2015 technical report. Paris, France: OECD. Retrieved from https://www.oecd.org/pisa/data/2015-technical-report/.
  • Ormell, C. P. (1974). Bloom's taxonomy and the objectives of education. Educational Research, 17, 3-18.
  • Osterlind, S. J. (1998). Constructing test items: Multiple-choice, constructed-response, performance, and other formats. Dordrecht, Netherlands: Kluwer Academic.
  • Pepple, D. J., Young, L. E., & Carroll, R. G. (2010). A comparison of student performance in multiple-choice and long essay questions in the MBBS stage I physiology examination at the University of the West Indies (Mona Campus). American Journal of Physiology - Advances in Physiology Education, 34(2), 86-89.
  • Phipps, S. D., & Brackbill, M. L. (2009). Relationship between assessment item format and item performance characteristics. American Journal of Pharmaceutical Education, 73(8).
  • Pollack, J. M., Rock, D. A., & Jenkins, F. (1992). Advantages and disadvantages of constructed-response item formats in large-scale surveys. Paper presented at the Annual Meeting of the American Educational Research Association, San Francisco.
  • Powell, J. L., & Gillespie, C. (1990). Assessment: All tests are not created equally. Paper presented at the Annual Meeting of the American Reading Forum, Sarasota.
  • Robbins, A. (1995). İçindeki devi uyandır [Awaken the giant within] (B. Çorakçı Dişbudak, Trans.). İstanbul: İnkılap Yayınevi.
  • Ruch, G. M., & Stoddard, G. D. (1925). Comparative reliabilities of five types of objective examinations. Journal of Educational Psychology, 16(2), 89.
  • Traub, R. E., & Fisher, C. W. (1977). On the equivalence of constructed-response and multiple-choice tests. Applied Psychological Measurement, 1(3), 355-369.
  • Van den Bergh, H. (1990). On the construct validity of multiple-choice items for reading comprehension. Applied Psychological Measurement, 14(1), 1-12.
  • Ventouras, E., Triantis, D., Tsiakas, P., & Stergiopoulos, C. (2010). Comparison of examination methods based on multiple-choice questions and constructed-response questions using personal computers. Computers & Education, 54(2), 455-461.
  • Wainer, H., & Thissen, D. (1993). Combining multiple-choice and constructed-response test scores: Toward a Marxist theory of test construction. Applied Measurement in Education, 6(2), 103-118.
  • Walstad, W. B., & Becker, W. E. (1994). Achievement differences on multiple-choice and essay tests in economics. The American Economic Review, 84(2), 193–196.
  • Walstad, W. B. (1998). Multiple choice tests for the economics course. In W. B. Walstad & P. Saunders (Eds.), Teaching undergraduate economics: A handbook for instructors (pp. 287-304). New York: McGraw-Hill.
  • Zeidner, M. (1987). Essay versus multiple-choice type classroom exams: the student’s perspective. The Journal of Educational Research, 80(6), 352-358.

PISA 2015 Okuma Becerisi Maddelerinin Güçlük İndeksini Etkileyen Madde Özelliklerinin İncelenmesi


Abstract

The aim of this study is to determine the item properties that affect the difficulty index of reading literacy items. To this end, the effects of item format, item cognitive-domain level, and the interaction of these two variables on item difficulty were examined. The study group consists of 2418 students who responded to the reading literacy subtest in the PISA 2015 Turkey administration. The analyses were carried out with Explanatory IRT models, a multilevel method. The results show that open-ended items are significantly more difficult than multiple-choice items, and that items in the integrate-and-interpret cognitive domain are significantly more difficult than items at the access and evaluate levels. When the interaction of item format and cognitive domain is examined, asking integrate-and-interpret items in an open-ended format was found to make them easier, whereas asking access-level items in an open-ended format was found to make them more difficult.
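Because the reference list includes lme4 (Bates et al.) and De Boeck's (2008) random-item formulation, explanatory IRT models of this kind are commonly fit in R as cross-classified logistic mixed models. The sketch below is a minimal illustration under that assumption, not the authors' actual script; the data frame and column names (pisa_long, resp, person, item, format, domain) are hypothetical.

  # Minimal explanatory IRT sketch with lme4; a guess at the general
  # recipe, not the analysis script used in the study.
  # Assumed long-format data (hypothetical names): one row per
  # person-item response, with columns
  #   resp   : scored response (0/1)
  #   person : student id
  #   item   : item id
  #   format : item format (multiple-choice vs. open-ended)
  #   domain : cognitive domain (access / integrate / evaluate)
  library(lme4)

  fit <- glmer(
    resp ~ format * domain +  # item properties and their interaction (fixed)
      (1 | person) +          # random person abilities
      (1 | item),             # random item residuals (De Boeck, 2008)
    family = binomial("logit"),
    data = pisa_long
  )
  summary(fit)  # fixed effects show how format and domain shift item easiness

In this parameterization a positive fixed-effect coefficient raises the probability of a correct response, so item categories that are harder show negative coefficients on the logit scale.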


Details

Primary Language Turkish
Subjects Other Fields of Education
Journal Section Makaleler (Articles)
Authors

Sinem Demirkol 0000-0002-9526-6156

Merve Ayvallı Karagöz 0000-0002-7301-0096

Publication Date July 23, 2023
Published in Issue Year 2023, Volume: 11 Issue: 3

Cite

APA Demirkol, S., & Ayvallı Karagöz, M. (2023). PISA 2015 Okuma Becerisi Maddelerinin Güçlük İndeksini Etkileyen Madde Özelliklerinin İncelenmesi. Ana Dili Eğitimi Dergisi, 11(3), 567-579. https://doi.org/10.16916/aded.1212049