
Project

Applications of refined psychometric models for (inter)national large-scale assessments

The present dissertation covers four studies that investigate different aspects of (inter)national large-scale assessments by extending standard psychometric models to address issues concerning (a) screening for cheating in high-stakes assessments; (b) changes in students’ ability during a test; (c) dealing with missing responses; and (d) response style patterns in student self-reported questionnaires. 

High-stakes assessments can influence students or schools directly (Au, 2007) or indirectly (Amrein & Berliner, 2002), increasing the occurrence of unfair test behavior. Chapter 2 proposes a Bayesian method for screening and detecting schools with unexpected outcomes given their previous results, using influence analysis in a beta-inflated mean regression model. The methodology is applied to a 2016 Peruvian high-stakes large-scale reading assessment for 4th-grade students, focusing on the non-Spanish-speaking students.
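As a much-simplified illustration of this screening idea (not the Bayesian influence analysis of Chapter 2 itself), the sketch below flags schools whose current proportion-correct falls outside a predictive interval implied by a beta model conditioned on previous results; the coefficients, precision parameter, and data are all invented for the example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulated school-level data: previous-year and current-year proportion-correct.
n_schools = 200
prev = rng.beta(5, 3, n_schools)                  # previous school results
mu = 1 / (1 + np.exp(-(-0.5 + 2.0 * prev)))      # expected current mean (logit link)
phi = 50.0                                        # assumed precision parameter
curr = rng.beta(mu * phi, (1 - mu) * phi)

# Plant one school with an implausibly high score given its history.
curr[0] = 0.999

def flag_unexpected(prev, curr, phi=50.0, alpha=0.01):
    """Flag schools whose current score falls outside the (1 - alpha)
    predictive interval of a beta model given previous results."""
    mu = 1 / (1 + np.exp(-(-0.5 + 2.0 * prev)))   # coefficients assumed known here
    lo = stats.beta.ppf(alpha / 2, mu * phi, (1 - mu) * phi)
    hi = stats.beta.ppf(1 - alpha / 2, mu * phi, (1 - mu) * phi)
    return (curr < lo) | (curr > hi)

flags = flag_unexpected(prev, curr)
print(bool(flags[0]), int(flags.sum()))  # the planted school is flagged
```

In the dissertation the flagging is instead based on influence diagnostics within a fitted Bayesian model; this interval check only conveys the screening intuition.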

Students’ performance may increase or decrease during test administration. This change can be modeled as a function of item position when the test booklet design includes item-order manipulations. Chapter 3 uses an explanatory item response theory framework to analyze item position effects in the 2012 European Survey on Language Competences (ESLC). Consistent practice effects were found for listening, but no systematic item position effects were found for reading.
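A minimal sketch of how an item position effect can enter such an explanatory model: a Rasch-type success probability with a linear-in-position term. The functional form and all parameter values here are illustrative assumptions, not the ESLC estimates.

```python
import numpy as np

def p_correct(theta, b, pos, delta):
    """Explanatory Rasch model with a linear item-position effect:
    logit P(X = 1) = theta - b + delta * pos, where pos is the item's
    position in the booklet and delta the position effect."""
    return 1 / (1 + np.exp(-(theta - b + delta * pos)))

# A positive delta (a practice effect, as found for listening) makes the
# same item easier for the same student when it appears later in the booklet.
theta, b = 0.0, 0.2
early = p_correct(theta, b, pos=1, delta=0.05)
late = p_correct(theta, b, pos=20, delta=0.05)
print(round(early, 3), round(late, 3))  # 0.463 0.69
```

A negative delta would instead capture a fatigue effect, and delta = 0 recovers the standard Rasch model.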

In large-scale assessments, students may not be compelled to answer every test item. Chapter 4 investigates how the treatment of these missing responses affects item calibration and ability estimation. A simulation study shows that, when the proportion of omitted answers is high, an IRTree model maintains higher accuracy than traditional imputation methods. Country means re-estimated with the IRTree approach on Progress in International Reading Literacy Study (PIRLS) data from 2006 to 2016 correlated strongly with the official PIRLS results and did not substantially alter the country rankings.
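One way to picture the IRTree idea for omissions (a simplified two-node sketch, not necessarily the exact tree of Chapter 4) is to expand each item into a "responded vs. omitted" pseudo-item and a "correct vs. incorrect" pseudo-item that is only observed when the student responded:

```python
import numpy as np

# Responses coded: 1 = correct, 0 = incorrect, -9 = omitted (codes are illustrative).
resp = np.array([[1, 0, -9],
                 [0, -9, 1]])

def irtree_recode(resp, omit_code=-9):
    """Expand each item into the two pseudo-items of a two-node IRTree:
    node A: did the student respond? (1 = responded, 0 = omitted)
    node B: given a response, was it correct? (NaN when omitted)"""
    responded = (resp != omit_code).astype(float)
    correct = np.where(resp == omit_code, np.nan, resp).astype(float)
    return responded, correct

node_a, node_b = irtree_recode(resp)
print(node_a)
print(node_b)
```

Separate IRT models are then fit to the two pseudo-item sets, so the propensity to omit and the ability to answer correctly are modeled as distinct latent dimensions.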

When answering self-report questionnaires, participants may adhere to an extreme response style, systematically amplifying their answers by preferring the endpoint categories (Böckenholt, 2013). Chapter 5 describes an IRTree approach to investigate such response styles and analyzes the impact of extreme responding on self-report surveys of motivation towards science and mathematics administered to 8th-grade students in the 2019 cycle of the Trends in International Mathematics and Science Study (TIMSS). Students differ systematically in their choice of extreme response categories, but this tendency is related neither to student background characteristics nor to their test performance.
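The same tree logic can be sketched for extreme response styles: a hypothetical two-node decomposition of a 4-point Likert item into a direction node (agree vs. disagree) and an extremity node (endpoint vs. midpoint category). The exact tree used in Chapter 5 may differ; this only shows the recoding principle.

```python
import numpy as np

# 4-point Likert responses: 1 = strongly disagree ... 4 = strongly agree.
likert = np.array([1, 2, 3, 4])

def ers_tree(x):
    """Two-node IRTree decomposition of a 4-point Likert response:
    direction node: agree (1) vs. disagree (0)
    extremity node: endpoint category chosen (1) vs. midpoint (0)"""
    direction = (x >= 3).astype(int)
    extremity = np.isin(x, (1, 4)).astype(int)
    return direction, extremity

d, e = ers_tree(likert)
print(d)  # direction pseudo-item
print(e)  # extremity pseudo-item
```

Fitting an IRT model to the extremity pseudo-items yields a person-level extreme-response-style trait that can be correlated with background variables and achievement, as done in Chapter 5.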

Date: 1 Aug 2018 → 27 Oct 2021
Keywords: IRT
Disciplines: Education curriculum, Education systems, General pedagogical and educational sciences, Specialist studies in education, Other pedagogical and educational sciences
Project type: PhD project