To understand the impact of a teaching intervention (e.g., a course or a professional development program) on what students gained from that experience, researchers often administer a test before and after the intervention and study the gains computed by subtracting the pre-test scores from the post-test scores. Similarly, researchers use differences in test scores to compare performance across groups. Using score differences in this way, however, requires the assumption that the test items measure the same construct (e.g., knowledge) in the same way at different points in time (e.g., pre- and post-test) or for different groups of participants (e.g., pre-service and practicing teachers).
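As a minimal illustration of this gain-score logic (with made-up numbers, not data from the study), the computation is a per-student subtraction followed by an average:

import numpy as np

# Hypothetical pre- and post-test scores for five students
# (illustrative values only, not the study's data).
pre = np.array([8, 11, 9, 12, 10])
post = np.array([10, 12, 11, 13, 12])

gains = post - pre                  # each student's gain score
print("Mean gain:", gains.mean())   # average gain across students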
Checking this assumption of measurement invariance (i.e., that the same test items measure the same construct in the same way) is important when comparing amounts of, or gains on, a construct, because an observed difference in scores could reflect the items measuring different constructs rather than a difference in the same target construct. For example, different groups of participants could interpret the same wording differently because of their demographics or educational background. Similarly, participants could react differently to the same item content depending on when the item is administered. As such, we cannot take for granted that administering the same assessment items guarantees that those items measure the same thing across groups of participants or over time. To validly compare a measured construct across groups or time points, it is recommended that a test of measurement invariance be performed; that is, it is important to demonstrate that the way in which items relate to the target construct (e.g., MKT-G) is equivalent across the compared populations and over time. The statistical technique used to test this invariance in our study is multi-group confirmatory factor analysis (Brown, 2006). In this note, we present how we used measurement invariance tests to estimate the gain in GeT students' mathematical knowledge for teaching geometry (MKT-G) from before to after taking GeT courses, and how their post-test scores differ from practicing teachers' MKT-G.
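To make the idea of a multi-group confirmatory factor analysis concrete, here is a minimal sketch in Python using the semopy package (our choice for illustration; the study's analysis was not necessarily run with this software). The file name, the item columns item1 through item17, and the group column are all hypothetical. Fitting the same single-factor model separately in each group corresponds to the configural step, where only the structure of the construct is required to match:

import pandas as pd
from semopy import Model  # semopy: a Python structural equation modeling package

# Hypothetical data: one row per respondent, columns item1..item17,
# plus a 'group' column (e.g., 'pre_service' vs. 'practicing').
data = pd.read_csv("mktg_responses.csv")

# One latent factor (MKT-G) measured by all 17 items.
items = " + ".join(f"item{i}" for i in range(1, 18))
desc = f"MKTG =~ {items}"

# Configural check: the same single-factor structure is fit in each group.
for group, df in data.groupby("group"):
    model = Model(desc)
    model.fit(df.drop(columns="group"))
    print(group)
    print(model.inspect())  # per-group parameter estimates (e.g., loadings)

Metric and scalar invariance then add cross-group equality constraints on the loadings and intercepts, respectively, which dedicated multi-group routines test formally.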
Using the 17 MKT-G items developed by Herbst's research group, we examined the participating GeT students' MKT-G growth over the duration of the course. Also, by scaling that growth against a distribution of practicing teachers' MKT-G scores, we approximated GeT students' growth in terms of in-service teachers' years of experience. Estimating a construct (here, MKT-G) from a set of item responses assumes that the common variance among the responses is accounted for by the construct and that the relationship between an item's score and the latent construct is a linear function. The slope of this function, with the level of the latent construct on the horizontal axis and the item score on the vertical axis, is the item's factor loading, representing the magnitude of the relationship between the item and MKT-G. The intercept is the predicted item score when the level of MKT-G is zero. Thus, the equivalence in the way items relate to the target construct across groups can be examined by testing the equality of the structure of the construct (configural invariance), the factor loadings (metric invariance), and the item intercepts (scalar invariance). We tested the equivalence of item parameters simultaneously, not only between GeT students and practicing teachers but also between GeT students' pre-test and their post-test.
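In symbols (a standard linear factor model, restating the description above), the score of person i on item j in group or time point g can be written in LaTeX as

x_{ij}^{(g)} = \tau_j^{(g)} + \lambda_j^{(g)} \, \eta_i^{(g)} + \varepsilon_{ij}^{(g)}

where \tau_j^{(g)} is the item intercept, \lambda_j^{(g)} the factor loading, \eta_i^{(g)} the person's MKT-G level, and \varepsilon_{ij}^{(g)} a residual. Configural invariance requires the same pattern of nonzero loadings in every g; metric invariance adds \lambda_j^{(g)} = \lambda_j for all g; scalar invariance further adds \tau_j^{(g)} = \tau_j. Because each level adds equality constraints to the previous one, the levels form nested models that can be compared statistically.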
The results of the subsequent invariance tests suggested that the relationship of the items to the measured knowledge was at least partially equivalent between GeT students' pre- and post-tests, as well as between GeT students and practicing teachers. Here, partial equivalence means that we were able to establish equivalence between the groups and time points after allowing unequal item parameters (item factor loadings or item intercepts) for 9 of the 17 items. As we were able to establish comparable scales, we proceeded to calculate the GeT students' MKT-G growth and compare it to the practicing teachers' MKT-G.
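Adjacent invariance levels are typically compared with a chi-square difference test, since the more constrained model is nested in the less constrained one. A minimal sketch with hypothetical fit statistics (not our actual model output):

from scipy.stats import chi2

# Hypothetical chi-square statistics and degrees of freedom for two
# nested invariance models (illustrative values, not our model output).
chisq_metric, df_metric = 210.4, 118   # equal loadings
chisq_scalar, df_scalar = 245.9, 132   # equal loadings and intercepts

# Under the null hypothesis that the added constraints hold, the
# difference statistic follows a chi-square distribution.
delta_chisq = chisq_scalar - chisq_metric
delta_df = df_scalar - df_metric
p_value = chi2.sf(delta_chisq, delta_df)

print(f"Δχ² = {delta_chisq:.1f}, Δdf = {delta_df}, p = {p_value:.3f}")

A significant difference indicates that the added constraints do not hold for all items; freeing the offending items' parameters, as we did for 9 of the 17 items, yields a partial invariance model.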
The comparison of the scores suggested that, on average, GeT students scored about 0.25 SD units higher on the MKT-G test after completing the GeT course, but their post-test scores were still 1.04 SD units below those of practicing teachers who took the same test. This result suggests a positive association between college geometry courses designed for future teachers and mathematical knowledge for teaching geometry, in terms of the knowledge growth of the students who took the courses. Additionally, examining this association contributes to research methodology by showing how to establish comparable scales of knowledge gains between two different teacher populations (e.g., pre-service teachers and in-service teachers).
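To see how these standardized quantities combine, here is the arithmetic with the reported values, under the assumption (ours, for illustration) that the gain and the gap are expressed on the same standardized scale:

# Reported standardized differences from the text.
gain_sd = 0.25      # GeT students' average pre-to-post gain, in SD units
gap_post_sd = 1.04  # post-test gap below practicing teachers, in SD units

# Implied gap before the course, on the same scale; this simple sum is
# meaningful only because the invariance tests established a common scale.
gap_pre_sd = gap_post_sd + gain_sd
print(f"Implied pre-test gap: {gap_pre_sd:.2f} SD units")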
Reference
Brown, T. A. (2006). Confirmatory factor analysis for applied research. Guilford Press.