Volume 22 (2026) - In progress

Construct Validity of Computer-Based Physics Test (CBPsyT) Through Confirmatory and Item Response Theory: A Complementary Approach to Test Development

Article Number: e2026054  |  Available Online: April 2026  |  DOI: 10.22521/edupij.2026.22.54

Godelfridus Hadung Lamanepa, Edi Istiyono, Raden Rosnawati, Ayen Arsisari, Fitriyani Hali

Abstract

Background/purpose. This study aims to validate the construct of the computer-based physics test (CBPsyT) through confirmatory factor analysis (CFA) to ensure that the test is suitable for its intended use. However, CFA studies can rest on weak theoretical perspectives, omit tests of alternative theoretical views, or provide insufficient evidence of construct validity, so this study complements CFA with item response theory (IRT) analysis.

Materials/methods. The research is exploratory and uses a quantitative approach. We developed CBPsyT following a twelve-step test development procedure and administered it to 516 students. Construct validity was analyzed from the CFA and IRT perspectives using the R program.

Results. The indicators and the latent constructs, with multiple representations in CBPsyT, are strongly related, as shown by indicator loadings > 0.70. Internal consistency is demonstrated by a composite reliability > 0.80, and the CFA model fit values indicate that the hypothesized model is consistent with the empirical data. The IRT results add that the discrimination and difficulty parameters range from -2 to +2. The test information function indicates that the instrument is reliable for ability (theta) scores ranging from -3 to +3.

Conclusion. The test quality profile was complete when CFA and IRT were combined. To the best of our knowledge, this article provides practical information on the psychometric characteristics of CBPsyT and guides new researchers in demonstrating construct validity.

Keywords: Construct validity, CFA, IRT, CBPsyT, multiple representation test, physics construct
