The Theory and Practice of Educational Data Forensics


Although the testing community has tried to prevent test fraud through a range of practices and methods (e.g., strict security practices and testing protocols), test fraud remains a ubiquitous problem. Exact numbers are unknown, but self-report studies show that up to 85% of students admit to committing test fraud at least once during their school career (Lin and Wen, Higher Educ 54:85–97, 2007; Hughes and McCabe, Can J Higher Educ 36:1–12, 2006; Berkhout et al., Studie and Werk 2011. SEO Economisch Onderzoek, Amsterdam, 2011). Research on the statistical detection of test fraud, also called educational data forensics (EDF), has existed since the 1920s (Bird, School Soc 25:261–262, 1927), but the body of research has grown considerably since the 1970s (e.g., Angoff, J Am Stat Assoc 69:44–49, 1974). Today, the literature offers many methods and models. Two of these are the Guttman error model (Guttman, Am Sociol Rev 9(2):139–150, 1944; Meijer, Appl Psychol Meas 18(4):311–314, 1994) and the log-normal response time model (Van der Linden, Psychometrika 80(3):689–706, 2006). The first part of this chapter discusses both models. The second part presents an empirical study of how the Guttman error model and the response time model function in practice. The final part presents the design, development, and validation of a protocol for the use of EDF.

In Theoretical and Practical Advances in Computer-based Educational Measurement