Test Mantel–Haenszel oraz modelowanie IRT jako narzędzia wykrywania DIF i opisu jego wielkości na przykładzie zadań ocenianych dychotomicznie

Kondratek, Bartosz; Grudniewska, Magdalena

Article details

Journal

Edukacja

2013 | 2(122) | 34–55

Article title

Test Mantel–Haenszel oraz modelowanie IRT jako narzędzia wykrywania DIF i opisu jego wielkości na przykładzie zadań ocenianych dychotomicznie

Authors

Bartosz Kondratek, Magdalena Grudniewska

Title variants

EN

Comparison of Mantel–Haenszel test with IRT procedures for DIF detection and effect size estimation for dichotomous items

Languages of publication

PL

Abstracts

PL

Artykuł porównuje dwie metody wykorzystywane do identyfikacji zróżnicowanego funkcjonowania zadań (DIF) ocenianych dychotomicznie: nieparametryczne rozwiązanie opierające się na statystyce Mantela–Haenszela (MH) oraz podejście bazujące na teście ilorazu funkcji wiarygodności. Porównanie przeprowadzono na gruncie teoretycznym i za pomocą symulacji. Wyniki symulacji potwierdziły przypuszczenie, że podejście opierające się na statystyce MH jest bardziej czułe na jednorodne efekty DIF, jednak traci moc, gdy wielkość DIF zmienia się w zależności od poziomu zmiennej ukrytej mierzonej testem. Oprócz mocy statystycznej analizowano również specyficzne miary wielkości efektu DIF stosowane w obu metodach: miarę MH D – DIF, wykorzystywaną standardowo przez Educational Testing Service do klasyfikacji wielkości DIF, oraz różne miary P – DIF określone na metryce łatwości zadania.

EN

The article compares two methods used to detect differential item functioning (DIF) of dichotomously scored items: a nonparametric solution based on the Mantel–Haenszel (MH) procedure and a parametric IRT approach with a likelihood ratio (LR) test. A Monte Carlo experiment was performed to evaluate the performance of both statistics under various conditions of DIF uniformity. The results confirmed the theoretical prediction that the MH test has greater statistical power than the LR test in detecting uniform DIF and less power than the LR test in cases of non-uniform DIF. Apart from statistical power, specific measures of DIF effect size were compared: MH D–DIF and three measures of P–DIF expressed on the item easiness scale.
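
As a rough illustration of the MH D–DIF effect size named in the abstract, the sketch below computes the Mantel–Haenszel common odds ratio across matched score levels and converts it to the delta scale with the commonly used transformation MH D–DIF = −2.35 · ln(α_MH). The function name `mh_d_dif`, the layout of the input tuples, and the counts in the usage note are assumptions made for this example; they are not taken from the article, whose exact computations may differ.

```python
import math


def mh_d_dif(strata):
    """Return (alpha_MH, MH D-DIF) for one dichotomous item.

    `strata` is an iterable of (a, b, c, d) counts, one tuple per
    matched score level (hypothetical layout chosen for this sketch):
      a = reference group, correct    b = reference group, incorrect
      c = focal group, correct        d = focal group, incorrect
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty score level carries no information
        num += a * d / n
        den += b * c / n
    if num <= 0 or den <= 0:
        raise ValueError("common odds ratio is undefined for these counts")
    alpha = num / den                      # MH common odds ratio
    return alpha, -2.35 * math.log(alpha)  # delta-scale effect size
```

For example, `mh_d_dif([(30, 10, 20, 20)])` gives a common odds ratio of 3.0 and an MH D–DIF of about −2.58; under the sign convention used here, a negative value means the item is harder for the focal group than for matched members of the reference group.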

Keywords

PL

zróżnicowane funkcjonowanie zadań; DIF; test Mantel–Haenszel; IRT

EN

Differential item functioning; Mantel–Haenszel test; item response theory

Publisher

Instytut Badań Edukacyjnych

Dates

issued

2013-06-30

Contributors

author

Bartosz Kondratek

Instytut Badań Edukacyjnych

author

Magdalena Grudniewska

Instytut Badań Edukacyjnych

References

Agresti, A. (2002). Categorical data analysis. New Jersey: John Wiley & Sons.
Dorans, N. J. and Holland, P. W. (1993). DIF detection and description: Mantel–Haenszel and standardization. In: P. W. Holland and H. Wainer (Eds.), Differential item functioning (pp. 35–66). Hillsdale, NJ: Lawrence Erlbaum.
Glas, C. A. (2010). Preliminary manual of the software program Multidimensional Item Response Theory (MIRT). Enschede: University of Twente.
Kondratek, B. (2012). Bias of IRT observed score equating under NEAT design. Poster presented at the Modern Modelling Methods conference, Storrs, Connecticut.
Lord, F. M. (1983). Statistical bias in maximum likelihood estimators of item parameters. Psychometrika, 48(3), 425–435.
Lord, F. M. and Novick, M. R. (1968). Statistical theories of mental test scores. Reading, Massachusetts: Addison-Wesley.
Mantel, N. and Haenszel, W. (1959). Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute, 22(4), 719–748.
Monahan, P. O., McHorney, C. A., Stump, T. E. and Perkins, A. J. (2007). Odds ratio, delta, ETS classification, and standardization measures of DIF magnitude for binary logistic regression. Journal of Educational and Behavioral Statistics, 32(1), 92–109.
Penfield, R. D. and Camilli, G. (2007). Differential item functioning and item bias. In: C. R. Rao and S. Sinharay (Eds.), Handbook of statistics, Vol. 26. Psychometrics (pp. 125–167). New York, NY: Elsevier.
Radhakrishna, S. (1965). Combination of results from several 2 × 2 contingency tables. Biometrics, 21(1), 86–98.
Swaminathan, H. and Rogers, H. J. (1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27(4), 361–370.
Thissen, D., Steinberg, L. and Wainer, H. (1993). Detection of differential item functioning using the parameters of item response models. In: P. W. Holland and H. Wainer (Eds.), Differential item functioning (pp. 67–113). Hillsdale, NJ: Lawrence Erlbaum.
Wainer, H. (1993). Model-based standardized measurement of an item's differential impact. In: P. W. Holland and H. Wainer (Eds.), Differential item functioning (pp. 255–276). Hillsdale, NJ: Lawrence Erlbaum.
Woolf, B. (1955). On estimating the relation between blood group and disease. Annals of Human Genetics, 19(4), 251–253.
Zieky, M. (1993). Practical questions in the use of DIF statistics in test development. In: P. W. Holland and H. Wainer (Eds.), Differential item functioning (pp. 337–348). Hillsdale, NJ: Lawrence Erlbaum.
Zieky, M. (2003). A DIF primer. Princeton, NJ: ETS.

Notes

http://www.edukacja.ibe.edu.pl/images/numery/2013/2-3-kondratek-grudniewska-test-mantel-haenshel.pdf

Identifiers

ISSN

0239-6858

YADDA identifier

bwmeta1.element.desklight-9c96d78c-0f8e-4319-a017-0bfd0ab2a5e8
