The handling of missing binary data in language research
Languages of publication
Researchers are frequently confronted with unanswered questions or items on their questionnaires and tests, due to factors such as item difficulty, lack of testing time, or participant distraction. This paper first presents results from a poll confirming previous claims (Rietveld & van Hout, 2006; Schafer & Gra- ham, 2002) that data replacement and deletion methods are common in research. Language researchers declared that when faced with missing answers of the yes/no type (that translate into zero or one in data tables), the three most common solutions they adopt are to exclude the participant’s data from the analyses, to leave the square empty, or to fill in with zero, as for an incorrect answer. This study then examines the impact on Cronbach’s α of five types of data insertion, using simulated and actual data with various numbers of participants and missing percentages. Our analyses indicate that the three most common methods we identified among language researchers are the ones with the greatest impact n Cronbach's α coefficients; in other words, they are the least desirable solutions to the missing data problem. On the basis of our results, we make recommendations for language researchers concerning the best way to deal with missing data. Given that none of the most common simple methods works properly, we suggest that the missing data be replaced either by the item’s mean or by the participants’ overall mean to provide a better, more accurate image of the instrument’s internal consistency.
- Universite du Quebec- Teluq, Canada, firstname.lastname@example.org
- Universite de Montreal, Canada, email@example.com
- Universiteit Utrecht, the Netherlands, firstname.lastname@example.org
- Uniwersytet Jagieloński Polska , email@example.com
Publication order reference