Ottawa 2024

Development of an automated tool to identify Item-Writing Flaws in Multiple-Choice Questions in high-stakes examinations.

ePoster Presentation

4:35 pm, 28 February 2024


Presentation Description

João Pedro Monteiro¹,², José Miguel Pêgo¹,² and Carlos Fernando Collares³
¹ Life and Health Sciences Research Institute (ICVS), School of Medicine, University of Minho, Braga, Portugal.
² ICVS/3B's, PT Government Associate Laboratory, Braga, Portugal.
³ University of Minho




Background:
Multiple-choice questions (MCQs) are a widely used assessment method because they can evaluate knowledge and skills objectively. However, MCQs are subject to flaws whose detection is time- and resource-consuming. We aimed to develop open-access software that automatically identifies Item-Writing Flaws (IWFs) in MCQs. (1, 2)


Summary of work:
A database of 1150 items from high-stakes Portuguese examinations was analyzed. The criteria established by Rush et al. (2016) were used to identify IWFs (3). Accuracy was calculated, and statistical analyses of the degree of agreement, the systematic effect and validity were performed by comparing the automated and manual analyses.
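
As an illustration only, and not the authors' actual implementation, a criteria-based screen of this kind can be sketched as simple keyword rules applied to each item; the flaw categories and keyword lists below are assumptions chosen for the example.

```python
import re

# Hypothetical keyword heuristics for three IWF categories described by Rush et al. (2016);
# the rules and word lists actually used by the tool are not specified in this abstract.
NEGATIVE_STEM = re.compile(r"\b(not|except|false|incorrect)\b", re.IGNORECASE)
ABSOLUTE_TERMS = re.compile(r"\b(always|never|all|none|only)\b", re.IGNORECASE)
TRUE_FALSE_OPTIONS = {"true", "false"}

def screen_item(stem: str, options: list[str]) -> dict[str, bool]:
    """Flag a single MCQ for a subset of item-writing flaws."""
    return {
        "negative_stem": bool(NEGATIVE_STEM.search(stem)),
        "absolute_terms": any(ABSOLUTE_TERMS.search(o) for o in options),
        "true_false_options": any(o.strip().lower() in TRUE_FALSE_OPTIONS for o in options),
    }

# Example: an item flagged for a negative stem and an absolute term.
flags = screen_item(
    "Which of the following is NOT a feature of nephrotic syndrome?",
    ["Proteinuria", "Oedema is always absent", "Hypoalbuminaemia", "Hyperlipidaemia"],
)
print(flags)  # {'negative_stem': True, 'absolute_terms': True, 'true_false_options': False}
```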


Results:
The developed tool correctly identified 78.0% of the items with at least one flaw. For individual IWF categories, sensitivity was above 61.29% and specificity above 91.0%, with accuracy above 92%. Five tests showed substantial or almost perfect agreement, and only two tests showed a statistically significant result.
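
For context, the reported sensitivity, specificity, accuracy and agreement statistics can all be derived from a 2×2 table comparing the automated flags with the manual review. The sketch below shows the standard calculations; the counts are purely illustrative and are not the study's data.

```python
def agreement_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Sensitivity, specificity, accuracy and Cohen's kappa for automated vs. manual flags."""
    n = tp + fp + fn + tn
    po = (tp + tn) / n                                              # observed agreement (= accuracy)
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2     # agreement expected by chance
    return {
        "sensitivity": tp / (tp + fn),   # flawed items correctly flagged
        "specificity": tn / (tn + fp),   # flaw-free items correctly passed
        "accuracy": po,
        "kappa": (po - pe) / (1 - pe),
    }

# Illustrative counts only; the abstract does not report the underlying confusion matrix.
print(agreement_metrics(tp=80, fp=9, fn=20, tn=91))
# ≈ sensitivity 0.80, specificity 0.91, accuracy 0.855, kappa 0.71 (substantial agreement)
```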


Discussion:
The tool demonstrated good efficacy for most IWF categories, particularly negative stems, true/false options and absolute terms. The overall higher specificity of the tests indicates that the tool is more reliable at identifying items without IWFs than items with them.


Conclusions:
The tool showed promising results, with high accuracy, sensitivity and specificity, although it had limitations for specific types of IWFs. The software is highly reliable at identifying items with at least one IWF, but less reliable at identifying the number of IWFs in each item. Further research and development are necessary, including the use of Artificial Intelligence tools to enhance IWF identification.


Take home messages:
We have shown that it is possible to identify IWFs automatically and reduce the burden of MCQ review using software solutions, which could also help reduce the cost and time needed to construct excellent MCQs.



References

  1. Case SM, Swanson DB. Constructing Written Test Questions for the Basic and Clinical Sciences. 3rd ed. Philadelphia: National Board of Medical Examiners; 2002. Available from: http://www.nbme.org/PDF/ItemWriting_2003/2003IWGwhole.pdf

  2. Rudner LM. In: van der Linden WJ, Glas CAW, editors. Elements of Adaptive Testing [Internet]. New York, NY: Springer; 2010. p. 151–152. Available from: https://link.springer.com/10.1007/978-0-387-85461-8

  3. Rush BR, Rankin DC, White BJ. The impact of item-writing flaws and item complexity on examination item difficulty and discrimination value. BMC Med Educ [Internet]. 2016;16(1):1–10. Available from: http://dx.doi.org/10.1186/s12909-016-0773-3 
