Ottawa 2024

MCQs

Oral Presentation

10:00 am

28 February 2024

M214

Session Program

Daniel Nguyen1
Lambert Schuwirth
1 Flinders University



Multiple-choice questions (MCQs) are effective assessment tools, but poorly constructed items can contain errors (so-called item-writing flaws, IWFs) and negatively affect pass-fail outcomes in high-stakes assessments. Item-writing guidelines therefore exist to mitigate these errors, yet the empirical evidence supporting them is limited; only a few studies explore whether and how IWFs produce an effect. Our study aims to investigate the effect of IWFs on student scores. Two groups of effects are studied: false-positive effects, where students guess a question correctly despite not having the knowledge, and the opposite, false-negative effects. Two sub-studies were conducted. The first contained easily answerable questions with IWFs likely to lead to false-negative responses; the same questions without the IWF were used as controls. The second sub-study explored false-positive effects using nonsense items, not answerable with any knowledge, that contained obvious IWF cues pointing towards the correct option. For the first sub-study, the difference in correct answers between the IWF version of each item and the correctly formulated item was used as the outcome measure. For the second, the outcome was whether the p_correct value (the proportion of students answering the item correctly) was higher than chance. Overall conclusions were drawn at the level of the whole test, across all items in each sub-test. Preliminary data showed 17/21 items with a false-positive effect (binomial p = 0.0044) and 11/15 items with an indication of a false-negative effect (binomial p = 0.06). We conclude that IWFs affect student scores in the false-positive direction. In the false-negative direction the effect does not reach the 5% threshold, but the numbers in the preliminary data are still too small to claim the absence of an effect more definitively. Our power analysis indicated that n = 60 is needed. Even if only a false-positive effect exists, it would be meaningful enough to warrant careful item-review procedures.
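For readers who want to see the mechanics of such a test-level check, a minimal sketch in Python follows. It assumes a null probability of 0.5 (an item is as likely as not to show an effect) and a one-sided alternative; the abstract does not state the authors' exact test setup, so the reported p-values may rest on different choices.

# Test-level binomial check: how many items in each sub-study show the hypothesized effect?
from scipy.stats import binomtest

# False-positive sub-study: 17 of 21 nonsense items answered above chance.
false_positive = binomtest(k=17, n=21, p=0.5, alternative="greater")

# False-negative sub-study: 11 of 15 flawed items scored worse than their unflawed controls.
false_negative = binomtest(k=11, n=15, p=0.5, alternative="greater")

print(f"false-positive effect: p = {false_positive.pvalue:.4f}")
print(f"false-negative effect: p = {false_negative.pvalue:.4f}")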



References (maximum three) 

Downing, S. M. (2005). The effects of violating standard item writing principles on tests and students: the consequences of using flawed test items on achievement examinations in medical education. Advances in Health Sciences Education, 10(2), 133-143. 

Schuwirth, L., & Pearce, J. (2014). Determining the Quality of Assessment Items in Collaborations: Aspects to Discuss to Reach Agreement Developed by the Australian Medical Assessment Collaboration. https://research.acer.edu.au/higher_education/42 

Slavko Rogan1
Eefje Luijckx1, Annika Stampfler1, Caroline Aubry1, Antonia Hauswirth1, Evert Zinzen2, Prasath Jayakaran3 and Angela Blasimann1
1 Bern University of Applied Sciences, Department of Health Professions, Division of Physiotherapy
2 Vrije Universiteit Brussel, Faculty of Physical Education and Physiotherapy
3 University of Otago, School of Physiotherapy




Background: 
In higher education, assessments play an important role in evaluating learning outcomes. Multiple-choice questions (MCQs) are a favored tool for evaluating factual knowledge. 


Summary of work: 
The focus of this systematic review was to provide an overview of current practice and recommendations on designing MCQs in health professional education. 


Methods: 
The PICo tool (P: Population, Problems; I: Intervention or Phenomena of Interest; Co: Context) was used to formulate a research question regarding criteria for MCQs. Potential articles were identified through a Boolean search of the PubMed database with "multiple choice question", "item analysis OR number of items", AND "students" as search terms. In addition, hand searches were completed on the reference lists of the included studies. Studies with qualitative, quantitative, and mixed-method designs were included. 


Results: 
Twenty-four articles were included, from which eight main categories were identified. MCQs should 1) contain clinical vignettes (n=2), 2) avoid sources of error, so-called "cues" (n=1), 3) be fair (e.g., avoid complex language or sentence structure; the question should match the case vignette) (n=3), 4) prefer a 3-option item (one option being the key plus two distractors) (n=6), 5) comprise 30 questions or more (n=3), 6) be item-analyzed to improve validity and reliability (n=5), 7) be made available to students to improve learning outcomes (n=2), and 8) use number-right scoring (n=5). 


Conclusion: 
In the education of medical doctors and other health professionals, MCQs should contain clinical vignettes, omit cues, be fair regarding language, use 3-option items, comprise 30 questions or more, be item-analyzed, and be made available to students. Number-right scoring should be used for the test score: correct answers receive a positive score, and the sum of the points for correct answers gives the test score. 
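A minimal sketch of number-right scoring for a short MCQ paper follows; the function and the example responses are hypothetical, introduced only to illustrate the scoring rule described above.

# Number-right scoring: each correct answer earns one point; incorrect or
# omitted answers earn nothing, and the sum of points is the test score.
def number_right_score(responses, answer_key):
    return sum(1 for given, correct in zip(responses, answer_key) if given == correct)

# Hypothetical 5-item example: three answers match the key, so the score is 3.
print(number_right_score(["A", "C", "B", "D", "A"], ["A", "C", "C", "A", "A"]))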


References (maximum three) 

  1. Al-Wardy, N. M. (2010). Assessment methods in undergraduate medical education. Sultan Qaboos University Medical Journal, 10(2), 203. 

  2. Palmer, E., & Devitt, P. (2006). Constructing multiple choice questions as a method for learning. Annals-academy of medicine Singapore, 35(9), 604. 

  3. Stern, T. (2014) What is good action research? Reflections about quality criteria. In Action Research, Innovation and Change: International perspectives across disciplines (pp. 202-220). Routledge. 

Quang Ngo1
Keyna Bracken2, Helen Neighbour1, Mike Lee-Poy1, Rebecca Long1, Jeffrey McCarthy1, Jeremy Sandor1 and Matthew Sibbald1
1 DeGroote School of Medicine, McMaster University
2 Michael G. DeGroote School of Medicine, McMaster University, AMEE member




Multiple choice exams (MCEs) in medical education are an efficient means to assess knowledge and clinical reasoning while maintaining standardization. 

There is debate in the literature as to whether MCEs should penalize wrong answers. Proponents argue that a penalty rewards accuracy while maintaining test validity. Opponents state that it unfairly penalizes risk takers who are making educated, rather than uninformed, guesses. Evidence is emerging that this may contribute to gender inequity in MCEs. 

The undergraduate medical education program at McMaster University uses a longitudinal progress test, the Personal Progress Inventory (PPI), to monitor knowledge acquisition; it is taken eight times over the course of the program. Traditionally, there has been a 0.25-mark penalty for incorrect answers. 
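To make the scoring rule concrete, a minimal sketch of such a formula score with and without the guessing penalty is shown below; the function and the answer counts are hypothetical, not the PPI's actual scoring code.

# Formula scoring: each incorrect answer subtracts a fraction of a mark;
# omitted items neither add nor subtract anything.
def formula_score(n_correct, n_incorrect, penalty=0.25):
    return n_correct - penalty * n_incorrect

# Hypothetical student: 120 correct, 40 incorrect.
print(formula_score(120, 40))             # 110.0 with the traditional 0.25-mark penalty
print(formula_score(120, 40, penalty=0))  # 120 once the penalty is removed (number-right)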

The penalty for guessing was removed in January 2023 due to concerns that the penalty unfairly targeted risk-averse groups and increased test anxiety without improving validity. This natural experiment allowed us to compare the effects of the penalty on exam outcomes. 

Means of the class scores before and after the removal of the guessing penalty were compared using ANOVA. Data were available for two sittings of the PPI (February and May 2023) for the first- and second-year cohorts. Compared to matched historical cohorts, the means were higher after elimination of the penalty for both sittings of the PPI and for both cohorts (p < 0.05). No interaction was found between penalty and year. 
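A minimal sketch of how such a two-way comparison might be set up follows; the file name and column names are hypothetical, and this illustrates the penalty-by-cohort ANOVA described above rather than reproducing the study's actual analysis code.

# Two-way ANOVA: PPI score by penalty condition and cohort year, with interaction.
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

df = pd.read_csv("ppi_scores.csv")  # hypothetical long-format export: score, penalty, year

model = ols("score ~ C(penalty) * C(year)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))  # main effects and the penalty-by-year interaction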

The reasons for the change in scores are likely multifactorial. Anxiety is known to hinder performance, and reducing anxiety could contribute to the improvement. With respect to test validity, we continue to see scores increase over time regardless of the penalty, suggesting the expertise gradient is preserved. Cut scores for this exam are norm-referenced, and the number of below-threshold students has not changed. Future work should identify risk-averse groups (e.g., by gender or specialty choice). 



References (maximum three) 

Lucy R. Betts, Tracey J. Elder, James Hartley & Mark Trueman (2009) Does correction for guessing reduce students' performance on multiple-choice examinations? Yes? No? Sometimes?, Assessment & Evaluation in Higher Education, 34:1, 1-15, DOI: 10.1080/02602930701773091 

Coffman KB, Klinowski D. The impact of penalties for wrong answers on the gender gap in test scores. Proceedings of the National Academy of Sciences. 2020;117(16). pmid:32253310