ePoster
Presentation Description
Syed Latifi
Mark Healy1
1 Weill Cornell Medicine-Qatar
Background and Aim
The advent of Large Language Models (LLMs) has revolutionized the writing process. One intriguing application of LLMs is the generation of vignette-style multiple-choice questions (MCQs). Vignette-style MCQs present a brief scenario, or vignette, followed by a set of answer options, yielding items that are both contextually rich and diverse in content.
The aim of this e-poster is to demonstrate the potential of LLMs as a resource-saving educational tool and to compare the abilities of three popular LLMs in generating acceptable vignette-style questions.
Methodology
Three popular LLMs will be used to generate the questions. Prompts will be developed that give each question context and relevance by mimicking real-world problem-solving scenarios. Initial prompts will then be refined through prompt engineering until the generated MCQs show acceptable validity from the perspective of a subject matter expert (faculty).
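To illustrate the intended generation workflow, the sketch below shows how a prompt template might be assembled and sent to a model. The call_llm helper, the prompt wording, the placeholder model names, and the JSON output schema are illustrative assumptions, not the finalized prompts or models used in the study.

```python
import json

# Hypothetical helper: wraps whichever vendor-specific LLM client is being
# compared. It is a placeholder, not a real library call.
def call_llm(model: str, prompt: str) -> str:
    raise NotImplementedError("plug in the vendor-specific client here")

PROMPT_TEMPLATE = """You are a medical educator writing exam items.
Write one vignette-style multiple-choice question on the topic: {topic}.
The vignette must describe a realistic clinical scenario (patient age,
presentation, relevant findings) before the lead-in question.
Provide exactly 5 options (A-E) with one best answer, and avoid common
item-writing flaws (no 'all of the above', no negatively worded stems,
options of similar length).
Return JSON with keys: vignette, lead_in, options, answer, explanation."""

def generate_mcq(model: str, topic: str) -> dict:
    """Generate one draft vignette-style MCQ for faculty review."""
    prompt = PROMPT_TEMPLATE.format(topic=topic)
    raw = call_llm(model, prompt)
    return json.loads(raw)

# Example usage across the three models being compared (names illustrative):
# for model in ["model-A", "model-B", "model-C"]:
#     item = generate_mcq(model, "community-acquired pneumonia")
```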
Next, each question will be evaluated by a faculty expert using a rubric for item-writing flaws [1], and an Item-Writing Flaw Ratio will be computed [2]. The cognitive level of the items will also be evaluated using Buckwalter's rubric [3].
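A minimal sketch of the planned item-level scoring is shown below, assuming the Item-Writing Flaw Ratio is taken as the number of flaw types flagged for an item divided by the number of flaw types checked; the abbreviated checklist and the ratio definition here are illustrative assumptions, and the study itself follows the rubric in [1] and the ratio described in [2].

```python
# Abbreviated flaw checklist, loosely adapted from common item-writing-flaw
# rubrics; the full study rubric is assumed to contain more criteria.
FLAW_TYPES = [
    "implausible_distractors",
    "all_of_the_above",
    "negatively_worded_stem",
    "longest_option_is_correct",
    "grammatical_cues",
]

def flaw_ratio(flags: dict[str, bool]) -> float:
    """Proportion of checked flaw types present in one item (assumed definition)."""
    checked = [flags[f] for f in FLAW_TYPES]
    return sum(checked) / len(checked)

# Example: a faculty reviewer flags two of the five flaw types for an item.
review = {
    "implausible_distractors": True,
    "all_of_the_above": False,
    "negatively_worded_stem": False,
    "longest_option_is_correct": True,
    "grammatical_cues": False,
}
print(flaw_ratio(review))  # 0.4
```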
Results
The study is currently in the design phase. Data will be collected between September and October 2023, during which faculty will assess item quality, and the results will be analyzed and reported between November and December 2023.
Discussion and Implications for the Future
This study will have significant implications for the design and development of reliable assessments, in which the power of machines (LLMs) can be harnessed to create draft questions for faculty to review and refine.
Take home:
It is hypothesized that questions generated via LLMs will be comparable to human-developed questions in terms of quality and educational effectiveness. This could alleviate the time and resource constraints associated with conventional faculty-led item writing.
References
- Tarrant, M., & Ware, J. (2008). Impact of item-writing flaws in multiple-choice questions on student achievement in high-stakes nursing assessments. Medical Education, 42(2), 198-206.
- Przymuszała, P., Piotrowska, K., Lipski, D., Marciniak, R., & Cerbin-Koczorowska, M. (2020). Guidelines on writing multiple choice questions: A well-received and effective faculty development intervention. SAGE Open, 10(3), 2158244020947432.
- Buckwalter, J. A., Schumacher, R., Albright, J. P., & Cooper, R. R. (1981). Use of an educational taxonomy for evaluation of cognitive performance. Journal of Medical Education, 56(2), 115-121.