Ottawa 2024

Artificial intelligence

Oral Presentation

10:00 am

28 February 2024

M208

Session Program

Kellie Charles1
Slade Matthews1 and Tina Hinton1
1 Sydney Pharmacy School, University of Sydney 



Background:
Health professional programs are adopting research capstone units to cultivate skills for the future research workforce. These units often involve substantial written assessments that summarise extensive literature. With the advent of large language models, academics are grappling with whether to integrate or prohibit AI support in written assessments.


Summary of Work:
Responding to the evolving AI landscape, we adopted a design-thinking approach to reshape the curriculum and assessment framework of a 13-week project course. Formerly, the assessment consisted of group oral presentations and a final literature review. The revamped version integrates AI ethically into the research process. Students now undergo a series of low-stakes assessments, emulating creative research responses to real-world public health challenges. Guided by an authentic health brief (1), students gradually craft a multimedia plan for a 60-minute educational activity and an evaluation strategy tailored to a chosen community. This revised assessment structure empowers students to tackle the pressing issue of vaping in our society using AI for information gathering and processing. Rubrics for group assessments were updated to evaluate knowledge application, including AI's role in producing assessable outputs. 


Results:
The 2023 implementation of the restructured research capstone project is in progress, with an evaluation study combining student-generated assessments, year-end surveys, and qualitative analysis of student reflections slated for December. 


Discussion:
Responsible use of AI in research is a crucial skill for all healthcare students. Educators must explore innovative ways to infuse new technologies into more creative assessments. This capstone project offers a potential model for addressing a wide spectrum of healthcare challenges. It facilitates students' ethical use of AI, while promoting critical thinking and creativity in addressing complex public health issues. The lessons learned from this project can extend to various healthcare contexts, fostering future-ready professionals prepared to embrace technology's potential while maintaining academic integrity. 



References (maximum three) 
none 

Ankita Vayalapalli1
Mesk M Nafea1, Vivien Makhoul1 and Rodger D MacArthur1,2
1 Office of Academic Affairs, Medical College of Georgia at Augusta University, Augusta, Georgia, USA
2 Division of Infectious Diseases, Medical College of Georgia at Augusta University, Augusta, Georgia, USA
 


The popularity of artificial intelligence programs such as ChatGPT v3.5 underscores the need for a better understanding of their uses in medical training. We assessed the accuracy and reliability of ChatGPT on a standardized United States Medical Licensing Examination (USMLE) Step 1 practice exam. While others have assessed the accuracy of ChatGPT, this work is the first of its kind to assess both accuracy and reliability.

The dataset was obtained from the 2013-2014 Free120, a USMLE Step 1 practice exam of 120 multiple-choice questions published by the National Board of Medical Examiners. To ensure reliability, we conducted three runs on the same dataset of 120 questions. In each run, we asked ChatGPT the full set of questions and recorded the responses. For accuracy, we compared answer outputs from ChatGPT to the answer key.
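For illustration only, a minimal sketch of how such a three-run querying and scoring protocol could be scripted. The abstract does not describe the authors' tooling; the use of the OpenAI Python client, the prompt wording, and the hypothetical items.json file of questions, options, and the NBME answer key are all assumptions.

import json
from openai import OpenAI

client = OpenAI()  # assumes an OPENAI_API_KEY environment variable is set

def ask_once(question, options):
    """Pose one multiple-choice item and return the single-letter answer."""
    prompt = question + "\n" + "\n".join(f"{k}. {v}" for k, v in options.items())
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system",
             "content": "Answer with only the single letter of the best option."},
            {"role": "user", "content": prompt},
        ],
    )
    return resp.choices[0].message.content.strip()[0]

# items.json (hypothetical): [{"question": "...", "options": {"A": "..."}, "key": "C"}, ...]
with open("items.json") as f:
    items = json.load(f)

runs = []
for _ in range(3):  # three independent runs over the same set of items
    runs.append([ask_once(it["question"], it["options"]) for it in items])

for i, run in enumerate(runs, 1):  # accuracy per run against the answer key
    acc = sum(ans == it["key"] for ans, it in zip(run, items)) / len(items)
    print(f"Run {i}: accuracy = {acc:.1%}")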

ChatGPT answered with an accuracy of 77.6%. A chi-squared analysis yielded χ²(2, N = 109) = 3.33, p = 0.77. ChatGPT performed most accurately on pathology questions (81%) and least accurately on ethics questions (61%). Across the three trials of the same questions, ChatGPT changed answers 31% of the time.
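A hedged sketch of the two reported analyses, assuming NumPy and SciPy; the counts and answer letters below are placeholders rather than the study's data, and only the structure (a run-by-outcome contingency table with 2 degrees of freedom, plus an item-level answer-change rate) follows the abstract.

import numpy as np
from scipy.stats import chi2_contingency

# Correct / incorrect counts per run (placeholder values, not the study's data)
table = np.array([
    [85, 24],  # run 1
    [83, 26],  # run 2
    [86, 23],  # run 3
])
chi2, p, dof, _ = chi2_contingency(table)
print(f"chi-squared({dof}) = {chi2:.2f}, p = {p:.2f}")

# Answer-change rate: proportion of items whose three answers were not all identical
answers = np.array([
    list("ABCAD"),  # run 1 (placeholder letters for five items)
    list("ABCBD"),  # run 2
    list("ABCAD"),  # run 3
])
changed = np.mean([len(set(col)) > 1 for col in answers.T])
print(f"Answer changed on {changed:.0%} of items")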

With the rapid emergence and utilization of ChatGPT among medical students preparing for assessments, it is important to understand its accuracy and reliability. Our results suggest that ChatGPT is lacking in both. Compared with other study tools, ChatGPT falls well short of near-perfect accuracy. Most alarming is its lack of reliability, as ChatGPT failed to remain consistent between trials.

Despite ChatGPT achieving a “Pass” on USMLE Step 1, medical students must be aware of its deficits in accuracy and reliability. Inconsistency between trials indicates that the technology is not only inaccurate, but also inconsistently inaccurate. Artificial intelligence is undeniably garnering immense traction, and it is of utmost importance to learn how to navigate such technologies by demystifying the major limitations of programs like ChatGPT.



References (maximum three) 

1. Biswas, S. (2023). ChatGPT and the Future of Medical Writing. Radiology. https://pubs.rsna.org/doi/10.1148/radiol.223312

2. Johnson, D., et al. (2023). Assessing the Accuracy and Reliability of AI-Generated Medical Responses: An Evaluation of the Chat-GPT Model.

3. Lund, B. D., & Wang, T. (2023). Chatting about ChatGPT: How may AI and GPT impact academia and libraries? Library Hi Tech News, 40(3), 26–29.

Maxim Morin1
Kim Ashwin2, Paul Glover3, Jon Dupre3 and Jen Desrosiers2
1 Medical Council of Canada
2 Australian Medical Council
3 risr/




Background:
Artificial intelligence (AI) and machine learning (ML) have emerged as transformative technologies in various fields, including high-stakes certification exams (Nie et al., 2023). However, research and practice in this field must actively explore and evaluate the role of these technologies to ensure fair, valid, and efficient examination processes. 


Summary of work:
This oral presentation explores the current and potential uses of AI and ML in areas such as preparatory materials, exam content development, exam delivery, exam scoring, and exam analytics. It will examine the potential benefits, challenges, and implications of leveraging these technologies in high-stakes certification exams (Alrassi et al., 2021).


Importance for research and practice:
This presentation aims to inform the design and implementation of AI and ML technologies in high-stakes certification exams. Its insights will also guide future research, policy development, and practice improvement in this evolving field.


Take-home messages, outcomes, and implications for further research and practice: 

  • Enhanced understanding of the role of AI and ML in high-stakes certification exams. 

  • Identification of the use cases for AI and ML technologies and their potential benefits. 

  • Exploration of best practices and guidelines for incorporating AI and ML into exam design, development, and evaluation processes. 

  • Identification of research gaps and the need for further studies to evaluate the validity, reliability, and fairness of AI and ML-powered certification exams. 


References (maximum three)

Alrassi, J., Katsufrakis, P. J., & Chandran, L. (2021). Technology Can Augment, but Not Replace, Critical Human Skills Needed for Patient Care. Academic Medicine, 96(1). https://journals.lww.com/academicmedicine/Fulltext/2021/01000/Technology_Can_Augment,_but_Not_Replace,_Critical.33.aspx

Masters, K. (2023). Ethical use of Artificial Intelligence in Health Professions Education: AMEE Guide No. 158. Medical Teacher, 45(6), 574–584. https://doi.org/10.1080/0142159X.2023.2186203 

Nie, R., Guo, Q., & Morin, M. (2023). Machine Learning Literacy for Measurement Professionals: A Practical Tutorial. Educational Measurement: Issues and Practice, 42(1), 9–23. https://doi.org/10.1111/emip.12539

Marcos Rojas1
Sharon F. Chen1, Kathleen Gutierrez1, Argenta Price1 and Shima Salehi1
1 Stanford University



Background:
Clinical reasoning (CR) is pivotal in healthcare education, yet its reflection component is often underrepresented in assessment tools (1). Inspired by STEM's problem-solving focus (2), our study presents a unique AI-driven tool that delves into the CR process, capturing often-overlooked reflection practices and pinpointing areas for enhancement. By harnessing AI methodologies, we not only make it possible to evaluate reflection but also increase the scalability of this tool, positioning it to strengthen clinical reasoning education across all stages of medical training.


Summary of work:
Previously, we designed an online assessment for physicians and students to capture their CR execution steps and the reflective practices behind them. After initial data evaluation, the assessment was modified to better capture reflection and tested with participants ranging from first-year medical students to expert physicians. Their responses and feedback shaped the revised assessment and a reflection-focused scoring codebook. Concurrently, an AI model is under development for future autonomous grading.


Results:
As of August 2023, our pilot assessment involved two medical students, a resident, and a physician, with an emphasis on CR reflection. Our plan is to extend this pilot to six more participants from diverse medical career stages by September 2023. Post-pilot, we will examine the AI's grading capability against human coders. 
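The abstract does not state how the AI's grading will be compared with the human coders; as a purely illustrative assumption, one common choice is an agreement statistic such as Cohen's kappa, sketched below with invented codes and labels.

from sklearn.metrics import cohen_kappa_score

# Hypothetical reflection codes assigned to the same five responses
human_codes = ["reflects", "no_reflection", "reflects", "reflects", "no_reflection"]
model_codes = ["reflects", "no_reflection", "no_reflection", "reflects", "no_reflection"]

kappa = cohen_kappa_score(human_codes, model_codes)
print(f"Cohen's kappa (human coder vs. AI model): {kappa:.2f}")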


Discussion:
Melding AI capabilities with CR assessments offers profound insights into medical decision-making. The assessment, focusing on both execution and underlying reflective thought, pinpoints clinical reasoning gaps, paving the way for tailored educational interventions.


Conclusions:
Integrating AI into CR assessment yields profound insights into medical cognitive processes. This innovative tool enhances evaluation depth and promotes continuous learning in medical education. 


Take-home message:
Utilizing AI in clinical reasoning assessment presents a pivotal step in refining and advancing medical education assessment methods. 



References (maximum three) 

1. Daniel M, Rencic J, Durning SJ, Holmboe E, Santen SA, Lang V, et al. Clinical Reasoning Assessment Methods: A Scoping Review and Practical Guidance. Acad Med. 2019 Jun;94(6):902–12. 

2. Price A, Salehi S, Burkholder E, Kim C, Isava V, Flynn M, et al. An accurate and practical method for assessing science and engineering problem-solving expertise. Int J Sci Educ. 2022 Sep 2;44(13):2061–84.