Human vs AI: Evaluating Large Language Models (LLMs) in the Grading of Scientific Inquiry Assessments (83651)
Session Chair: Ivan Cherh Chiet Low
Wednesday, 27 November 2024 12:15
Session: Session 2
Room: Room 603 (6F)
Presentation Type: Oral Presentation
The advent of AI, particularly Large Language Models (LLMs) such as ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. Grading written assessments is traditionally viewed as a laborious and subjective process. This study evaluated the accuracy and reliability of LLMs against human graders in an interdisciplinary course on scientific inquiry. Human graders and three LLMs (GPT-3.5, GPT-4, and Gemini) scored submitted student assignments according to a set of rubrics aligned with several cognitive domains: ‘Understand’, ‘Analyse’, and ‘Evaluate’ from the revised Bloom’s taxonomy, and ‘Scientific inquiry competency’. Our findings revealed that whilst LLMs demonstrated some level of competency, they do not yet meet the assessment standards of human graders. Specifically, inter-rater reliability (percentage agreement and correlation analysis) between human graders was higher than the agreement between two grading rounds for each LLM. Furthermore, concordance and correlation between human and LLM graders were moderate to mostly poor, both in overall scores and across the pre-specified cognitive domains. The results suggest a future where AI could complement human expertise in educational assessment, but underscore the importance of adaptive learning by educators and continuous improvement in current AI technologies to fully realize this potential.
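As an illustration of the two reliability measures named in the abstract, the following is a minimal sketch of how percentage agreement and a correlation coefficient could be computed for two sets of scores. It is not the authors' analysis code: the function names, the choice of Pearson's r as the correlation measure, and the sample scores are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean


def percentage_agreement(scores_a, scores_b):
    """Percentage of assignments on which two raters gave the same score."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("score lists must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(scores_a, scores_b) if a == b)
    return 100.0 * matches / len(scores_a)


def pearson_r(scores_a, scores_b):
    """Pearson correlation between two raters' scores (assumed measure)."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_a, mean_b = mean(scores_a), mean(scores_b)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(scores_a, scores_b))
    var_a = sum((a - mean_a) ** 2 for a in scores_a)
    var_b = sum((b - mean_b) ** 2 for b in scores_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined when a rater's scores are constant")
    return cov / sqrt(var_a * var_b)


# Hypothetical rubric scores for five assignments (not data from the study).
human_1 = [3, 4, 2, 5, 4]
human_2 = [3, 4, 3, 5, 4]

print(percentage_agreement(human_1, human_2))  # 80.0
print(pearson_r(human_1, human_2))
```

The same two functions could be applied to a human grader versus an LLM, or to two grading rounds of the same LLM, which are the comparisons the abstract describes.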
Authors:
Ivan Cherh Chiet Low, National University of Singapore, Singapore
Swapna Haresh Teckwani, National University of Singapore, Singapore
Amanda Huee-Ping Wong, National University of Singapore, Singapore
Nathasha Luke, National University of Singapore, Singapore
About the Presenter(s)
Dr. Ivan Low is a Senior Lecturer and the Director for Continuous Education Training (CET) in the Department of Physiology, Yong Loo Lin School of Medicine, National University of Singapore (NUS).
Connect on LinkedIn
https://www.linkedin.com/in/ivan-low-8a599878