2024 Combined Otolaryngology Spring Meetings

Poster I045

Status: File under review
Section: TRIO
Track: General
Presenter: Young Lee MS

Comparison of AI Models for Conducting Systematic Literature Reviews in Otolaryngology-Head and Neck Surgery

Author(s)
Young Lee
Ajibola B. Bakare
Jhuree Hong
Claus-Peter Richter
Jonathan Kuriakose

Affiliation(s)
Central Michigan University College of Medicine; Tulane School of Medicine; Freelance; Northwestern Medicine Department of Otolaryngology

Abstract:

Educational Objective: At the conclusion of this presentation, the participants should be able to understand the potential use of LLMs in systematic reviews of topics within OHNS.
Objectives: Large language models (LLMs), such as ChatGPT and Bard, are generative deep learning algorithms that can process large datasets. LLMs have several potential uses within medical research; however, few studies have examined their potential for carrying out systematic reviews in otolaryngology-head and neck surgery (OHNS). This study aims to compare the efficacy of ChatGPT v3.5 and Bard in conducting systematic literature reviews within OHNS.
Study Design: Literature review comparative analysis.
Methods: The methods of three systematic reviews that followed PRISMA guidelines were replicated using ChatGPT v3.5 and Bard. The generated outputs were compiled by author, paper title, publication year, and journal, and compared with the reference articles cited in each systematic review. Each output was cross-referenced with medical databases to determine the authenticity of the journals named in the outputs.
Results: Several themes emerged comparing Bard and ChatGPT across the three reference systematic reviews. In replicating Wong et al.’s review, Bard generated more outputs than ChatGPT. Furthermore, Bard demonstrated a broader date range than ChatGPT in replicating Jabbour et al.’s review. Finally, in Wu et al.’s review, ChatGPT#2 identified more genuine outputs than Bard#2.
Conclusions: LLMs did not accurately replicate the methodology of a peer-reviewed manuscript and should be used with caution. The outputs contained several inaccuracies, ranging from fictitious citations to citations that were only partially correct. Neither Bard nor ChatGPT identified authentic papers accurately enough to be suitable for systematic reviews. PRISMA and other literature review guidelines remain the gold standard.

