2026 Combined Otolaryngology Spring Meetings

Poster ALA073

Status: File under review
Section: ALA
Track: Clinical
Presenter: Aki Koivu, PhD

Real-time Classification of Functional Laryngeal Behaviors from Vocal Fold Kinematics

Author(s)
Aki Koivu, PhD
Pin-Yu Lin, MSc
Kristina Simonyan, MD, PhD, DrMed
Matthew R. Naunheim, MD, MBA

Affiliation(s)
Massachusetts Eye and Ear & Harvard Medical School

Abstract:
Objectives: Develop a real-time prediction model to automatically detect laryngeal behaviors from videolaryngoscopy-based pose tracking data during clinical examinations. This will improve data collection quality by providing immediate feedback to clinicians and enable future automation pipelines for efficient large-scale analysis.
Study Design: Proof-of-concept study of real-time identification of patient behavior states during in-office laryngoscopy using motion tracking.
Methods: We trained a real-time, stateful residual Gated Recurrent Unit (GRU) model on the tracking data of 39 laryngeal keypoints, derived from our previously published keypoint detection model, to predict the patient’s laryngeal task state. The states were ‘phonation’, ‘sustained phonation’, ‘swallowing’, ‘idle’, ‘coughing’, ‘sniffing’, and ‘out of view’. A total of 916 segments from 72 laryngoscopy videos were annotated for model training. The resulting model was then evaluated on an independent dataset of 123 segments from 10 videos. Performance was assessed by comparing manual annotations with model predictions using classification metrics and mean temporal intersection over union (mIoU).
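The "stateful" part of the method can be illustrated with a toy sketch: the hidden state is carried from one video frame to the next, so each prediction needs only the newest frame. The abstract does not give the architecture details, so everything below is an assumption made for illustration: the hidden size, the single GRU cell without bias terms, the placement of the residual (skip) connection, and the random, untrained weights. Only the 39 keypoints and the seven state names come from the abstract.

```python
import math
import random

N_FEATURES = 39 * 2   # 39 keypoints from the abstract, one (x, y) pair each
STATES = ["phonation", "sustained phonation", "swallowing", "idle",
          "coughing", "sniffing", "out of view"]
HIDDEN = 16           # hypothetical hidden size, not from the abstract


def _matrix(rows, cols, rng):
    # Small random weights; a real model would load trained weights instead.
    scale = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def _matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class StreamingResidualGRU:
    """Untrained toy model: one GRU cell with a skip connection, fed frame by frame."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.w_in = _matrix(HIDDEN, N_FEATURES, rng)                # input projection
        self.w = {g: _matrix(HIDDEN, HIDDEN, rng) for g in "zrn"}   # input-to-gate
        self.u = {g: _matrix(HIDDEN, HIDDEN, rng) for g in "zrn"}   # hidden-to-gate
        self.w_out = _matrix(len(STATES), HIDDEN, rng)              # classifier head
        self.h = [0.0] * HIDDEN                                     # persistent state

    def reset(self):
        """Clear the hidden state, e.g. at the start of a new examination."""
        self.h = [0.0] * HIDDEN

    def step(self, frame):
        """Consume one frame of flattened keypoint coordinates, return a state label."""
        x = _matvec(self.w_in, frame)
        wz, wr, wn = (_matvec(self.w[g], x) for g in "zrn")
        uz, ur, un = (_matvec(self.u[g], self.h) for g in "zrn")
        z = [_sigmoid(a + b) for a, b in zip(wz, uz)]                # update gate
        r = [_sigmoid(a + b) for a, b in zip(wr, ur)]                # reset gate
        n = [math.tanh(a + ri * b) for a, ri, b in zip(wn, r, un)]   # candidate state
        self.h = [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, self.h)]
        features = [hi + xi for hi, xi in zip(self.h, x)]            # residual connection
        logits = _matvec(self.w_out, features)
        best = max(range(len(STATES)), key=logits.__getitem__)
        return STATES[best]


model = StreamingResidualGRU()
frame = [0.5] * N_FEATURES        # placeholder for normalized keypoint coordinates
label = model.step(frame)         # one of STATES; meaningless until the model is trained
```

Because the state lives inside the object, a caller would invoke `step` once per incoming frame and `reset` between recordings.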
Results: The model achieved a mean accuracy of 92% and an mIoU of 0.82 when evaluated on the independent test dataset, indicating both agreement with manual annotations and accurate temporal identification of laryngeal task states. At the class level, the model performed consistently across most state categories, with per-class F1-scores on the test set ranging from 83% for phonation to 97% for sniffing. The lowest validation F1-score was observed for coughing (82% [70%–92%]).
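As a rough illustration of what a temporal mIoU measures, the sketch below scores two per-frame label sequences. The abstract does not state how the metric was computed, so the frame-wise, per-class formulation here is an assumption, and the function name `temporal_miou` and the example sequences are hypothetical.

```python
def temporal_miou(reference, predicted):
    """Mean per-class IoU over two equal-length per-frame label sequences."""
    if len(reference) != len(predicted):
        raise ValueError("sequences must cover the same frames")
    if not reference:
        raise ValueError("sequences must not be empty")
    ious = []
    for label in sorted(set(reference) | set(predicted)):
        ref_frames = {i for i, s in enumerate(reference) if s == label}
        pred_frames = {i for i, s in enumerate(predicted) if s == label}
        # Overlap in time divided by the total time either source assigns to the label.
        ious.append(len(ref_frames & pred_frames) / len(ref_frames | pred_frames))
    return sum(ious) / len(ious)


# Made-up 10-frame example: the prediction enters 'phonation' one frame early.
reference = ["idle"] * 4 + ["phonation"] * 4 + ["swallowing"] * 2
predicted = ["idle"] * 3 + ["phonation"] * 5 + ["swallowing"] * 2
print(round(temporal_miou(reference, predicted), 3))  # 0.85
```

In this example the per-class IoUs are 3/4 for idle, 4/5 for phonation, and 2/2 for swallowing, so a single mistimed frame already lowers the mean to 0.85.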
Conclusions: Our study demonstrates that our state classification model can identify patients’ laryngeal behaviors, such as swallowing and phonation, from the x-y coordinates produced by our existing keypoint tracking model. Future work will focus on refining precision and further characterizing the clinical utility of the proposed model.

