Real-Time 3D Reconstruction of Human Vocal Folds via High-Speed Laser-Endoscopy



Conventional video endoscopy and high-speed video endoscopy of the human larynx provide practitioners only with information about the two-dimensional lateral and longitudinal deformation of the vocal folds. However, experiments have shown that the vibration of human vocal folds has a significant vertical component. Based on an endoscopic laser projection unit (LPU) connected to a high-speed camera, we propose a fully automatic and real-time capable approach for the robust 3D reconstruction of human vocal folds. We achieve this by estimating laser ray correspondences while taking the epipolar constraints of the LPU into account. Unlike previous approaches, which reconstruct only the superior area of the vocal folds, our pipeline is based on a parametric reinterpretation of the M5 vocal fold model as a tensor product surface. This allows us not only to generate visually authentic deformations of a dense vibrating vocal fold model, but also to easily obtain metric measurements of points of interest on the reconstructed surfaces. Furthermore, we drastically lower the effort needed for visualizing and measuring the dynamics of the human laryngeal area during phonation. Additionally, we publish the first publicly available labeled in-vivo dataset of laser-based high-speed laryngoscopy videos.
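
As a rough illustration of why laser ray correspondences matter, the sketch below shows one common way a 3D point could be recovered once a detected laser dot has been matched to its LPU ray: midpoint triangulation between the camera viewing ray and the laser ray. This is a generic textbook technique shown under simplifying assumptions (ideal calibration, both rays expressed in one coordinate frame), not necessarily the method used in our pipeline; the function name `triangulate_midpoint` and all values are hypothetical.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def triangulate_midpoint(cam_origin: Vec3, cam_dir: Vec3,
                         laser_origin: Vec3, laser_dir: Vec3) -> Tuple[Vec3, float]:
    """Return the midpoint of the shortest segment between two rays and its length.

    The camera ray is cam_origin + s * cam_dir, the laser ray is
    laser_origin + t * laser_dir. Directions need not be normalized.
    """
    w = tuple(c - l for c, l in zip(cam_origin, laser_origin))
    a = _dot(cam_dir, cam_dir)
    b = _dot(cam_dir, laser_dir)
    c = _dot(laser_dir, laser_dir)
    d = _dot(cam_dir, w)
    e = _dot(laser_dir, w)

    denom = a * c - b * b
    if abs(denom) < 1e-12:
        # Parallel rays carry no depth information.
        raise ValueError("rays are (nearly) parallel")

    # Parameters of the closest points on each ray.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom

    p_cam = tuple(o + s * v for o, v in zip(cam_origin, cam_dir))
    p_laser = tuple(o + t * v for o, v in zip(laser_origin, laser_dir))

    midpoint = tuple((p + q) / 2.0 for p, q in zip(p_cam, p_laser))
    gap = sum((p - q) ** 2 for p, q in zip(p_cam, p_laser)) ** 0.5
    return midpoint, gap


# Hypothetical setup: camera at the origin looking along +z,
# laser emitter offset by one unit along x and tilted toward the optical axis.
point, gap = triangulate_midpoint((0.0, 0.0, 0.0), (0.0, 0.0, 1.0),
                                  (1.0, 0.0, 0.0), (-1.0, 0.0, 1.0))
```

The returned gap between the two rays can serve as a simple plausibility check: a large gap suggests that the laser dot was assigned to the wrong LPU ray.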


Code is available on GitHub.