A MATLAB software for robust vocal tract parameter extraction

This software contains MATLAB code for robustly extracting vocal tract parameters from mid-sagittal images of the upper airway.
It was developed for automatic and systematic analysis of vocal tract images recorded using real-time magnetic resonance imaging (rtMRI).
The software performs the following steps:
(1) Image enhancement: suppression of grainy noise, retrospective field sensitivity correction, and contrast enhancement between tissue and airway
(2) Semi-automatic grid line construction based on a manually drawn line as the center of the grid lines
(3) Automatic tracking of the front-most edge of the lips and the top of the larynx (arytenoid muscles)
(4) Automatic estimation of an oral pharyngeal airway path within the vocal tract walls
(5) Automatic segmentation of the airway-tissue boundaries in the vocal tract
(6) Computing the distance function between the upper and lower boundaries
(7) Computing the vector for the upper airway shape
(8) Computing the vocal tract length for each MR image
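
As a rough illustration of step (8), the vocal tract length can be approximated as the arc length of the estimated airway path from the lips to the top of the larynx. The sketch below is not taken from the package: the variable names (`pathXY`, `pixelSizeMm`), the sample coordinates, and the pixel size are all hypothetical, and the package may define the length differently (see [1]).

```matlab
% Hypothetical airway path as an N-by-2 list of (x, y) pixel coordinates,
% ordered from the lips to the top of the larynx. Values are made up.
pathXY = [10 40; 13 44; 16 48; 16 60; 21 72];

% Assumed in-plane pixel size in millimeters (not from the package).
pixelSizeMm = 2.0;

% Euclidean length of each segment between consecutive path points.
segLenPx = sqrt(sum(diff(pathXY, 1, 1).^2, 2));

% Vocal tract length: sum of the segment lengths, converted to millimeters.
vtLengthPx = sum(segLenPx);
vtLengthMm = vtLengthPx * pixelSizeMm;

fprintf('Vocal tract length: %.1f px (%.1f mm)\n', vtLengthPx, vtLengthMm);
```

Applying the same computation to each MR image gives one length value per frame, which can then be tracked over time.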

See our paper [1] for technical details.
Contact Jangwon Kim with any questions or suggestions.
Email: jangwon@usc.edu


Version 4.0 (TAR) (ZIP) (updated on Sep 2 2015)
This MATLAB software package includes:
(1) The aforementioned MATLAB functions
(2) A subset of the USC-EMO-MRI corpus [2] for demo (in the directory 'rtMRIdata')
(3) A wrapper for a batch demo of the entire process
(4) MATLAB code for manual head correction
Development of the software is ongoing.
We are currently working on (i) adaptive grid line construction to handle head movement in the images and (ii) automatic parameterization of vocal-tract morphology.

Demo video: Extracted vocal tract parameters in rtMRI video

A male subject (a native speaker of American English) read the "Grandfather" passage in the MRI scanner.
The passage: You wished to know all about my grandfather. Well, he is nearly ninety-three years old; he dresses himself in an ancient black frock coat usually minus several buttons; yet he still thinks as swiftly as ever. A long, flowing beard clings to his chin, giving those who observe him a pronounced feeling of the utmost respect. When he speaks, his voice is just a bit cracked and quivers a trifle. Twice each day he plays skillfully and with zest upon our small organ. Except in the winter when the ooze or snow or ice prevents, he slowly takes a short walk in the open air each day. We have often urged him to walk more and smoke less, but he always answers, "Banana oil!" Grandfather likes to be modern in his language.

Video frame rate: 23.130 frames/second.
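
When aligning extracted parameters with audio or annotations, the stated frame rate can be used to convert between frame indices and time. The following is a minimal sketch: the frame indices are hypothetical, and taking frame 1 as time zero is an assumption rather than a convention documented by the package.

```matlab
% Frame rate stated for the demo video (frames per second).
frameRate = 23.130;

% Hypothetical 1-based frame indices of interest.
frameIdx = [1 2 24 232];

% Start time of each frame in seconds, taking frame 1 as time zero.
frameTimeSec = (frameIdx - 1) / frameRate;

% Inverse mapping: the frame displayed at a given time (here 1.0 s).
frameAtOneSec = floor(1.0 * frameRate) + 1;

fprintf('Frame %d starts at %.3f s\n', [frameIdx; frameTimeSec]);
```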

Original MR images

Enhanced images

Lip and larynx tracking

- Yellow line: the top of the larynx (arytenoid muscles)
- Red line: the front-most edge of the lips

Oral pharyngeal airway path estimation

Estimation of tissue-airway boundaries

- Green line: smoothed upper boundary
- Red line: smoothed lower boundary

Estimation of distance function between the vocal tract walls

Top panel: Estimated tissue-airway boundaries
- Green line: cleaned upper boundary
- Red line: cleaned lower boundary
Bottom panel: Estimated distance between the upper and lower boundaries along the upper airway
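
The distance plotted in the bottom panel can be pictured as one value per grid line: the separation between the upper and lower boundary points found on that line. The sketch below illustrates the idea only and is not the package's implementation. The names `upperXY` and `lowerXY` and the coordinates are hypothetical, and it assumes each grid line yields exactly one point on each boundary.

```matlab
% Hypothetical boundary points, one row per grid line, as (x, y) pixels.
% Row k of each matrix lies on the same grid line. Values are made up.
upperXY = [10 20; 14 18; 18 17; 22 18];
lowerXY = [10 25; 14 18; 21 21; 22 30];

% Distance function: Euclidean distance between paired boundary points.
distPx = sqrt(sum((upperXY - lowerXY).^2, 2));

% Grid line with the narrowest opening (0 means the walls are in contact).
[minDist, minLine] = min(distPx);

fprintf('Distances (px): %s\n', mat2str(distPx.'));
fprintf('Narrowest opening: %.1f px at grid line %d\n', minDist, minLine);
```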


[1] Jangwon Kim, Naveen Kumar, Sungbok Lee and Shrikanth Narayanan, "Enhanced airway-tissue boundary segmentation for real-time magnetic resonance imaging data," in Proceedings of the 10th International Seminar on Speech Production (ISSP), Cologne, Germany, 2014.
This paper won the Northern Digital Inc. Excellence Award at ISSP 2014.
[2] Jangwon Kim, Asterios Toutios, Yoon-Chul Kim, Yinghua Zhu, Sungbok Lee and Shrikanth Narayanan, "USC-EMO-MRI corpus: An emotional speech production database recorded by real-time magnetic resonance imaging," in Proceedings of the 10th International Seminar on Speech Production (ISSP), Cologne, Germany, 2014.

Updated by Jangwon Kim on May 15 2014