Adam Lammert, Louis Goldstein, Shrikanth S. Narayanan, and Khalil Iskarous. Statistical Methods for Estimation of Direct and Differential Kinematics of the Vocal Tract. Speech Communication, 55(1):147–161, 2013.

Abstract

We present and evaluate two statistical methods for estimating kinematic relationships of the speech production system: artificial neural networks and locally-weighted regression. The work is motivated by the need to characterize this motor system, with particular focus on estimating differential aspects of kinematics. Kinematic analysis will facilitate progress in a variety of areas, including the nature of speech production goals, articulatory redundancy and, relatedly, acoustic-to-articulatory inversion. Statistical methods must be used to estimate these relationships from data since they are infeasible to express in closed form. Statistical models are optimized and evaluated – using a heldout data validation procedure – on two sets of synthetic speech data. The theoretical and practical advantages of both methods are also discussed. It is shown that both direct and differential kinematics can be estimated with high accuracy, even for complex, nonlinear relationships. Locally-weighted regression displays the best overall performance, which may be due to practical advantages in its training procedure. Moreover, accurate estimation can be achieved using only a modest amount of training data, as judged by convergence of performance. The algorithms are also applied to real-time MRI data, and the results are generally consistent with those obtained from synthetic data.
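
As a rough illustration of how locally-weighted regression can yield both a direct (forward) estimate and a differential (Jacobian) estimate from data, the sketch below fits a weighted affine model around a query point. It is not the authors' implementation: the Gaussian kernel, the `bandwidth` value, the function name `lwr_predict`, and the toy two-link planar arm standing in for a kinematic system are all assumptions made for this example.

```python
import numpy as np


def lwr_predict(X, Y, q, bandwidth=0.5):
    """Locally-weighted linear regression at query point q (illustrative sketch).

    X: (n, d) inputs, Y: (n, m) outputs, q: (d,) query.
    Returns (y_hat, J): the predicted output at q, shape (m,), and the
    local Jacobian estimate dY/dX at q, shape (m, d).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    q = np.asarray(q, dtype=float)

    # Center inputs on the query so the intercept is the prediction at q.
    diff = X - q
    # Gaussian kernel weights (assumed kernel; other kernels are possible).
    w = np.exp(-np.sum(diff**2, axis=1) / (2.0 * bandwidth**2))

    # Weighted least squares for the affine model Y ~ [1, X - q] @ B.
    A = np.hstack([np.ones((X.shape[0], 1)), diff])
    sw = np.sqrt(w)[:, None]
    B, *_ = np.linalg.lstsq(sw * A, sw * Y, rcond=None)

    y_hat = B[0]   # intercept row: direct kinematics estimate at q
    J = B[1:].T    # slope rows: differential kinematics (Jacobian) at q
    return y_hat, J


def arm(theta):
    """Toy two-link planar arm with unit link lengths (hypothetical stand-in)."""
    t1, t2 = theta[..., 0], theta[..., 1]
    return np.stack(
        [np.cos(t1) + np.cos(t1 + t2), np.sin(t1) + np.sin(t1 + t2)], axis=-1
    )


# Synthetic training data sampled from the toy system.
rng = np.random.default_rng(0)
theta_train = rng.uniform(0.0, np.pi / 2, size=(2000, 2))
pos_train = arm(theta_train)

q_demo = np.array([0.6, 0.9])
y_demo, J_demo = lwr_predict(theta_train, pos_train, q_demo, bandwidth=0.1)
print("estimated position:", y_demo, "true position:", arm(q_demo))
print("estimated Jacobian:\n", J_demo)
```

Because the model is refit around each query, no global training pass is needed; the main tuning parameter in this sketch is the kernel bandwidth, which trades bias against variance.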

BibTeX Entry

@article{Lammert2013StatisticalMethodsforEstimation,
 abstract = {We present and evaluate two statistical methods for estimating kinematic relationships of the speech production system: artificial neural
networks and locally-weighted regression. The work is motivated by the need to characterize this motor system, with particular focus
on estimating differential aspects of kinematics. Kinematic analysis will facilitate progress in a variety of areas, including the nature of
speech production goals, articulatory redundancy and, relatedly, acoustic-to-articulatory inversion. Statistical methods must be used to
estimate these relationships from data since they are infeasible to express in closed form. Statistical models are optimized and evaluated –
using a heldout data validation procedure – on two sets of synthetic speech data. The theoretical and practical advantages of both methods
are also discussed. It is shown that both direct and differential kinematics can be estimated with high accuracy, even for complex, nonlinear relationships. Locally-weighted regression displays the best overall performance, which may be due to practical advantages in
its training procedure. Moreover, accurate estimation can be achieved using only a modest amount of training data, as judged by convergence
of performance. The algorithms are also applied to real-time MRI data, and the results are generally consistent with those
obtained from synthetic data.},
 author = {Lammert, Adam and Goldstein, Louis and Narayanan, Shrikanth S. and Iskarous, Khalil},
 bib2html_rescat = {},
 doi = {10.1016/j.specom.2012.08.001},
 journal = {Speech Communication},
 link = {http://sail.usc.edu/publications/files/Lammert-SPECOM-InvKin2013.pdf},
 number = {1},
 pages = {147--161},
 title = {Statistical Methods for Estimation of Direct and Differential Kinematics of the Vocal Tract},
 volume = {55},
 year = {2013}
}
