Research Publications

Dissertation

  • Vikram Ramanarayanan, Toward understanding speech planning by observing its execution – representations, modeling and analysis. Ph.D. Thesis, University of Southern California, 2014 (171 pages; ProQuest No. 3643149). [link]

Book Chapters

  • Vikram Ramanarayanan, David Suendermann-Oeft, Patrick Lange, Robert Mundkowsky, Alexei V. Ivanov, Zhou Yu, Yao Qian and Keelan Evanini (2017), Assembling the jigsaw: How multiple open standards are synergistically combined in the HALEF multimodal dialog system, in: Multimodal Interaction with W3C Standards: Towards Natural User Interfaces to Everything, D. A. Dahl, Ed. New York: Springer, 2017. [link]

  • David Suendermann-Oeft, Vikram Ramanarayanan, Moritz Teckenbrock, Felix Neutatz and Dennis Schmidt (2016), HALEF: an open-source standard-compliant telephony-based spoken dialog system – a review and an outlook, in: Natural Language Dialog Systems and Intelligent Assistants, Springer, 2016. [link]

Journals

  • Vikram Ramanarayanan, Maarten Van Segbroeck, and Shrikanth S. Narayanan (2015), Directly data-derived articulatory gesture-like representations retain discriminatory information about phone categories, in: Computer Speech and Language. [link]

  • Ming Li, Jangwon Kim, Adam Lammert, Prasanta Ghosh, Vikram Ramanarayanan and Shrikanth Narayanan (2015), Speaker verification based on the fusion of speech acoustics and inverted articulatory signals, in: Computer Speech and Language. [link]

  • Adam Lammert, Louis Goldstein, Vikram Ramanarayanan, and Shrikanth S. Narayanan (2015), Gestural Control in the English Past-Tense Suffix: An Articulatory Study Using Real-Time MRI, in: Phonetica, 71 (229–248) (DOI:10.1159/000371820). [link] (Editor's Choice Article)

  • Vikram Ramanarayanan, Adam Lammert, Louis Goldstein, and Shrikanth S. Narayanan (2014), Are Articulatory Settings Mechanically Advantageous for Speech Motor Control?, in: PLoS ONE, 9(8): e104168. doi:10.1371/journal.pone.0104168. [link]

  • Shrikanth Narayanan, Asterios Toutios, Vikram Ramanarayanan, Adam Lammert, Jangwon Kim, Sungbok Lee, Krishna Nayak, Yoon-Chul Kim, Yinghua Zhu, Louis Goldstein, Dani Byrd, Erik Bresch, Prasanta Ghosh, Athanasios Katsamanis and Michael Proctor (2014), Real-time magnetic resonance imaging and electromagnetic articulography database for speech production research, in: Journal of the Acoustical Society of America, 136:3 (1307–1311). [link]

  • Vikram Ramanarayanan, Louis Goldstein, and Shrikanth S. Narayanan (2013), Articulatory movement primitives – extraction, interpretation and validation, in: Journal of the Acoustical Society of America, 134:2 (1378–1394). [link]

  • Vikram Ramanarayanan, Louis Goldstein, Dani Byrd and Shrikanth S. Narayanan (2013), A real-time MRI investigation of articulatory setting across different speaking styles, in: Journal of the Acoustical Society of America, 134:1 (510–519). [link]

  • Vikram Ramanarayanan, Erik Bresch, Dani Byrd, Louis Goldstein and Shrikanth S. Narayanan (2009), Analysis of pausing behavior in spontaneous speech using real-time magnetic resonance imaging of articulation, in: Journal of the Acoustical Society of America Express Letters, 126:5 (EL160–EL165). [link]

Published Research Reports

  • Vikram Ramanarayanan, David Suendermann-Oeft, Patrick Lange, Alexei V. Ivanov, Keelan Evanini, Zhou Yu, Eugene Tsuprun, and Yao Qian (2016), Bootstrapping Development of a Cloud-Based Spoken Dialog System in the Educational Domain From Scratch Using Crowdsourced Data, in: ETS Research Report Series, Wiley. doi: 10.1002/ets2.12105. [link]

Conference Papers

  • Vikram Ramanarayanan, Patrick Lange, Keelan Evanini, Hillary Molloy, Eugene Tsuprun and David Suendermann-Oeft (2017). Crowdsourcing multimodal dialog interactions: Lessons learned from the HALEF case, in proceedings of: Association for the Advancement of Artificial Intelligence (AAAI) 2017 Workshop on Crowdsourcing, Deep Learning and Artificial Intelligence Agents, San Francisco, CA, Feb 2017 [pdf].

  • Vikram Ramanarayanan, Patrick Lange, David Pautler, Zhou Yu and David Suendermann-Oeft (2016). Interview with an Avatar: A real-time engagement tracking-enabled cloud-based multimodal dialog system for learning and assessment, in proceedings of: IEEE Spoken Language Technology Workshop (SLT 2016), San Diego, CA, Dec 2016 [pdf].

  • Hardik Kothare, Vikram Ramanarayanan, Benjamin Parrell, John F. Houde, Srikantan S. Nagarajan (2016). Sensorimotor adaptation to real-time formant shifts is influenced by the direction and magnitude of shift, in proceedings of: Society for Neuroscience Conference (SfN 2016), San Diego, CA, Nov 2016 [link].

  • Vikram Ramanarayanan, Benjamin Parrell, Louis Goldstein, Srikantan Nagarajan and John Houde (2016). A new model of speech motor control based on task dynamics and state feedback, in proceedings of: Interspeech 2016, San Francisco, CA, Sept 2016 [pdf].

  • Yao Qian, Jidong Tao, David Suendermann-Oeft, Keelan Evanini, Alexei V. Ivanov and Vikram Ramanarayanan (2016). Noise and metadata sensitive bottleneck features for improving speaker recognition with non-native speech input, in proceedings of: Interspeech 2016, San Francisco, CA, Sept 2016 [pdf].

  • Vikram Ramanarayanan and Saad Khan (2016). Novel features for capturing cooccurrence behavior in dyadic collaborative problem solving tasks, in proceedings of: Educational Data Mining (EDM 2016), Raleigh, North Carolina, June 2016 [pdf].

  • Zhou Yu, Vikram Ramanarayanan, Patrick Lange, Robert Mundkowsky and David Suendermann-Oeft (2016). Multimodal HALEF: An open-source modular web-based multimodal dialog framework, in proceedings of: International Workshop on Spoken Dialog Systems (IWSDS 2016), Saariselkä, Finland, Jan 2016 [pdf].

  • Alexei V. Ivanov, Patrick L. Lange, Vikram Ramanarayanan and David Suendermann-Oeft (2016). Designing an optimal ASR system for spontaneous non-native speech in a spoken dialog application, in proceedings of: International Workshop on Spoken Dialog Systems (IWSDS 2016), Saariselkä, Finland, Jan 2016 [pdf].

  • Vikram Ramanarayanan, Zhou Yu, Robert Mundkowsky, Patrick Lange, Alexei V. Ivanov, Alan W. Black, and David Suendermann-Oeft (2015). A modular open-source standard-compliant dialog system framework with video support, in proceedings of: IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2015), Scottsdale, AZ, Dec 2015 [pdf].

  • Zhou Yu, Vikram Ramanarayanan, David Suendermann-Oeft, Xinhao Wang, Klaus Zechner, Lei Chen, Jidong Tao and Yao Qian (2015). Using Bidirectional LSTM Recurrent Neural Networks to Learn High-Level Abstractions of Sequential Features for Automated Scoring of Non-Native Spontaneous Speech, in proceedings of: IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2015), Scottsdale, AZ, Dec 2015 [pdf].

  • Vikram Ramanarayanan, Chee Wee Leong, Lei Chen, Gary Feng and David Suendermann-Oeft (2015). Evaluating speech, face, emotion and body movement time-series features for automated multimodal presentation scoring, in proceedings of: International Conference on Multimodal Interaction (ICMI 2015), Seattle, WA, Nov 2015 [pdf].

  • Vikram Ramanarayanan, David Suendermann-Oeft, Alexei V. Ivanov, and Keelan Evanini (2015). A distributed cloud-based dialog system for conversational application development, in proceedings of: 16th Annual SIGdial Meeting on Discourse and Dialogue (SIGDIAL 2015), Prague, Czech Republic [pdf].

  • Alexei V. Ivanov, Vikram Ramanarayanan, David Suendermann-Oeft, Melissa Lopez, Keelan Evanini, and Jidong Tao (2015). Automated speech recognition technology for dialogue interaction with non-native interlocutors, in proceedings of: 16th Annual SIGdial Meeting on Discourse and Dialogue (SIGDIAL 2015), Prague, Czech Republic [pdf].

  • Vikram Ramanarayanan, Lei Chen, Chee Wee Leong, Gary Feng and David Suendermann-Oeft (2015). An analysis of time-aggregated and time-series features for scoring different aspects of multimodal presentation data, in proceedings of: Interspeech 2015, Dresden, Germany, Sept 2015 [pdf].

  • Zisis Skordilis, Vikram Ramanarayanan, Louis Goldstein and Shrikanth Narayanan (2015). Experimental assessment of the tongue incompressibility hypothesis during speech production, in proceedings of: Interspeech 2015, Dresden, Germany, Sept 2015 [pdf].

  • David Suendermann-Oeft, Vikram Ramanarayanan, Moritz Teckenbrock, Felix Neutatz and Dennis Schmidt (2015). HALEF: an open-source standard-compliant telephony-based modular spoken dialog system – A review and an outlook, in proceedings of: International Workshop on Spoken Dialog Systems, Busan, South Korea, Jan 2015 [pdf].

  • Vikram Ramanarayanan, Louis Goldstein and Shrikanth Narayanan (2014). Speech motor control primitives arising from a dynamical systems model of vocal tract articulation, in proceedings of: Interspeech 2014, Singapore, Sept 2014 [pdf].

  • Colin Vaz, Vikram Ramanarayanan and Shrikanth Narayanan (2014). Joint filtering and factorization for recovering latent structure from noisy speech data, in proceedings of: Interspeech 2014, Singapore, Sept 2014 [pdf].

  • Andres Benitez, Vikram Ramanarayanan, Louis Goldstein and Shrikanth Narayanan (2014). A real-time MRI study of articulatory setting in second language speech, in proceedings of: Interspeech 2014, Singapore, Sept 2014 [pdf].

  • Vikram Ramanarayanan, Louis Goldstein and Shrikanth Narayanan (2014). Speech motor control primitives arising from a dynamical systems model of vocal tract articulation, in proceedings of: International Seminar on Speech Production 2014, Cologne, Germany, May 2014 [pdf]. (Northern Digital Inc. Excellence Award for Best Paper)

  • Vikram Ramanarayanan, Adam Lammert, Louis Goldstein and Shrikanth Narayanan (2013). Articulatory settings facilitate mechanically advantageous motor control of vocal tract articulators, in proceedings of: Interspeech 2013, Lyon, France, Aug 2013 [pdf].

  • Vikram Ramanarayanan, Maarten Van Segbroeck and Shrikanth Narayanan (2013). On the nature of data-driven primitive representations of speech articulation, in proceedings of: Interspeech 2013 Workshop on Speech Production in Automatic Speech Recognition (SPASR), Lyon, France, Aug 2013 [pdf].

  • Colin Vaz, Vikram Ramanarayanan and Shrikanth Narayanan (2013). A two-step technique for MRI audio enhancement using dictionary learning and wavelet packet analysis, in proceedings of: Interspeech 2013, Lyon, France, Aug 2013 [pdf]. (Best Student Paper Award)

  • Zhaojun Yang, Vikram Ramanarayanan, Dani Byrd and Shrikanth Narayanan (2013). The effect of word frequency and lexical class on articulatory-acoustic coupling, in proceedings of: Interspeech 2013, Lyon, France, Aug 2013 [pdf].

  • Adam Lammert, Vikram Ramanarayanan, Michael Proctor and Shrikanth Narayanan (2013). Vocal tract cross-distance estimation from real-time MRI using region-of-interest analysis, in proceedings of: Interspeech 2013, Lyon, France, Aug 2013 [pdf].

  • Daniel Bone, Chi-Chun Lee, Vikram Ramanarayanan, Shrikanth Narayanan, Renske S. Hoedemaker and Peter C. Gordon (2013). Analyzing eye-voice coordination in Rapid Automatized Naming, in proceedings of: Interspeech 2013, Lyon, France, Aug 2013 [pdf].

  • Ming Li, Jangwon Kim, Prasanta Ghosh, Vikram Ramanarayanan and Shrikanth Narayanan (2013). Speaker verification based on fusion of acoustic and articulatory information, in proceedings of: Interspeech 2013, Lyon, France, Aug 2013 [pdf].

  • Vikram Ramanarayanan, Prasanta Ghosh, Adam Lammert and Shrikanth S. Narayanan (2012), Exploiting speech production information for automatic speech and speaker modeling and recognition – possibilities and new opportunities, in proceedings of: APSIPA 2012, Los Angeles, CA, Dec 2012 [pdf].

  • Vikram Ramanarayanan, Naveen Kumar and Shrikanth S. Narayanan (2012), A framework for unusual event detection in videos of informal classroom settings, in: NIPS 2012 workshop on Personalizing Education with Machine Learning, Lake Tahoe, NV, Dec 2012 [pdf].

  • Vikram Ramanarayanan, Athanasios Katsamanis and Shrikanth Narayanan (2011). Automatic data-driven learning of articulatory primitives from real-time MRI data using convolutive NMF with sparseness constraints, in proceedings of: Interspeech 2011, Florence, Italy, Aug 2011 [pdf].

  • Athanasios Katsamanis, Erik Bresch, Vikram Ramanarayanan and Shrikanth Narayanan (2011). Validating rt-MRI based articulatory representations via articulatory recognition, in proceedings of: Interspeech 2011, Florence, Italy, Aug 2011 [pdf].

  • Shrikanth Narayanan, Erik Bresch, Prasanta Ghosh, Louis Goldstein, Athanasios Katsamanis, Yoon Kim, Adam Lammert, Michael Proctor, Vikram Ramanarayanan, and Yinghua Zhu (2011). A Multimodal Real-Time MRI Articulatory Corpus for Speech Research, in proceedings of: Interspeech 2011, Florence, Italy, Aug 2011 (authors after first in alphabetical order) [pdf].

  • Vikram Ramanarayanan, Dani Byrd, Louis Goldstein and Shrikanth Narayanan (2011). An MRI study of articulatory settings of L1 and L2 speakers of American English, in: International Seminar on Speech Production 2011, Montreal, Canada, June 2011 [pdf].

  • Vikram Ramanarayanan, Adam Lammert, Dani Byrd, Louis Goldstein and Shrikanth Narayanan (2011). Planning and Execution in Soprano Singing and Speaking Behavior: an Acoustic/Articulatory Study Using Real-Time MRI, in: International Seminar on Speech Production 2011, Montreal, Canada, June 2011 [pdf].

  • Vikram Ramanarayanan, Dani Byrd, Louis Goldstein and Shrikanth Narayanan (2010). Investigating articulatory setting - pauses, ready position and rest - using real-time MRI, in proceedings of: Interspeech 2010, Makuhari, Japan, Sept 2010 [pdf].

  • Vikram Ramanarayanan, Dani Byrd, Louis Goldstein and Shrikanth Narayanan (2010). A joint acoustic-articulatory study of nasal spectral reduction in read versus spontaneous speaking styles, in: Speech Prosody 2010, Chicago, Illinois, May 2010 [pdf].

  • Vikram Ramanarayanan (2010). Prosodic variation within speech planning and execution - insights from real-time MRI, in: São Paulo School of Speech Dynamics, São Paulo, Brazil, June 2010 (unpublished poster summarizing research work done during 2009-10) [pdf].

  • Vikram Ramanarayanan, Erik Bresch, Dani Byrd, Louis Goldstein and Shrikanth S. Narayanan (2009), Real-time MRI tracking of articulation during grammatical and ungrammatical pauses in speech, in: 157th Meeting of the Acoustical Society of America, Portland, Oregon, May 2009 [pdf].

  • Ed Holsinger, Vikram Ramanarayanan, Dani Byrd, Louis Goldstein, Maria Gorno-Tempini and Shrikanth Narayanan (2009). Beyond acoustic data: Characterizing disordered speech using direct articulatory evidence from real time imaging, in: 157th Meeting of the Acoustical Society of America, Portland, Oregon, May 2009 [pdf].

Technical Reports

  • Vikram Ramanarayanan, Panayiotis Georgiou and Shrikanth S. Narayanan (2012). Investigating duration modeling within a statistical data-driven front-end for speech synthesis.

  • Vikram Ramanarayanan and Shrikanth Narayanan (2010). An approach toward understanding the variant and invariant aspects of speech production using low-rank–sparse matrix decompositions [pdf].