Jangwon Kim
Download Resume (Updated on Sep. 13 2015)
Download CV (Updated on Mar 17 2015)
Research Interests
- Generic categories:
My research is at the nexus of the science and engineering of human communication and information processing. Specific topics include:
- Modeling and measurement of speech production with applications to synthesis, recognition and clinical assessment of speech
- Computational paralinguistics: Modeling, detection and tracking of emotion, gender, health, etc.
- Machine learning for speech processing: Winners of Interspeech sub-challenges
- Multimodal signal processing, image processing and imaging applications
- Robust speech processing, Automatic Speech Recognition (ASR), Biometrics
- For lay audiences: I try to understand how humans make sounds when they speak, and how the way of speaking changes when they feel happy, sad, or angry. I study movies showing the inside of people's head and neck, and sounds made when they have different feelings. The movies allow me to see how humans control their mouth, which tells me about how humans control the expression of feeling. I also study the relationship between sounds and these controls. Eventually, I want to make computers have human-like conversations with people. The computers should understand the feelings of the human, and should respond with feeling.
(This was created by only the 1000 most frequently used words in the English language.)
Academic Experience
Research assistant (Fall 2010 ~ Fall 2015)
- Emotional speech production at Signal Analysis and Interpretation Laboratory (SAIL)
- Supervisor: Prof. Shrikanth S. Narayanan
Teaching assistant
- EE519 (Fall 2013): Speech recognition and processing for multimedia
- Graduate-level speech signal processing course
- Instructor: Prof. Shrikanth S. Narayanan
- TA evaluation summary (max. score: 5)
- EE619 (Spring 2013): Advanced topics in automatic speech recognition
- Graduate-level automatic speech recognition course
- Instructor: Prof. Shrikanth S. Narayanan
- Developed a tutorial for phone recognition using HTK. The tutorial includes line-by-line commands, Python scripts for specific procedures, a Python batch script, and short explanations.
- TA evaluation summary (max. score: 5)
- EE519 (Fall 2012): Speech recognition and processing for multimedia
- Graduate-level speech signal processing course
- Instructor: Prof. Shrikanth S. Narayanan
- TA evaluation summary (max. score: 5)
Book chapter
- Chi-Chun Lee, Jangwon Kim, Angeliki Metallinou, Carlos Busso, Sungbok Lee and Shrikanth S. Narayanan, "Speech in Affective Computing," in: R.A. Calvo, S.K. D'Mello, J. Gratch and A. Kappas (Eds). Oxford Handbook of Affective Computing, Oxford University Press, pages 170 - 183, 2015
Journal articles
- Jangwon Kim, Asterios Toutios, Sungbok Lee, and Shrikanth Narayanan, "Vocal tract shaping of emotional speech," in The Journal of the Acoustical Society of America, 2015 (Under Review)
- Jangwon Kim, Donna Erickson, Sungbok Lee, "More about contrastive emphasis and the C/D model," The Journal of the Phonetic Society of Japan, 2015 (In Press)
- Ming Li, Jangwon Kim, Adam Lammert, Prasanta K Ghosh, Vikram Ramanarayanan and Shrikanth Narayanan, "Speaker verification based on the fusion of speech acoustics and inverted articulatory signals," in Computer Speech and Language, 2015 (LINK) (In Press)
- Jangwon Kim, Naveen Kumar, Andreas Tsiartas and Shrikanth Narayanan, "Automatic intelligibility classification of sentence-level pathological speech," in Computer Speech and Language, vol. 29, No. 1, pages 132 - 144, 2015 (LINK)
- Jangwon Kim, Asterios Toutios, Sungbok Lee, Shrikanth S. Narayanan, "A kinematic study of critical and non-critical articulators in emotional speech production," in The Journal of the Acoustical Society of America, vol. 137, No. 3, pages 1411 - 1429, 2015. (LINK) (PrePrint)
- Jangwon Kim, Adam C. Lammert, Prasanta Kumar Ghosh and Shrikanth S. Narayanan, "Co-registration of speech production datasets from electromagnetic articulography and real-time magnetic resonance imaging," The Journal of the Acoustical Society of America (Express Letter), vol. 135, No. 2, pages EL115-EL121, 2014 (PDF), (Open source MATLAB Toolbox), (Demo webpage)
- Shrikanth Narayanan, Asterios Toutios, Vikram Ramanarayanan, Adam Lammert, Jangwon Kim, Sungbok Lee, Krishna Nayak, Yoon-Chul Kim, Yinghua Zhu, Louis Goldstein, Dani Byrd, Erik Bresch, Prasanta Ghosh, Athanasios Katsamanis, and Michael Proctor, "Real-time magnetic resonance imaging and electromagnetic articulography database for speech production research," The Journal of the Acoustical Society of America (TC), 136(3), pages 1307 - 1311, 2014, (LINK).
Peer-reviewed conference or workshop papers
- Jangwon Kim, Anil Ramakrishna, Sungbok Lee and Shrikanth Narayanan, "Relations between prominence and articulatory-prosodic cues in emotional speech", in Proceedings of Speech Prosody, 2016 (Submitted)
- Rahul Gupta, Theodora Chaspari, Jangwon Kim, Naveen Kumar, Daniel Bone, Shrikanth Narayanan, "Pathological speech processing: State-of-the-art, current challenges, and future directions," in Proceedings of ICASSP, 2016 (Accepted)
- Jangwon Kim, Md Nasir, Rahul Gupta, Maarten Van Segbroeck, Daniel Bone, Matthew Black, Zisis Iason Skordilis, Zhaojun Yang, Panayiotis Georgiou and Shrikanth Narayanan, "Automatic estimation of Parkinson's disease severity from diverse speech tasks", in Proceedings of INTERSPEECH, 2015, ISCA, Dresden, Germany (The 2nd place of the Parkinson's condition sub-challenge in INTERSPEECH 2015 computational paralinguistics challenge) (PDF), (ORAL)
- Matthew Black, Daniel Bone, Zisis Iason Skordilis, Rahul Gupta, Wei Xia, Pavlos Papadopoulos, Sandeep Nallan Chakravarthula, Bo Xiao, Maarten Van Segbroeck, Jangwon Kim, Panayiotis Georgiou and Shrikanth Narayanan, "Automated evaluation of non-native English pronunciation quality: combining knowledge- and data-driven features at multiple time scales", in Proceedings of INTERSPEECH, 2015, ISCA, Dresden, Germany (The winner of the degree of nativeness sub-challenge in INTERSPEECH 2015 computational paralinguistics challenge) (PDF)
- Donna Erickson, Jangwon Kim, Shigeto Kawahara, Ian Wilson, Caroline Menezes, Atsuo Suemitsu, Jeff Moore, "Bridging articulation and perception: The C/D model and contrastive emphasis", in 18th International Congress of Phonetic Sciences, 2015, Glasgow, UK, (PDF) (Accepted)
- Naveen Kumar, Rahul Gupta, Tanaya Guha, Maarten Van Segbroeck, Jangwon Kim, Shrikanth S. Narayanan, "Affective Feature Design and Predicting Continuous Affective Dimensions from Music", Emotion in Music Challenge, MediaEval Workshop, 2014, Barcelona, Spain (PDF)
- Jangwon Kim, Sungbok Lee and Shrikanth Narayanan, "Estimation of the movement trajectories of non-crucial articulators based on the detection of crucial moments and physiological constraints" in Proceedings of INTERSPEECH, 2014, ISCA, Singapore, pages 163 - 168 (PDF), (POSTER)
- Jangwon Kim, Donna Erickson, Sungbok Lee and Shrikanth Narayanan, "A study of invariant properties and variation patterns in the Converter/Distributor model for emotional speech" in Proceedings of INTERSPEECH, 2014, ISCA, Singapore, pages 413 - 417, ISCA grant (PDF), (ORAL)
- Maarten Van Segbroeck, Ruchir Travadi, Colin Vaz, Jangwon Kim, Matthew P. Black, Alexandros Potamianos, Shrikanth S. Narayanan, "Classification of cognitive load from speech using an i-vector framework" in Proceedings of INTERSPEECH, 2014, Singapore, pages 751 - 755. (The winner of the cognitive load sub-challenge in INTERSPEECH 2014 speaker trait challenge) (PDF)
- Donna Erickson, Shigeto Kawahara, Jeff Moore, Caroline Menezes, Atsuo Suemitsu, Jangwon Kim and Yoshiho Shibuya, "Calculating articulatory syllable duration and prosodic boundaries" in 10th International Seminar on Speech Production (ISSP), Cologne, Germany, 2014, pages 102-105 (PDF)
- Jangwon Kim, Asterios Toutios, Yoon-Chul Kim, Yinghua Zhu, Sungbok Lee and Shrikanth Narayanan, "USC-EMO-MRI corpus: An emotional speech production database recorded by real-time magnetic resonance imaging," in 10th International Seminar on Speech Production (ISSP), Cologne, Germany, 2014, pages 226-229 (PDF), (POSTER)
- Jangwon Kim, Naveen Kumar, Sungbok Lee and Shrikanth Narayanan, "Enhanced airway-tissue boundary segmentation for real-time magnetic resonance imaging data," in 10th International Seminar on Speech Production (ISSP), Cologne, Germany, 2014, pages 222-225, (the Northern Digital Inc. Excellence Awards) (ORAL) (PDF), (Open source MATLAB Toolbox), (Demo webpage)
- Yoon-Chul Kim, Jangwon Kim, Michael I. Proctor, Asterios Toutios, Krishna S. Nayak, Sungbok Lee and Shrikanth S. Narayanan, "Toward automatic vocal tract area function estimation from accelerated three-dimensional magnetic resonance imaging" in Proceedings of ISCA Workshop on Speech Production in Automatic Speech Recognition (SPASR), Lyon, France, August, 2013 (PDF), (POSTER)
- Ming Li, Adam Lammert, Jangwon Kim, Prasanta Kumar Ghosh and Shrikanth S. Narayanan, "Automatic classification of palatal and pharyngeal wall shape categories from speech acoustics and inverted articulatory signals" in Proceedings of ISCA Workshop on Speech Production in Automatic Speech Recognition (SPASR), Lyon, France, August, 2013 (PDF)
- Ming Li, Jangwon Kim, Prasanta Kumar Ghosh, Vikram Ramanarayanan and Shrikanth S. Narayanan, "Speaker verification based on fusion of acoustic and articulatory information," in Proceedings of INTERSPEECH, Lyon, France, August, 2013, pages 1614 - 1618 (PDF)
- Jangwon Kim, Adam Lammert, Prasanta Ghosh and Shrikanth S. Narayanan, "Spatial and temporal alignment of multimodal human speech production data: Real time imaging, flesh point tracking and audio," in Proceedings of ICASSP, Vancouver, Canada, May, 2013, pages 3637 - 3641 (PDF), (Open source MATLAB Toolbox), (Demo webpage)
- Jangwon Kim, Prasanta Ghosh, Sungbok Lee and Shrikanth S. Narayanan, "A study of emotional information present in articulatory movements estimated using acoustic-to-articulatory inversion," in Proceedings of Asia Pacific Signal and Information Processing Association (APSIPA), IEEE, Los Angeles, USA, December, 2012, pages 1 - 4 (PDF)
- Jangwon Kim, Naveen Kumar, Andreas Tsiartas and Shrikanth Narayanan, "Intelligibility classification of pathological speech using fusion of multiple high level descriptors," in Proceedings of INTERSPEECH, 2012, Portland, USA, pages 534 - 537. (The winner of the pathological speech sub-challenge in INTERSPEECH 2012 speaker trait challenge) (PDF)
- Jangwon Kim, Sungbok Lee and Shrikanth S. Narayanan, "An exploratory study of the relations between perceived emotion strength and articulatory kinematics," in Proceedings of INTERSPEECH, 2011, ISCA, Florence, Italy, pages 2961 - 2964 (PDF)
- Jangwon Kim, Sungbok Lee and Shrikanth S. Narayanan, "A study of interplay between articulatory movement and prosodic characteristics in emotional speech production," in Proceedings of InterSpeech, 2010, ISCA, Makuhari, Japan, pages 1173 - 1176. (PDF)
- Jangwon Kim, Sungbok Lee and Shrikanth S. Narayanan, "An exploratory study of manifolds of emotional speech," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2010, IEEE, Dallas, USA, pages 5142 - 5145. (PDF)
- Jangwon Kim, Sungbok Lee and Shrikanth S. Narayanan, "A detailed study of word-position effects on emotion expression in speech," in Proceedings of InterSpeech, 2009, ISCA, Brighton, United Kingdom, pages 1987 - 1990. (PDF)
- Sungbok Lee, Jangwon Kim and Shrikanth S. Narayanan, "On the Interactions among Speech Parameters across Emotions and Speakers in Emotional Speech Production" in 10th International Seminar on Speech Production (ISSP), Cologne, Germany, 2014
- Jangwon Kim, Adam Lammert, Michael Proctor and Shrikanth S. Narayanan, "Co-registration of articulographic and real-time magnetic resonance imaging data for multimodal analysis of rapid speech," in 164th Meeting of the Acoustical Society of America, Kansas City, Missouri, October, 2012 (POSTER)
- Jangwon Kim, Sungbok Lee and Shrikanth S. Narayanan, "Detailed study of articulatory kinematics of critical articulators and dependent articulators of emotional speech," in 162nd Meeting of the Acoustical Society of America, San Diego, California, November, 2011 (POSTER)
- Sungbok Lee, Jangwon Kim and Dani Byrd, "Emotion effects on speech articulation: local or global?," in 162nd Meeting of the Acoustical Society of America, San Diego, California, November, 2011
- Speech-to-Speech (S2S) Systems (NSF)
- Building robust, widely-deployable and cost-effective solutions (S2S systems) enhancing cross-lingual spoken language interaction
- Developing parametric speech synthesis systems for English and Mexican Spanish
- Emotional speech production (NSF)
- Analysis, modeling, and synthesis of the emotional speech production system
- Data acquisition and processing for emotional speech using real-time Magnetic Resonance Imaging (MRI) technology and ElectroMagnetic Articulography (EMA)
- Analysis of prosody, articulatory movement, and their interrelationship for emotion expression
- Computational modeling and simulation of the emotional speech production system
- SAIL Pipeline
- On-line system for recognizing emotion from the speech signal
- Porting i-vector extraction from KALDI to Barista.
- Model training and system evaluation using KALDI (off-line) and Barista implementation (on-line).
- On-line speaker diarization system from the speech signal
- Development of an on-line diarization module based on KALDI
- Model training and system evaluation
Selected Graduate Coursework
Estimation Theory (EE563)
Optimization: Theory and Algorithms (ISE520)
Probabilistic Reasoning (CSCI573)
Affective Computing (CSCI534)
Applied Mathematics for Engineers (MATH570A)
Statistics for Engineers (EE517)
Mathematical Pattern Recognition (EE559)
Random Processes in Engineering (EE562a)
Speech Recognition and Processing for Multimedia (EE519)
Advanced Topics in Automatic Speech Recognition (EE619)
Probability Theory for Engineers (EE464)
General Phonetics (Ling580)
Magnetic Resonance Imaging and Reconstruction (EE591)
Programming: Python, C/C++, Linux bash script, LaTeX
Tools: MATLAB, SPSS, HTK (HMM Toolkit), KALDI (ASR toolkit), HTS (HMM-based TTS Toolkit), Festival (Speech Synthesis System), Edinburgh Speech Tools, Praat script
Languages: Korean (Native), English (Fluent)
- Volunteer at the APSIPA Annual Summit and Conference (December 3-6, 2012)
- Coordinator of the US-Korea industry forum at the US-Korea Conference (UKC) 2012 (October 2012)
- President of the Yonsei University alumni at USC (Fall 2011 - Spring 2012)
- Member of IEEE Signal Processing Society
- Member of Acoustical Society of America
- Qualcomm IT Tour (August 2008)
- Volunteer at the 17th International Federation of Automatic Control (IFAC) World Congress (2008)
- Q-riosity (National Mobile Prosumer Society of Qualcomm Korea): Designed and led an outreach program that provided computer education to seniors in Baekhak-maeul, Yeoncheon-gun, Gyeonggi-do, South Korea (Winter, 2008)
- Research and Markets: Affective computing market ... (LINK)
- USC Viterbi team sails to fourth straight win (LINK)
- USC SAIL win Interspeech 2012 speaker trait challenge award (LINK)
Updated on Sep. 13th 2015