All publications
2010
| , and , Towards modeling user behavior in interactions mediated through an automated bidirectional speech translation system (2010), in: Computer Speech and Language, 24:2(232-256) |
|
2009
| , and , A detailed study of word-position effects on emotion expression in speech, in: Proceedings of InterSpeech, 2009 |
|
| and , A divide-and-conquer approach to latent perceptual indexing of audio for large web 2.0 applications, in: Proceedings of the International Conference on Multimedia & Expo (ICME), pages 466-469, 2009 |
|
| , and , A low-complexity dynamic face-voice feature fusion approach to multimodal person recognition, in: Proceedings of the IEEE International Symposium on Multimedia (ISM), 2009 |
|
| , , and , A review of ASR technologies for children’s speech, in: Proceedings of the Workshop on Child, Computer and Interaction, 2009 |
|
| , and , A robust harmony structure modeling scheme for classical music opus identification, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 1961-1964, 2009 |
|
| , and , A semi-supervised learning approach to online audio background detection, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 1629-1632, 2009 |
|
| , and , Accelerated 3D MRI of vocal tract shaping using compressed sensing and parallel imaging, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 389-392, 2009 |
|
| , and , Accelerated 3D upper airway MRI using compressed sensing (2009), in: Magnetic Resonance in Medicine, 61:6(1434-1440) |
|
| , and , Acoustic topic model for audio information retrieval, in: Proceedings of the Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2009 |
|
| , and , An analysis of articulatory-acoustic data based on articulatory strokes, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 4493-4496, 2009 |
|
| , , , , and , An articulatory analysis of phonological transfer using real-time MRI, in: Proceedings of InterSpeech, 2009 |
|
| , , , and , An articulatory study of lexicalized and epenthetic schwa using real time magnetic resonance imaging, in: Proceedings of the Meeting of the Acoustical Society of America, 2009 |
|
| , , and , An iterative relative entropy minimization based data selection approach for n-gram model adaptation (2009), in: IEEE Transactions on Audio, Speech, and Language Processing, 17:1(13-23) |
|
| , and , Analysis of emotionally salient aspects of fundamental frequency for emotion detection (2009), in: IEEE Transactions on Audio, Speech, and Language Processing, 17:4(582-596) |
|
| , , , and , Analysis of pausing behavior in spontaneous speech using real-time magnetic resonance imaging of articulation (2009), in: Journal of the Acoustical Society of America Express Letters, 126:5(EL160-EL165) |
|
| , , , , , , , , , and , Assessment of emerging reading skills in young native speakers and language learners (2009), in: Speech Communication, 51:10(968-984) |
|
| and , Automatic detection of disfluency boundaries in spontaneous speech of children using audio-visual information (2009), in: IEEE Transactions on Audio, Speech, and Language Processing, 17:1(2-12) |
|
| , , , and , Automatic pronunciation verification of English letter-names for early literacy assessment of preliterate children, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2009 |
|
| , , and , Automatically rating pronunciation through articulatory phonology, in: Proceedings of InterSpeech, 2009 |
|
| and , Closure duration analysis of incomplete stop consonants due to stop-stop interaction (2009), in: Journal of the Acoustical Society of America, 126:1(EL1-EL7) |
|
| , and , Combining lexical, syntactic and prosodic cues for improved online dialog act tagging (2009), in: Computer Speech and Language, 23:4(407-422) |
|
| , , and , Comparison of child-human and child-computer interactions based on manual annotations, in: Proceedings of the Workshop on Child, Computer and Interaction, 2009 |
|
| , , , and , Connecting rhythm and prominence in automatic ESL pronunciation scoring, in: Proceedings of InterSpeech, 2009 |
|
| , , and , Context-driven automatic bilingual movie subtitle alignment, in: Proceedings of InterSpeech, 2009 |
|
| and , Continuous speech recognition using attention shift decoding with soft decision, in: Proceedings of InterSpeech, 2009 |
|
| , , , , , , and , Differentiating physical activity modalities in youth using heartbeat waveform shape and differences between adjacent waveforms, in: Proceedings of the International Conference on Diet and Activity Methods (ICDAM), 2009 |
|
| and , Discriminative wavelet packet filter bank selection for pattern recognition (2009), in: IEEE Transactions on Signal Processing, 57:5(1796-1810) |
|
| , and , Effect of bandwidth extension to telephone speech recognition in cochlear implant users (2009), in: Journal of the Acoustical Society of America, 125:2(EL77-EL83) |
|
| , , , and , Emotion recognition using a hierarchical binary decision tree approach, in: Proceedings of InterSpeech, 2009 |
|
| , , , , , and , Energy-efficient multihypothesis activity-detection for health-monitoring applications, in: Proceedings of the Annual International IEEE Engineering in Medicine and Biology Society (EMBS) Conference, 2009 |
|
| , and , Environmental sound recognition with time–frequency audio features (2009), in: IEEE Transactions on Audio, Speech, and Language Processing, 17:6(1142-1158) |
|
| , , , and , Estimation of articulatory gesture patterns from speech acoustics, in: Proceedings of InterSpeech, 2009 |
|
| , and , Evaluating evaluators: A case study in understanding the benefits and pitfalls of multi-evaluator modeling, in: Proceedings of InterSpeech, 2009 |
|
| and , Histogram-based estimation for the divergence revisited, in: Proceedings of the IEEE International Symposium on Information Theory (ISIT), pages 468-472, 2009 |
|
| , and , Human perception of audio-visual synthetic character emotion expression in the presence of ambiguous and conflicting information (2009), in: IEEE Transactions on Multimedia, 11:5(843-855) |
|
| and , Improved speaker diarization of meeting speech with recurrent selection of representative speech segments and participant interaction pattern modeling, in: Proceedings of InterSpeech, 2009 |
|
| , , , , , and , Interpreting ambiguous emotional expressions, in: Proceedings of the International Conference on Affective Computing and Intelligent Interaction (ACII), 2009 |
|
| , and , Lattice-based lexical cues for word fragment detection in conversational speech, in: Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2009 |
|
| , , and , Modeling mutual influence of interlocutor emotion states in dyadic spoken interactions, in: Proceedings of InterSpeech, 2009 |
|
| , , and , Multimodal speaker segmentation and identification in presence of overlapped speech segments (2009), in: Journal of Multimedia, Special Issue on Data Semantics and Multimedia Information Management |
|
| , , , , , , and , Optimal allocation of time-resources for multihypothesis activity-level detection, in: Proceedings of the International Conference on Distributed Computing in Sensor Systems (DCOSS), pages 273-286, 2009 |
|
| , , , , , , and , Optimal time-resource allocation for activity-detection via multimodal sensing, in: Proceedings of the International Conference on Body Area Networks (BodyNets), 2009 |
|
| and , Pitch contour stylization using an optimal piecewise polynomial approximation (2009), in: IEEE Signal Processing Letters, 16:9(810-813) |
|
| , , and , Predicting children’s reading ability using evaluator-informed features, in: Proceedings of InterSpeech, 2009 |
|
| and , Prominence detection using auditory attention cues and task-dependent high level information (2009), in: IEEE Transactions on Audio, Speech, and Language Processing, 17:5(1009-1024) |
|
| , , , and , Real-time MRI tracking of articulation during grammatical and ungrammatical pauses in speech, in: Proceedings of the Meeting of the Acoustical Society of America, 2009 |
|
| and , Recognizing child’s emotional state in problem-solving child-machine interactions, in: Proceedings of the Workshop on Child, Computer and Interaction, 2009 |
|
| and , Region segmentation in the frequency domain applied to upper airway real-time magnetic resonance images (2009), in: IEEE Transactions on Medical Imaging, 28:3(323-338) |
|
| , , and , Robust word boundary detection in spontaneous speech using acoustic and lexical cues, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 4785-4788, 2009 |
|
| , and , Saliency-driven unstructured acoustic scene classification using latent perceptual indexing, in: Proceedings of the IEEE International Workshop on Multimedia Signal Processing (MMSP), 2009 |
|
| , , , , , , and , Sensing for obesity: KNOWME implementation and lessons for an architect, in: Proceedings of the Workshop on Biomedicine in Computing: Systems, Architectures, and Circuits (BiC), 2009 |
|
| and , Signature cluster model selection for incremental Gaussian mixture cluster modeling in agglomerative hierarchical speaker clustering, in: Proceedings of InterSpeech, 2009 |
|
| and , Speech synthesis systems in ambient intelligence environments, in: Human-Centric Interfaces for Ambient Intelligence, Elsevier, 2009 |
|
| , , and , Timing effects of syllable structure and stress on nasals: A real-time MRI examination (2009), in: Journal of Phonetics, 37:1(97-110) |
|
| and , Unsupervised adaptation of categorical prosody models for prosody labeling and speech recognition (2009), in: IEEE Transactions on Audio, Speech, and Language Processing, 17:1(138-149) |
|
2008
| , and , A generative model for scoring children's reading comprehension, in: Proceedings of the Workshop on Child, Computer and Interaction, 2008 |
|
| and , A novel algorithm for unsupervised prosodic language model adaptation, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 4181-4184, 2008 |
|
| and , A novel inter-cluster distance measure combining GLR and ICR for improved agglomerative hierarchical speaker clustering, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 4373-4376, 2008 |
|
| and , A top-down auditory attention model for learning task dependent influences on prominence detection in speech, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 3981-3984, 2008 |
|
| and , Agglomerative hierarchical speaker clustering using incremental Gaussian mixture cluster modeling, in: Proceedings of InterSpeech, pages 20-23, 2008 |
|
| , and , An analysis of multimodal cues of interruption in dyadic spoken interactions, in: Proceedings of InterSpeech, pages 1678-1681, 2008 |
|
| , , , , and , An analysis of vocal tract shaping in English sibilant fricatives using real-time magnetic resonance imaging, in: Proceedings of InterSpeech, pages 2823-2826, 2008 |
|
| , and , An empirical analysis of user uncertainty in problem-solving child-machine interactions, in: Proceedings of the Workshop on Child, Computer and Interaction, 2008 |
|
| , and , An interval type-2 fuzzy logic system to translate between emotion-related vocabularies, in: Proceedings of InterSpeech, pages 2747-2750, 2008 |
|
| and , Audio retrieval by latent perceptual indexing, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 49-52, 2008 |
|
| , and , Audio-visual emotion recognition using Gaussian mixture models for face and voice, in: Proceedings of the IEEE International Symposium on Multimedia (ISM), pages 250-257, 2008 |
|
| , and , Automatic classification of question turns in spontaneous speech using lexical and prosodic evidence, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 5005-5008, 2008 |
|
| and , Automatic prosodic event detection using acoustic, lexical, and syntactic evidence (2008), in: IEEE Transactions on Audio, Speech, and Language Processing, 16:1(216-228) |
|
| and , Better nonnative intonation scores through prosodic theory, in: Proceedings of InterSpeech, pages 1813-1816, 2008 |
|
| , , and , Challenging uncertainty in query by humming systems: A fingerprinting approach (2008), in: IEEE Transactions on Audio, Speech, and Language Processing, 16:2(359-371) |
|
| and , Classification of sound clips by two schemes: using onomatopoeia and semantic labels, in: Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), pages 1341-1344, 2008 |
|
| and , Combining task-dependent information with auditory attention cues for prominence detection in speech, in: Proceedings of InterSpeech, pages 1064-1067, 2008 |
|
| and , Data-driven unsupervised adaptation of acoustic-prosodic models, in: Proceedings of the Speech Prosody Conference, pages 161-164, 2008 |
|
| , , and , Detecting prominence in conversational speech: Pitch accent, givenness and focus, in: Proceedings of the Speech Prosody Conference, pages 453-456, 2008 |
|
| and , Dynamic chroma feature vectors with applications to cover song identification, in: Proceedings of the International Workshop on Multimedia Signal Processing (MMSP), pages 984-987, 2008 |
|
| , , and , Effect of spectral normalization on different talker speech recognition by cochlear implant users (2008), in: Journal of the Acoustical Society of America, 123:5(2836-2847) |
|
| , and , Enriching spoken language translation with dialog acts, in: Proceedings of the Association for Computational Linguistics (ACL) Conference, pages 225-228, 2008 |
|
| , and , Environmental sound recognition using MP-based features, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 1-4, 2008 |
|
| , , and , Estimation of children's reading ability by fusion of automatic pronunciation verification and fluency detection, in: Proceedings of InterSpeech, pages 2779-2782, 2008 |
|
| , and , Exploiting acoustic and syntactic features for automatic prosody labeling in a maximum entropy framework (2008), in: IEEE Transactions on Audio, Speech, and Language Processing, 16:4(797-811) |
|
| , and , Factored translation models for enriching spoken language translation with prosody, in: Proceedings of InterSpeech, pages 2723-2726, 2008 |
|
| and , Fine-grained pitch accent and boundary tone labeling with parametric F0 features, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 4545-4548, 2008 |
|
| , , and , Fundamental frequency analysis for speech emotion processing, in: The Role of Prosody in Affective Speech, pages 309-337, Peter Lang Publishing Group, 2008 |
|
| , , and , Human perception of synthetic character emotions in the presence of conflicting and congruent vocal and facial expressions, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 2201-2204, 2008 |
|
| , , , , , , , and , IEMOCAP: Interactive emotional dyadic motion capture database (2008), in: Journal of Language Resources and Evaluation, 42:4(335-359) |
|
| and , Investigating automatic assessment of reading comprehension in young children, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 5057-5060, 2008 |
|
| , , and , Joint-processing of audio-visual signals in human perception of conflicting synthetic character emotions, in: Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), pages 961-964, 2008 |
|
| , and , Knowledge as a constraint on uncertainty for unsupervised classification: A study in part-of-speech tagging, in: Proceedings of the International Conference on Machine Learning (ICML), 2008 |
|
| , and , Linguistic analysis of spontaneous children speech, in: Proceedings of the Workshop on Child, Computer and Interaction, 2008 |
|
| , and , Mitigation of data sparsity in classifier-based translation, in: Proceedings of the International Conference on Computational Linguistics (COLING), pages 1-4, 2008 |
|
| , and , Modeling the intonation of discourse segments for improved online dialog act tagging, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 5033-5036, 2008 |
|
| , , , , , , , , and , Multimodal sensing for pediatric obesity applications, in: Proceedings of the International Workshop on Urban, Community, and Social Applications of Networked Sensing Systems (UrbanSense), pages 21-25, 2008 |
|
| , , and , Multimodal speaker segmentation in presence of overlapped speech segments, in: Proceedings of the IEEE International Symposium on Multimedia (ISM), pages 679-684, 2008 |
|
| , and , Music fingerprint extraction for classical music cover song identification, in: Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), pages 1261-1264, 2008 |
|
| , and , On energy-based acoustic source localization for sensor networks (2008), in: IEEE Transactions on Signal Processing, 56:1(365-377) |
|
| and , On the robustness of overall F0-only modifications to the perception of emotions in speech (2008), in: Journal of the Acoustical Society of America, 123:6(4547-4558) |
|
| , , , and , Pronunciation verification of English letter-sounds in preliterate children, in: Proceedings of InterSpeech, pages 2783-2786, 2008 |
|
| , and , Recognition for synthesis: Automatic parameter selection for resynthesis of emotional speech from neutral speech, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 4629-4632, 2008 |
|
| and , Recording audio-visual emotional databases from actors: A closer look, in: Proceedings of the International Conference on Language Resources and Evaluation (LREC), pages 17-22, 2008 |
|
| , and , Relation between geometry and kinematics of articulatory trajectory associated with emotional speech production, in: Proceedings of InterSpeech, pages 2290-2293, 2008 |
|
| and , Scripted dialogs versus improvisation: Lessons learned about emotional elicitation techniques from the IEMOCAP database, in: Proceedings of InterSpeech, pages 1670-1673, 2008 |
|
| , , , and , Seeing speech: Capturing vocal tract shaping using real-time magnetic resonance imaging (2008), in: IEEE Signal Processing Magazine, 25:3(123-132) |
|
| , and , Selection of emotionally salient audio-visual features for modeling human evaluations of synthetic character emotion displays, in: Proceedings of the IEEE International Symposium on Multimedia (ISM), pages 190-195, 2008 |
|
| , and , Strategies to improve the robustness of agglomerative hierarchical clustering under data source variation for speaker diarization (2008), in: IEEE Transactions on Audio, Speech, and Language Processing, 16:8(1590-1601) |
|
| and , The expression and perception of emotions: Comparing assessments of self versus others, in: Proceedings of InterSpeech, pages 257-260, 2008 |
|
| , and , The SAIL speaker diarization system for analysis of spontaneous meetings, in: Proceedings of the International Workshop on Multimedia Signal Processing (MMSP), pages 966-971, 2008 |
|
| , and , The Vera am Mittag German audio-visual emotional speech database, in: Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), pages 865-868, 2008 |
|
| , and , Towards unsupervised training of the classifier-based speech translator, in: Proceedings of InterSpeech, pages 2739-2742, 2008 |
|
| and , Tree grammars as models of prosodic structure, in: Proceedings of InterSpeech, pages 2286-2289, 2008 |
|
| and , Using articulatory representations to detect segmental errors in nonnative pronunciation (2008), in: IEEE Transactions on Audio, Speech, and Language Processing, 16:1(8-22) |
|
2007
| , , , , , , , and , A Bayesian network classifier for word-level reading assessment, in: Proceedings of InterSpeech, pages 2185-2188, 2007 |
|
| and , A robust stopping criterion for agglomerative hierarchical clustering in a speaker diarization system, in: Proceedings of InterSpeech, pages 1853-1856, 2007 |
|
| and , A saliency-based auditory attention model with applications to unsupervised prominent syllable detection in speech, in: Proceedings of InterSpeech, pages 1941-1944, 2007 |
|
| , and , A statistical approach for modeling prosody features using POS tags for emotional speech synthesis, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 1237-1240, 2007 |
|
| , , , , , , , , , , , , and , A system for technology based assessment of language and literacy in young children: The role of multiple information sources, in: Proceedings of the IEEE International Workshop on Multimedia Signal Processing (MMSP), pages 26-30, 2007 |
|
| , and , A text-free approach to assessing nonnative intonation, in: Proceedings of InterSpeech, pages 2169-2172, 2007 |
|
| and , An acoustic measure for word prominence in spontaneous speech (2007), in: IEEE Transactions on Audio, Speech, and Language Processing, 15:2(690-701) |
|
| and , Analysis of audio clustering using word descriptions, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 769-772, 2007 |
|
| , and , Analysis of emotional speech prosody in terms of part of speech tags, in: Proceedings of InterSpeech, pages 626-629, 2007 |
|
| , and , Analyzing the multimodal behaviors of users of a speech-to-speech translation device by using concept matching scores, in: Proceedings of the IEEE International Workshop on Multimedia Signal Processing (MMSP), pages 259-263, 2007 |
|
| and , Automatic acoustic synthesis of human-like laughter (2007), in: Journal of the Acoustical Society of America, 121:1(527-535) |
|
| , , , and , Automatic detection and classification of disfluent reading miscues in young children's speech for the purpose of assessment, in: Proceedings of InterSpeech, pages 206-209, 2007 |
|
| , and , Data driven approach for language model adaptation using stepwise relative entropy minimization, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 177-180, 2007 |
|
| and , Discriminating two types of noise sources using cortical representation and dimension reduction technique, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 213-216, 2007 |
|
| and , Early auditory processing inspired features for robust automatic speech recognition, in: Proceedings of the European Signal Processing Conference (EUSIPCO), 2007 |
|
| and , Experiments in automatic genre classification of full-length music tracks using audio activity rate, in: Proceedings of the IEEE International Workshop on Multimedia Signal Processing (MMSP), pages 98-102, 2007 |
|
| , and , Exploiting acoustic and syntactic features for prosody labeling in a maximum entropy framework, in: Proceedings of the Human Language Technologies (HLT) Conference, pages 797-811, 2007 |
|
| , and , Exploiting prosodic features for dialog act tagging in a discriminative modeling framework, in: Proceedings of InterSpeech, pages 150-153, 2007 |
|
| and , Improved speech recognition using acoustic and lexical correlates of pitch accent in a N-best rescoring framework, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 873-876, 2007 |
|
| , , and , Information theoretic analysis of direct articulatory measurements for phonetic discrimination, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 457-460, 2007 |
|
| and , Interrelation between speech and facial gestures in emotional utterances: A single subject study (2007), in: IEEE Transactions on Audio, Speech, and Language Processing, 15:8(2331-2347) |
|
| , , and , Investigating implicit cues for user state estimation in human-robot interaction using physiological measurements, in: Proceedings of the IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), pages 1125-1130, 2007 |
|
| and , Joint analysis of the emotional fingerprint in the face and speech: A single subject study, in: Proceedings of the IEEE International Workshop on Multimedia Signal Processing (MMSP), pages 43-47, 2007 |
|
| , , and , Learning expressive human-like head motion sequences from speech, in: Data-Driven 3D Facial Animations, pages 113-131, Springer-Verlag Press, 2007 |
|
| and , Minimum probability of error signal representation, in: Proceedings of the IEEE Machine Learning for Signal Processing (MLSP) Workshop, pages 348-353, 2007 |
|
| , , and , Multimodal meeting monitoring: Improvements on speaker tracking and segmentation through a modified mixture particle filter, in: Proceedings of the IEEE International Workshop on Multimedia Signal Processing (MMSP), pages 60-65, 2007 |
|
| and , Optimal wavelet packets decomposition based on a rate-distortion optimality criterion, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 817-820, 2007 |
|
| , and , Pitch period estimation using multipulse model and wavelet transform, in: Proceedings of InterSpeech, pages 2761-2764, 2007 |
|
| , , and , Primitives-based evaluation and estimations of emotions in speech (2007), in: Speech Communication, 49:10-11(787-800) |
|
| and , Prosody-enriched lattices for improved syllable recognition, in: Proceedings of InterSpeech, pages 1813-1816, 2007 |
|
| , , and , Real-time emotion detection system using speech: Multi-modal fusion of different timescale features, in: Proceedings of the IEEE International Workshop on Multimedia Signal Processing (MMSP), pages 48-51, 2007 |
|
| , and , Real-time monitoring of participants' interaction in a meeting using audio-visual sensors, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 685-688, 2007 |
|
| , , , and , Rigid head motion in expressive speech animation: Analysis and synthesis (2007), in: IEEE Transactions on Audio, Speech, and Language Processing, 15:3(1075-1086) |
|
| , and , Robust speaker clustering strategies to data source variation for improved speaker diarization, in: Proceedings of the IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop, pages 262-267, 2007 |
|
| and , Robust speaker identification based on selective use of feature vectors (2007), in: Pattern Recognition Letters, 28:1(85-89) |
|
| and , Robust speech rate estimation for spontaneous speech (2007), in: IEEE Transactions on Audio, Speech, and Language Processing, 15:8(2190-2201) |
|
| , , and , Statistical modeling and retrieval of polyphonic music, in: Proceedings of the IEEE International Workshop on Multimedia Signal Processing (MMSP), pages 405-409, 2007 |
|
| , and , Support vector regression for automatic recognition of spontaneous emotions in speech, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 1085-1088, 2007 |
|
| and , Universal consistency of data-driven partitions for divergence estimation, in: Proceedings of the IEEE International Symposium on Information Theory (ISIT), pages 2021-2025, 2007 |
|
| , and , Using neutral speech models for emotional speech analysis, in: Proceedings of InterSpeech, pages 2225-2228, 2007 |
|
2006
| , and , "Yeah right": Sarcasm recognition for spoken dialogue systems, in: Proceedings of InterSpeech, pages 1838-1841, 2006 |
|
| , , , and , A dictionary based approach for robust and syllable-independent audio input transcription for query by humming systems, in: Proceedings of the Audio and Music Computing for Multimedia (AMCMM) Workshop, pages 37-44, 2006 |
|
| , and , A split lexicon approach for improved recognition of spoken names (2006), in: Speech Communication, 48:9(1126-1136) |
|
| , , , and , A study of emotional speech articulation using a fast magnetic resonance imaging technique, in: Proceedings of InterSpeech, 2006 |
|
| , and , Acoustic analysis and automatic recognition of spontaneous children's speech, in: Proceedings of InterSpeech, 2006 |
|
| , and , Acoustic-syntactic maximum entropy model for automatic prosody labeling, in: Proceedings of the IEEE/ACL 2006 Workshop on Spoken Language Technology, pages 74-77, 2006 |
|
| and , An attribute-based approach to audio description applied to segmenting vocal sections in popular music songs, in: Proceedings of the International Workshop on Multimedia Signal Processing (MMSP), pages 103-107, 2006 |
|
| , , and , An English-Persian automatic speech translator: Recent developments in domain portability and user modeling, in: Proceedings of the International Conference on Intelligent Systems and Computing (ISYC), 2006 |
|
| , and , An exploratory study of emotional speech production using functional data analysis techniques, in: Proceedings of the International Seminar on Speech Production (ISSP), pages 11-17, 2006 |
|
| and , Analysis of disfluent repetitions in spontaneous speech recognition, in: Proceedings of the European Signal Processing Conference (EUSIPCO), 2006 |
|
| , , and , Analyzing children's speech: An acoustic study of consonants and consonant-vowel transition, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 393-396, 2006 |
|
| , , , , , and , Automatic detection of voice onset time contrasts for use in pronunciation assessment, in: Proceedings of InterSpeech, 2006 |
|
| and , Average divergence distance as a statistical discrimination measure for hidden Markov models (2006), in: IEEE Transactions on Audio, Speech, and Language Processing, 14:3(890-906) |
|
| and , Combining acoustic, lexical, and syntactic evidence for automatic unsupervised prosody labeling, in: Proceedings of InterSpeech, pages 297-300, 2006 |
|
| , , and , Combining categorical and primitives-based emotion recognition, in: Proceedings of the European Signal Processing Conference (EUSIPCO), 2006 |
|
| , and , Content analysis for acoustic environment classification in mobile robots, in: Proceedings of the Association for the Advancement of Artificial Intelligence (AAAI) Fall Symposium, 2006 |
|
| , and , Cross-lingual dialog model for speech to speech translation, in: Proceedings of InterSpeech, 2006 |
|
| and , Detection of non-native named entities using prosodic features for improved speech recognition and translation, in: Proceedings of the International Speech Communication Association (ISCA) Multiling Workshop, 2006 |
|
| , and , Efficient rotation invariant retrieval of shapes using dynamic time warping with applications in medical databases, in: Proceedings of the IEEE International Symposium on Computer-Based Medical Systems (CBMS), pages 673-678, 2006 |
|
| , and , Efficient scalable encoding for distributed speech recognition (2006), in: Speech Communication, 48:8(888-902) |
|
| , , , , and , Expressive facial animation synthesis by learning speech coarticulation and expression spaces (2006), in: IEEE Transactions on Visualization and Computer Graphics, 12:6(1523-1534) |
|
| and , Interplay between linguistic and affective goals in facial expression during emotional utterances, in: Proceedings of the International Seminar on Speech Production (ISSP), pages 549-556, 2006 |
|
| , and , Modeling emotion expression and perception behavior in auditive emotion evaluation, in: Proceedings of the International Conference on Speech Prosody, pages 9-12, 2006 |
|
| , , and , Not all errors are created equal: Pedagogical contextualization of language learner speech errors, in: Proceedings of the Computer Assisted Language Instruction Consortium (CALICO), 2006 |
| , and , Pathological voice assessment, in: Proceedings of the IEEE Engineering in Medicine and Biology Society (EMBS) Annual International Conference, 2006 |
|
| , , , , , and , Pronunciation verification of children's speech for automatic literacy assessment, in: Proceedings of InterSpeech, 2006 |
|
| , , , , , and , Radiobot-CFF: A spoken dialogue system for military training, in: Proceedings of InterSpeech, 2006 |
|
| , , and , Robust recognition and assessment of non-native speech variability, in: Proceedings of the International Conference on Intelligent Systems and Computing (ISYC), 2006 |
|
| , and , Selecting relevant text subsets from web-data for building topic specific language models, in: Proceedings of the Human Language Technologies (HLT) Conference, pages 145-148, 2006 |
|
| , , , , and , Semi-automatic processing of real-time MR image sequences for speech production studies, in: Proceedings of the International Seminar on Speech Production (ISSP), pages 427-434, 2006 |
|
| , and , Smooth GMM based multi-talker spectral conversion for spectrally degraded speech, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 141-144, 2006 |
|
| , and , Speaker and listener variations in emotion assessment, in: Proceedings of the German Annual Meeting of Acoustics (DAGA), pages 335-336, 2006 |
|
| , , , , , , , , , , , , , , , , and , Speech recognition engineering issues in speech to speech translation system design for low resource languages and domains, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2006 |
|
| , , and , Synchronized and noise-robust audio recordings during realtime magnetic resonance imaging scans (2006), in: Journal of the Acoustical Society of America, 120:4(1791-1794) |
|
| , and , Text data acquisition for domain-specific language models, in: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 382-389, 2006 |
|
| , , , , and , Text-independent voice conversion based on unit selection, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 81-84, 2006 |
|
| and , Upper bound Kullback-Leibler divergence for hidden Markov models with application as discrimination measure for speech recognition, in: Proceedings of the IEEE International Symposium on Information Theory (ISIT), pages 2299-2303, 2006 |
|
| , and , User modeling in a speech translation driven mediated interaction setting, in: Proceedings of the International Workshop on Human-Centered Multimedia (HCM), pages 75-80, 2006 |
|
| , and , Using model trees for evaluating dialog error conditions based on acoustic speech information, in: Proceedings of the International Workshop on Human-Centered Multimedia (HCM), 2006 |
|
| and , Vector-based representation and clustering of audio using onomatopoeia words, in: Proceedings of the Association for the Advancement of Artificial Intelligence (AAAI) Fall Symposium, 2006 |
|
| , , and , Where am I? Scene recognition for mobile robots using audio features, in: Proceedings of the IEEE International Conference on Multimedia & Expo (ICME), pages 885-888, 2006 |
|
2005
| , and , Adaptive categorical understanding for spoken dialog systems (2005), in: IEEE Transactions on Speech and Audio Processing, 13:3(321-329) |
|
| , , and , An articulatory study of emotional speech production, in: Proceedings of InterSpeech, pages 497-500, 2005 |
|
| and , An automatic prosody recognizer using a coupled multi-stream acoustic model and a syntactic-prosodic language model, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 269-272, 2005 |
|
| and , An unsupervised quantitative measure for word prominence in spontaneous speech, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 377-380, 2005 |
|
| , and , Automatic diacritization of Arabic transcripts for automatic speech recognition, in: Proceedings of the International Conference on Natural Language Processing (ICON), 2005 |
|
| and , Automatic syllable stress detection using prosodic features for pronunciation evaluation of language learners, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 937-940, 2005 |
|
| , and , Building topic specific language models from webdata using competitive models, in: Proceedings of InterSpeech, pages 1293-1296, 2005 |
|
| , , , and , Creating data resources for designing user-centric frontends for query-by-humming systems (2005), in: ACM Multimedia Systems Journal, Special Issue on Music Information Retrieval, 10:6(475-483) |
|
| , , , and , Detecting politeness and frustration state of a child in a conversational computer game, in: Proceedings of InterSpeech, pages 2209-2212, 2005 |
|
| and , Distributed range difference based target localization in sensor network, in: Proceedings of the Asilomar Conference on Signals, Systems and Computers, pages 205-209, 2005 |
|
| and , Hidden-articulator Markov models for pronunciation evaluation, in: Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pages 174-179, 2005 |
|
| , , , , , and , Investigating the role of phoneme-level modifications in emotional speech resynthesis, in: Proceedings of InterSpeech, pages 801-804, 2005 |
|
| , , and , Modeling and automating detection of errors in Arabic language learner speech, in: Proceedings of InterSpeech, pages 177-180, 2005 |
|
| , and , Multichannel audio synthesis by subband-based spectral conversion and parameter adaptation (2005), in: IEEE Transactions on Speech and Audio Processing, 13:2(263-274) |
|
| , , and , Natural head motion synthesis driven by acoustic prosodic features (2005), in: Journal of Computer Animation and Virtual Worlds, 16:3-4(283-290) |
|
| , , and , Natural head motion synthesis driven by acoustic prosodic features, in: Proceedings of Computer Animation and Social Agents (CASA), 2005 |
| and , Piecewise linear stylization of pitch via wavelet analysis, in: Proceedings of InterSpeech, pages 3277-3280, 2005 |
|
| , , and , Pronunciation variations of Spanish-accented English spoken by young children, in: Proceedings of InterSpeech, pages 749-752, 2005 |
|
| , , , , , , and , Smart room: Participant and speaker localization and identification, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 1117-1120, 2005 |
|
| and , Speech rate estimation via temporal correlation and selected sub-band correlation, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 413-416, 2005 |
|
| , , , , , , , , and , TBALL data collection: The making of a young children's speech corpus, in: Proceedings of InterSpeech, pages 1581-1584, 2005 |
|
| and , Toward detecting emotions in spoken dialogs (2005), in: IEEE Transactions on Speech and Audio Processing, 13:2(293-303) |
|
| , and , Towards parameter-free classification of sound effects in movies, in: Proceedings of the SPIE Optics and Photonics Symposium, 2005 |
|
| , , , , , , and , Transonics: A practical speech-to-speech translator for English-Farsi medical dialogues, in: Proceedings of the International Committee on Computational Linguistics and the Association for Computational Linguistics (COLING/ACL), pages 89-92, 2005 |
|
| and , Unsupervised speaker indexing using generic models (2005), in: IEEE Transactions on Speech and Audio Processing, 13:5(1004-1013) |
|
| , , , , , , , , , , , , , and , Virtual humans for non-team interaction training, in: Proceedings of the SIGdial Workshop, 2005 |
|
2004
| , and , A distributed speech recognition system in multi-user environments, in: Proceedings of InterSpeech, pages 2121-2124, 2004 |
|
| and , A multi-pass linear fold algorithm for sentence boundary detection using prosodic cues, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 525-528, 2004 |
|
| , and , A statistical approach to retrieval under user-dependent uncertainty in query-by-humming systems, in: Proceedings of the ACM SIGMM International Workshop on Multimedia Information Retrieval (MIR), pages 113-118, 2004 |
|
| and , A statistical discrimination measure for hidden Markov models based on divergence, in: Proceedings of InterSpeech, pages 657-660, 2004 |
|
| , and , A transcription scheme for languages employing the Arabic script motivated by speech processing applications, in: Proceedings of the International Conference on Computational Linguistics, 2004 |
|
| , and , Adaptive speaker identification with audiovisual cues for movie content analysis (2004), in: Pattern Recognition Letters, 25:7(777-791) |
|
| , , , , , , and , An acoustic study of emotions expressed in speech, in: Proceedings of InterSpeech, pages 2193-2196, 2004 |
|
| , , , and , An approach to real-time magnetic resonance imaging for speech production (2004), in: Journal of the Acoustical Society of America, 115:4(1771-1776) |
|
| , , , , , , , and , Analysis of emotion recognition using facial expressions, speech and multimodal information, in: Proceedings of the International Conference on Multimodal Interfaces, pages 205-211, 2004 |
|
| , , , , , and , Analyzing the interplay between spoken language and gestural cues in conversational child-machine interactions in pre/early literate age group, in: Proceedings of the InSTIL/ICALL Symposium, 2004 |
|
| , , and , Audio-based head motion synthesis for avatar-based telepresence systems, in: Proceedings of the ACM SIGMM Effective Telepresence Workshop (ETP), pages 24-30, ACM Press, 2004 |
|
| , , and , Automatic dynamic expression synthesis for speech animation, in: Proceedings of the IEEE Computer Animation and Social Agents (CASA), pages 267-274, IEEE Press, 2004 |
|
| , , , , and , Constructing emotional speech synthesizers with limited speech database, in: Proceedings of InterSpeech, pages 1185-1188, 2004 |
|
| , and , Content-based movie analysis and indexing based on audiovisual cues (2004), in: IEEE Transactions on Circuits and Systems for Video Technology, 14:8(1073-1085) |
|
| , and , Context dependent statistical augmentation of Persian transcripts, in: Proceedings of InterSpeech, pages 853-856, 2004 |
|
| , , , and , Creation of a doctor-patient dialogue corpus using standardized patients, in: Proceedings of the International Conference on Language Resources and Evaluation (LREC), 2004 |
|
| , , , , , , and , Emotion recognition based on phoneme classes, in: Proceedings of InterSpeech, pages 889-892, 2004 |
|
| , and , Enhanced standard compliant distributed speech recognition (AURORA encoder) using rate allocation, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 485-488, 2004 |
|
| and , Measuring convergence in language model estimation using relative entropy, in: Proceedings of InterSpeech, pages 1057-1060, 2004 |
|
| , , and , Reference marking in children's computer-directed speech: An integrated analysis of discourse and gesture, in: Proceedings of InterSpeech, pages 1841-1844, 2004 |
|
| , and , Robust speech recognition over packet networks: An overview, in: Proceedings of InterSpeech, pages 621-624, 2004 |
|
| , and , Speaker identification using supra-segmental pitch pattern dynamics, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 89-92, 2004 |
|
| and , Speaker model quantization for unsupervised speaker indexing, in: Proceedings of InterSpeech, pages 1517-1520, 2004 |
|
| , and , Synthesizing expressive speech: Overview, challenges and open questions, in: Text to speech synthesis: New paradigms and advances, Prentice Hall, 2004 |
| , , , and , Tactical language detection and modeling of learner speech errors: The case of Arabic tactical language training for American English speakers, in: Proceedings of the InSTIL/ICALL Symposium, 2004 |
|
| , , , , , , and , Tactical language training system: An interim report, in: Proceedings of the Conference on Intelligent Tutoring Systems (ITS), pages 336-345, 2004 |
|
| , , , , and , Tactical language training system: Supporting the rapid acquisition of foreign language and cultural skills, in: Proceedings of the InSTIL/ICALL Symposium, 2004 |
|
| and , Text to speech synthesis: New paradigms and advances, Prentice Hall, 2004 |
| , , , , , , , , , , , , , and , The transonics spoken dialogue translator: An aid for English-Persian doctor-patient interviews, in: Proceedings of the Association for the Advancement of Artificial Intelligence (AAAI) Fall Symposium, pages 97-103, 2004 |
|
| , , , , and , Using cognitive task analysis to facilitate collaboration in development of simulator to accelerate surgical training, in: Proceedings of the Annual Medicine Meets Virtual Reality (MMVR) Conference, pages 114-120, 2004 |
|
2003
| and , A method for on-line speaker indexing using generic reference models, in: Proceedings of InterSpeech, pages 2653-2656, 2003 |
|
| , and , A statistical multidimensional humming transcription using phone level hidden Markov models for query by humming systems, in: Proceedings of the IEEE International Conference on Multimedia & Expo (ICME), pages 61-64, 2003 |
|
| and , A study of generic models for unsupervised on-line speaker indexing, in: Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pages 423-428, 2003 |
|
| , , and , Acoustic analysis of preschool children's speech, in: Proceedings of the International Congress of Phonetic Sciences (ICPhS), 2003 |
|
| , and , Acoustic correlates of user response to error in human-computer dialogues, in: Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pages 215-220, 2003 |
|
| and , An empirical text transformation method for spontaneous speech synthesizers, in: Proceedings of InterSpeech, pages 1221-1224, 2003 |
|
| and , An information-theoretic analysis of developmental changes in speech, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages I-480-I-483, 2003 |
|
| , and , ASCII based transcription systems with the Arabic script: The case of Persian, in: Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2003 |
|
| , and , Audiovisual-based adaptive speaker identification, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 565-568, 2003 |
|
| , , , and , Creating data resources for designing user-centric front-ends for query by humming systems, in: Proceedings of the ACM SIGMM International Workshop on Multimedia Information Retrieval (MIR), pages 475-483, 2003 |
|
| and , Emotion recognition using a data-driven fuzzy inference system, in: Proceedings of InterSpeech, pages 157-160, 2003 |
|
| , and , Improvements in English ASR for the Malach project using syllable-centric models, in: Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pages 129-134, 2003 |
|
| and , Language-adaptive Persian speech recognition, in: Proceedings of InterSpeech, 2003 |
|
| , and , Movie content analysis, indexing and skimming via multimodal information, in: Video Mining, pages 1-33, Kluwer Academic, 2003 |
|
| , and , Multidimensional humming transcription using a statistical approach for query by humming systems, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 385-388, 2003 |
|
| and , Robust recognition of children’s speech (2003), in: IEEE Transactions on Speech and Audio Processing, 11:6(603-616) |
|
| and , Split-lexicon based hierarchical recognition of speech using syllable and word level acoustic units, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 772-775, 2003 |
|
| , and , Towards optimal encoding for classification with applications to distributed speech recognition, in: Proceedings of InterSpeech, 2003 |
|
| , , , , , , , , , , , , and , Transonics: A speech to speech system for English-Persian interactions, in: Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2003 |
|
| , and , Virtual microphones for multichannel audio resynthesis (2003), in: EURASIP Journal on Applied Signal Processing, 10:1(968-979) |
|
2002
| and , A confidence-score based unsupervised MAP adaptation for speech recognition, in: Proceedings of the Asilomar Conference on Signals, Systems and Computers, pages 222-226, 2002 |
|
| , and , A statistical approach to humming recognition, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages IV-4175, 2002 |
|
| , and , A syllable based approach for improved recognition of spoken names, in: Proceedings of the International Speech Communication Association (ISCA) Pronunciation Modeling and Lexicon Adaptation Workshop, pages 1-4, 2002 |
|
| and , A system for automatic recognition of pathological speech, in: Proceedings of the Asilomar Conference on Signals, Systems and Computers, 2002 |
|
| , and , An HMM-based approach to humming transcription, in: Proceedings of the IEEE International Conference on Multimedia & Expo (ICME), pages 337-340, 2002 |
|
| , , , and , Analysis of user behavior under error conditions in spoken dialogs, in: Proceedings of InterSpeech, pages 2069-2072, 2002 |
|
| , and , Classifying emotions in human-machine spoken dialogs, in: Proceedings of the IEEE International Conference on Multimedia & Expo (ICME), pages 737-740, 2002 |
|
| , and , Collaborative classification applications in sensor networks, in: Proceedings of the IEEE Sensor Array and Multichannel Signal Processing (SAM) Workshop, pages 370-374, 2002 |
|
| , and , Combining acoustic and language information for emotion recognition, in: Proceedings of InterSpeech, 2002 |
|
| , and , Comparison of dictionary-based approaches to automatic repeating melody extraction, in: Proceedings of Electronic Imaging (EI) Conference, pages 306-317, 2002 |
|
| and , Creating conversational interfaces for children (2002), in: IEEE Transactions on Speech and Audio Processing, 10:2(65-78) |
|
| and , Distribution detection and tracking in sensor networks, in: Proceedings of the Asilomar Conference on Signals, Systems and Computers, pages 1174-1178, 2002 |
|
| , and , Efficient multichannel audio resynthesis by subband-based spectral conversion, in: Proceedings of the European Signal Processing Conference (EUSIPCO), pages 413-416, 2002 |
|
| , and , Expressive speech synthesis using a concatenative synthesizer, in: Proceedings of InterSpeech, pages 1265-1268, 2002 |
|
| , and , Feature analysis for automatic detection of pathological speech, in: Proceedings of the IEEE Engineering in Medicine and Biology Society (EMBS) Meeting, pages 182-183, 2002 |
|
| , and , Gaussian mixture model based methods for virtual microphone signal synthesis, in: Proceedings of the Audio Engineering Society (AES) Convention, 2002 |
|
| , and , Identification of speakers in movie dialogs using audiovisual cues, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 2093-2096, 2002 |
|
| , , , and , Limited domain synthesis of expressive military speech for animated characters, in: Proceedings of the IEEE Speech Synthesis Workshop, 2002 |
|
| , and , Maximum likelihood constrained adaptation for multichannel audio synthesis, in: Proceedings of the Asilomar Conference on Signals, Systems and Computers, pages 227-232, 2002 |
|
| , and , Multiresolution spectral conversion for multichannel audio resynthesis, in: Proceedings of the IEEE International Conference on Multimedia & Expo (ICME), pages 273-276, 2002 |
|
| and , Refined speech segmentation for concatenative synthesis, in: Proceedings of InterSpeech, 2002 |
|
| and , Speaker change detection using a new weighted distance measure, in: Proceedings of InterSpeech, pages 16-20, 2002 |
|
| and , Spoken language synthesis: Experiments in synthesis of spontaneous monologues, in: Proceedings of the IEEE Speech Synthesis Workshop, pages 203-206, 2002 |
|
| , Towards modeling user behavior in human-machine interactions: Effect of errors and emotions, in: Proceedings of the ISLE Workshop on Multimodal Dialog Tagging, 2002 |
|
2001
| , and , A dictionary approach to repetitive pattern finding in music, in: Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), pages 281-284, 2001 |
|
| , , , and , Amount of information presented in a complex list: Effects on user performance, in: Proceedings of the Human Language Technology (HLT) Conference, pages 1-6, 2001 |
|
| , and , Automatic main melody extraction from MIDI files with a modified Lempel-Ziv algorithm, in: Proceedings of the International Symposium on Intelligent Multimedia, Video and Speech Processing, pages 9-12, 2001 |
|
| , , and , Automatic movie index generation based on multimodal information, in: Proceedings of the International Symposium on The Convergence of Information Technologies and Communications (ITCom), pages 42-53, 2001 |
|
| , , , , , , , , , , , , , , , , , and , DARPA communicator dialog travel planning systems: The June 2000 data collection, in: Proceedings of InterSpeech, pages 1371-1374, 2001 |
|
| , and , Efficient scalable speech compression for scalable speech recognition, in: Proceedings of InterSpeech, pages 1845-1848, 2001 |
|
| , , , and , Just (all) the facts, ma'am, in: Proceedings of the ACM Conference on Computer-Human Interaction (CHI), pages 133-134, 2001 |
|
| , and , Music indexing with extracted main melody by using modified Lempel-Ziv algorithm, in: Proceedings of the International Symposium on The Convergence of Information Technologies and Communications (ITCom), pages 124-135, 2001 |
|
| , , , and , On the implementation of ASR algorithms for hand-held wireless mobile devices, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 17-20, 2001 |
|
| , , , and , Politeness and frustration language in child-machine interactions, in: Proceedings of InterSpeech, pages 2675-2678, 2001 |
|
| , and , Recognition of negative emotions from the speech signal, in: Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pages 240-243, 2001 |
|
| , and , Use of model transformations for distributed speech recognition, in: Proceedings of the International Speech Communication Association (ISCA) Workshop on Adaptation Methods for Speech Recognition, pages 113-116, 2001 |
|
2000
| , , , , , , and , A spoken dialog system for conference/workshop services, in: Proceedings of InterSpeech, pages 736-739, 2000 |
| , , , and , Acoustic modeling of American English /r/ (2000), in: Journal of the Acoustical Society of America, 108:1(343-356) |
|
| , , , and , Automatic speech recognition for mobile communication devices, in: Proceedings of the IEEE Nordic Signal Processing Symposium (NORSIG), 2000 |
|
| , , , , , and , Effects of dialog initiative and multi-modal presentation strategies on large directory information access, in: Proceedings of InterSpeech, pages 636-639, 2000 |
|
| and , Noise source models for fricative consonants (2000), in: IEEE Transactions on Speech and Audio Processing, 8:2(328-344) |
|
| , , and , Phrasal signatures in articulation, chapter 5, pages 70-87, Cambridge University Press, 2000 |
|
| , , , , , , , , , , and , The AT&T-DARPA communicator mixed-initiative spoken dialog system, in: Proceedings of InterSpeech, pages 122-125, 2000 |
|
| , , , , , , and , Unifying conversational multimedia interfaces for accessing network services across communication devices, in: Proceedings of the IEEE International Conference on Multimedia & Expo (ICME), pages 1-4, 2000 |
|
| and , Web-based monitoring, logging and reporting tools for multiservice, multimodal systems, in: Proceedings of InterSpeech, pages 1041-1044, 2000 |
|
1999
| and , Acoustic modeling of Tamil retroflex liquids, in: Proceedings of the International Congress of Phonetic Sciences (ICPhS), pages 2097-2100, 1999 |
|
| , and , Acoustics of children’s speech: Developmental changes of temporal and spectral parameters (1999), in: Journal of the Acoustical Society of America, 105:3(1455-1468) |
|
| , and , Categorical understanding using statistical N-gram models, in: Proceedings of InterSpeech, pages 2027-2030, 1999 |
|
| , , and , Extending computer telephony and IP telephony standards for voice-enabled services in a multi-modal user interface environment, in: Proceedings of Interactive Dialogue in Multi-Modal Systems (IDS), pages 9-12, 1999 |
| , and , Geometry, kinematics, and acoustics of Tamil liquid consonants (1999), in: Journal of the Acoustical Society of America, 106:4(1993-2007) |
|
| , and , Multimodal systems for children: Building a prototype, in: Proceedings of InterSpeech, pages 1727-1730, 1999 |
|
| , , and , Speech production and perception models and their applications to synthesis, recognition, and coding, in: Speech Processing, Recognition, and Artificial Neural Networks, pages 138-161, Springer-Verlag, 1999 |
|
| , and , Spoken dialog systems: From theory to practice, in: Proceedings of the IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop, 1999 |
1998
| , and , Language model adaptation for spoken language systems, in: Proceedings of InterSpeech, pages 2327-2330, 1998 |
|
| , and , Learning optimal dialogue strategies: A case study of a spoken dialogue agent for email, in: Proceedings of the International Committee on Computational Linguistics and the Association for Computational Linguistics (COLING/ACL), pages 1345-1351, 1998 |
|
| , , , and , Probing the relationship between qualitative and quantitative performance measures for telecommunication services, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 3769-3772, 1998 |
| and , Spoken dialog systems for children, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 197-200, 1998 |
|
| , , , , , , , , , , , and , VPQ: A spoken language interface to large scale directory information, in: Proceedings of InterSpeech, pages 2863-2867, 1998 |
|
1997
| , , and , Acoustic modelling of American English /r/, in: Proceedings of InterSpeech, pages 393-396, 1997 |
| , and , Analysis of children's speech: Duration, pitch and formants, in: Proceedings of InterSpeech, pages 473-476, 1997 |
| , and , Automatic speech recognition for children, in: Proceedings of InterSpeech, pages 2371-2374, 1997 |
| , and , Database management and analysis for spoken dialog systems: Methodology and tools, in: Proceedings of InterSpeech, pages 2199-2202, 1997 |
| , , and , Evaluating spoken dialog systems for telecommunication services, in: Proceedings of InterSpeech, pages 2203-2206, 1997 |
| , and , New results in vowel production: MRI, EPG, and acoustic data, in: Proceedings of InterSpeech, pages 1007-1010, 1997 |
| and , Novel filler acoustic models for connected digit recognition, in: Proceedings of InterSpeech, pages 283-286, 1997 |
| , , and , Phrasal boundaries and articulatory timing, in: Proceedings of the Meeting on Laboratory Phonology, 1997 |
| , , , and , The relationship between qualitative and quantitative service performance measures: Results from universal voiceline trial, in: Proceedings of the Service Infrastructure Performance Symposium, pages 162-169, 1997 |
| , and , Toward articulatory-acoustic models for liquid consonants based on MRI and EPG data. Part I: The laterals (1997), in: Journal of the Acoustical Society of America, 101:2(1064-1077) |
| , and , Toward articulatory-acoustic models for liquid consonants based on MRI and EPG data. Part II: The rhotics (1997), in: Journal of the Acoustical Society of America, 101:2(1078-1089) |
| , and , Unsupervised HMM adaptation based on speech-silence discrimination, in: Proceedings of InterSpeech, pages 2055-2088, 1997 |
1996
| , and , From MRI and acoustic data to articulatory synthesis: A case study of the lateral approximants in American English, in: Proceedings of InterSpeech, pages 793-796, 1996 |
|
| and , Imaging applications in speech production research, in: Proceedings of the Society of Photographic Instrumentation Engineers (SPIE) Medical Imaging, pages 120-131, 1996 |
| , , , and , Liquids in Tamil, in: Proceedings of InterSpeech, pages 797-800, 1996 |
|
| and , Parametric hybrid source models for voiced and voiceless fricative consonants, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 377-380, 1996 |
|
| , and , Prosodic boundary effects in Tamil: An articulatory study, in: Proceedings of the Annual Meeting of the Linguistic Society of America, 1996 |
1995
| and , A nonlinear dynamical systems analysis of fricative consonants (1995), in: Journal of the Acoustical Society of America, 97:4(2511-2524) |
| , and , An articulatory study of fricative consonants using magnetic resonance imaging (1995), in: Journal of the Acoustical Society of America, 98:3(1325-1347) |
|
| , and , An articulatory study of liquid approximants in American English, in: Proceedings of the International Congress of Phonetic Sciences (ICPhS), pages 576-579, 1995 |
| , , and , Speech production and perception models and their applications to synthesis, recognition, and coding, in: Proceedings of the International Symposium on Signals, Systems, and Electronics (ISSSE), pages 367-372, 1995 |
|
1994
| , and , An MRI study of fricative consonants, in: Proceedings of InterSpeech, pages 627-630, 1994 |
| , , and , Fast and efficient motion compensation techniques using subband analysis, in: Proceedings of the IEEE International Conference on Image Processing (ICIP), pages 265-269, 1994 |
1993
| and , Loading effects on Indian musical drums: An acoustic analysis, in: Proceedings of the Material Research Society, 1993 |
| and , Strange attractors and chaotic dynamics in the production of voiced and voiceless fricatives, in: Proceedings of InterSpeech, pages 77-80, 1993 |
1991
| and , Nonlinear filtering and smoothing for noisy alternating renewal process signals, in: Proceedings of the IEEE American Control Conference, pages 225-228, 1991 |
