IEMOCAP Database
The Interactive Emotional Dyadic Motion Capture (IEMOCAP) database is an acted, multimodal, and multispeaker database collected at the SAIL lab at USC. It contains approximately 12 hours of audiovisual data, including video, speech, facial motion capture, and text transcriptions. (Read more…)
MICA Text Corpus
The MICA Text Corpus is now available for download. (Read more…)
EMA Database
The Electromagnetic Articulography (EMA) database contains a total of 680 utterances spoken in four different target emotions: anger, happiness, sadness, and neutrality. (Read more…)
MRI-TIMIT Database
MRI-TIMIT is a large-scale database of synchronized audio and real-time magnetic resonance imaging (rtMRI) data for speech research. The database currently consists of midsagittal upper airway MRI data and phonetically transcribed companion audio, acquired from two male and two female speakers of American English. (Read more…)
EMO-MRI Database
USC-EMO-MRI is an emotional speech production database which includes real-time magnetic resonance imaging data with synchronized speech audio from five male and five female actors. (Read more…)
USC-TIMIT Database
USC-TIMIT is a database of speech production data under ongoing development, which currently includes real-time magnetic resonance imaging data from five male and five female speakers of American English, and electromagnetic articulography data from three of these speakers. (Read more…)
CreativeIT Database
The CreativeIT database is an acted and multimodal database of dyadic theatrical improvisations. It contains 8 sessions of audiovisual data, including video, speech, and full-body motion capture data. (Read more…)
VAM Database
The Vera am Mittag (VAM) database is a German audio-visual speech database recorded from a TV talk show. Its main corpus contains almost 1,000 labelled audio samples of spontaneous, unscripted emotional expressions. (Read more…)
The Tracking IndividuaL pErformance with Sensors 2018 (TILES-2018) dataset contains physiologic, behavioral, and survey data collected from more than 212 participants over a 10-week period. The participants worked in a highly demanding work environment, wore sensors, and answered daily surveys throughout this period. (Read more…)
75-Speaker Speech MRI Database
This dataset offers a unique corpus of 1) 2D sagittal-view real-time MRI including raw and reconstructed data, along with synchronized audio, 2) 3D volumetric MRI during sustained speech sounds, and 3) high-resolution static T2-weighted MRI, from 75 speakers producing a variety of linguistically motivated speech tasks. (Read more…)