University of Southern California

MRI-TIMIT: a Multimodal Real-Time MRI Articulatory Corpus

MRI-TIMIT is a large-scale database of synchronized audio and real-time magnetic resonance imaging (rtMRI) data for speech research. The database currently consists of midsagittal upper airway MRI data and phonetically-transcribed companion audio, acquired from two male and two female speakers of American English.

MRI-TIMIT is now publicly available for research purposes as part of the broader USC-TIMIT database.


ID  Gender  Age  Birthplace
M1  Male    29   Buffalo, NY
M2  Male    33   Ann Arbor, MI
W1  Female  23   Commack, NY
W2  Female  32   Westfield, IA


The same 460-sentence phonetically balanced dataset used in the MOCHA-TIMIT corpus (Wrench 1999) was elicited from each subject.

Articulatory Data

Subjects' upper airways were imaged in the midsagittal plane using a custom real-time MRI protocol (Bresch et al. 2008). MRI data were acquired at Los Angeles County Hospital on a Signa Excite HD 1.5T scanner using a 13-interleaf spiral gradient-echo pulse sequence (TR = 6.164 ms, FOV = 200 × 200 mm, flip angle = 15°). Image resolution: 68 × 68 pixels. Video rate: 23.18 frames/sec.
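A few quantities follow directly from the stated protocol parameters. The sketch below is a back-of-the-envelope check, not part of the corpus distribution: it derives the in-plane pixel size and the time to acquire one fully sampled image, and notes that the published 23.18 frames/sec exceeds the full-image rate, which is consistent with a sliding-window reconstruction that steps by a subset of interleaves per frame (the interpretation of that step size here is an assumption).

```python
# Derived acquisition figures from the stated protocol parameters.
# These are consistency checks, not values taken from the corpus docs.

TR = 6.164e-3        # repetition time per spiral interleaf, seconds
N_INTERLEAVES = 13   # interleaves per fully sampled image
FOV_MM = 200.0       # field of view, mm
PIXELS = 68          # image matrix size (68 x 68)
VIDEO_FPS = 23.18    # published video rate, frames/sec

# In-plane spatial resolution: FOV divided by matrix size.
pixel_mm = FOV_MM / PIXELS                 # ~2.94 mm per pixel

# Time to acquire one fully sampled image (all 13 interleaves).
full_frame_s = N_INTERLEAVES * TR          # ~80.1 ms, i.e. ~12.5 full images/sec

# The published 23.18 fps exceeds 1/full_frame_s, consistent with a
# sliding-window reconstruction that reuses interleaves across frames;
# the implied frame step is about 7 interleaves (an assumption).
frame_step_s = 1.0 / VIDEO_FPS             # ~43.1 ms between video frames
interleaves_per_step = frame_step_s / TR   # ~7.0

print(f"pixel size: {pixel_mm:.2f} mm")
print(f"full-image acquisition time: {full_frame_s * 1e3:.1f} ms")
print(f"interleaves per video frame step: {interleaves_per_step:.1f}")
```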

Acoustic Data

Audio was simultaneously recorded at a sampling frequency of 20 kHz inside the MRI scanner, using a custom noise-cancelling fiber-optic microphone system (Bresch et al. 2006) synchronized with the video signal through the scanner master clock. Time-aligned phonetic transcriptions of all utterances in the database were generated from the audio recordings using the freely available tool SailAlign (Katsamanis et al. 2011).
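Because the audio (20 kHz) and video (23.18 frames/sec) streams are synchronized to a common clock, indices in one stream can be mapped to the other by simple rate conversion. The helper functions below are illustrative only, assuming both streams start at t = 0; they are not part of the corpus tools.

```python
# Mapping between audio samples and rtMRI video frames, assuming the
# two streams share a common start time (as implied by the scanner
# master-clock synchronization). Helper names are hypothetical.

AUDIO_HZ = 20_000    # audio sampling frequency, Hz
VIDEO_FPS = 23.18    # rtMRI video frame rate, frames/sec

def frame_for_sample(sample_idx: int) -> int:
    """Video frame index covering a given audio sample."""
    t = sample_idx / AUDIO_HZ            # time of the sample, seconds
    return int(t * VIDEO_FPS)

def sample_range_for_frame(frame_idx: int) -> tuple[int, int]:
    """Half-open [start, stop) audio sample range for one video frame."""
    start = round(frame_idx / VIDEO_FPS * AUDIO_HZ)
    stop = round((frame_idx + 1) / VIDEO_FPS * AUDIO_HZ)
    return start, stop

print(frame_for_sample(20_000))      # one second of audio -> frame 23
print(sample_range_for_frame(0))     # (0, 863): ~863 samples per frame
```

Each video frame thus spans roughly 863 audio samples (~43.1 ms), which is the granularity at which SailAlign phone boundaries can be related to individual MRI frames.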

Example Utterances