Welcome to the home of the SAIL Lab's transcribers. This page provides transcription resources and data and documents our progress, achievements, and research interests.
Currently, we are two undergrads and one grad student (Daylen Riggs, Nathan Go, and Abe Kazemzadeh). We bring a broad range of backgrounds and research interests to the task of phonetic transcription of children's speech.
For broad phonetic transcriptions, we currently use "transcriber.tcl". It is a simple program that iterates through a directory of speech recordings and prompts for a transcription of each one in a text box. Pressing "control"+p replays the clip and "enter" submits the transcription, so transcribing goes quickly and your full attention can stay on listening and transcribing.
#!/bin/sh
# the next line restarts using wish \
exec wish "$0" "$@"
set transcriber nobody ;# set your name, Transcriber
set directory test ;# the location of the audio and transcription files
Change "nobody" to your name and "test" to the name of the folder where your data is.
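The iterate-and-prompt loop described above can be illustrated with a minimal sketch. It is written in Python as a standalone illustration and is not the code of transcriber.tcl. The function name, the `prompt` callback standing in for the text box, the ".txt" extension for transcription files, and the choice to skip recordings that already have a transcription file are all assumptions made for the example.

```python
from pathlib import Path
from typing import Callable, Dict


def transcribe_directory(directory: Path,
                         prompt: Callable[[Path], str],
                         suffix: str = ".txt") -> Dict[str, str]:
    """Ask for a transcription of each .wav file in `directory`.

    `prompt` is a hypothetical stand-in for the program's text box.
    `suffix` is an assumed extension for the transcription files.
    Recordings that already have a transcription file are skipped
    (an assumption, not confirmed behaviour of transcriber.tcl).
    """
    done: Dict[str, str] = {}
    for wav in sorted(directory.glob("*.wav")):
        out = wav.with_suffix(suffix)
        if out.exists():
            continue  # already transcribed in an earlier session
        text = prompt(wav).strip()
        # Save the transcription next to the recording it belongs to.
        out.write_text(text + "\n", encoding="utf-8")
        done[wav.name] = text
    return done
```

For interactive use, `prompt` could be as simple as `lambda wav: input(f"{wav.name}: ")`.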
To use, right-click on transcriber.tcl, select "Open with..." and then "choose program", and browse to find tclkit.exe. You'll see the following window:
The count refers to the number of files transcribed since the program was started. The # of files per minute is an average gauge of speed. The play and submit buttons can be used with the mouse, but it is faster to use control-p and return. Please use quit instead of clicking the "x" in the top right corner. For each wav file, the program saves a transcription file in the same directory, and for each transcribing session, a log file containing all the transcriptions is saved to a directory called "logs". If you make a mistake while transcribing, quit the program and delete the most recent transcription file (sort the files in the data folder by date to find it).
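Two of the points above lend themselves to a short sketch: finding the most recent transcription file after a mistake, and computing the files-per-minute average. The Python below is a standalone illustration, not part of transcriber.tcl. The function names and the ".txt" extension are assumptions, so adjust the suffix to whatever your transcription files actually use.

```python
from pathlib import Path
from typing import Optional


def newest_transcription(directory: Path,
                         suffix: str = ".txt") -> Optional[Path]:
    """Return the most recently modified transcription file, or None.

    `suffix` is an assumed extension for transcription files.
    """
    files = list(directory.glob(f"*{suffix}"))
    # Same idea as sorting the data folder by date and taking the top entry.
    return max(files, key=lambda p: p.stat().st_mtime, default=None)


def files_per_minute(count: int, elapsed_seconds: float) -> float:
    """Average number of files transcribed per minute in a session."""
    if elapsed_seconds <= 0:
        return 0.0
    return count * 60.0 / elapsed_seconds
```

The function only reports the file. Deleting it is left to you, so that nothing is removed by accident.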
Wavesurfer is a good all-purpose sound playback/recording/analysis tool. We will be using this for more detailed transcriptions where the segments are mapped to their beginning and end points. For sentences, these will generally be words, and for words, the segments will be phones.
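As an illustration of what mapping segments to their beginning and end points means, here is a minimal Python sketch of a segment record and a consistency check. It does not use or describe Wavesurfer's own file format. The class name, field names, and the example timings are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    label: str    # a word (for sentences) or a phone (for words)
    start: float  # beginning point, in seconds
    end: float    # end point, in seconds


def check_segments(segments: List[Segment]) -> None:
    """Raise ValueError if a segment is empty, reversed, or overlaps the previous one."""
    previous_end = 0.0
    for seg in segments:
        if seg.end <= seg.start:
            raise ValueError(f"{seg.label!r}: end must come after start")
        if seg.start < previous_end:
            raise ValueError(f"{seg.label!r}: overlaps the previous segment")
        previous_end = seg.end


# Made-up timings for a two-word sentence, for illustration only.
sentence = [Segment("the", 0.10, 0.25), Segment("cat", 0.25, 0.70)]
check_segments(sentence)  # passes silently
```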
Wavesurfer is free. It can be downloaded from The Royal Institute of Technology in Stockholm. It's very easy to install: just save the executable program somewhere convenient (Desktop, c:\Program Files, etc.) and click to run it.
See Daylen's sentence transcriptions for an example.
Speech data can be found at this link. Right now we're transcribing sentences and the kindergarten wordlist (KWlist).