Overview

SailAlign is an open-source software toolkit for robust long speech-text alignment. It implements an adaptive, iterative speech recognition and text alignment scheme that can process very long (and possibly noisy) audio and is robust to transcription errors. It is written mainly as a Perl library, but its functionality also depends on freely available software, namely HTK, SRILM and sclite.
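
To make the text alignment idea concrete, here is a minimal Python sketch. It is not SailAlign code: the toolkit itself is written in Perl and relies on sclite for alignment, whereas this sketch uses Python's standard difflib module instead. The function names find_anchors and unaligned_spans, the min_run threshold and the sample sentences are all hypothetical. The sketch assumes that long runs of words shared by the transcript and a recognizer hypothesis can be treated as reliable anchors, and that the transcript regions between anchors are the ones left for a later iteration.

```python
from difflib import SequenceMatcher


def find_anchors(reference, hypothesis, min_run=3):
    """Return (ref_start, hyp_start, length) for every run of at least
    min_run consecutive words shared by the transcript and the hypothesis.
    (Hypothetical helper, for illustration only.)"""
    matcher = SequenceMatcher(a=reference, b=hypothesis, autojunk=False)
    return [
        (block.a, block.b, block.size)
        for block in matcher.get_matching_blocks()
        if block.size >= min_run
    ]


def unaligned_spans(reference, anchors):
    """Return (start, end) word-index ranges of the transcript that no
    anchor covers. (Hypothetical helper, for illustration only.)"""
    spans, cursor = [], 0
    for ref_start, _, length in anchors:
        if ref_start > cursor:
            spans.append((cursor, ref_start))
        cursor = ref_start + length
    if cursor < len(reference):
        spans.append((cursor, len(reference)))
    return spans


# Made-up transcript and recognizer output, split into words.
reference = "the quick brown fox jumps over the lazy dog near the river bank".split()
hypothesis = "the quick brown fox jumped over a lazy dog near the river".split()

anchors = find_anchors(reference, hypothesis)
gaps = unaligned_spans(reference, anchors)

print(anchors)  # [(0, 0, 4), (7, 7, 5)]
print(gaps)     # [(4, 7), (12, 13)]
```

In this toy example the two anchors cover "the quick brown fox" and "lazy dog near the river", while "jumps over the" and "bank" remain unaligned. Requiring a minimum run length is one simple way to avoid trusting short accidental matches, such as the single word "over" here.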

Author

SailAlign's author is Nassos Katsamanis.

Architecture

A few words about how the toolkit is organized.

Usage

Detailed usage examples are included in the distribution. You may also want to download a tutorial explaining the main usage scenario.

Installing

You may find detailed installation instructions in the README file included in the distribution.

Dependencies

SailAlign does not implement its own speech recognition engine or language modeling algorithms. Instead, I have built interfaces to external, freely available software. Currently, interfaces to the following engines are available:

  • HTK, a commonly used speech recognition engine.
  • SRILM, a toolkit written in C++ that provides various methods for language modeling.
Apart from HTK, precompiled versions of the prerequisite binaries are included with the distribution.

Downloads

To obtain SailAlign, please contact Nassos Katsamanis by email.

Publications

  • A. Katsamanis, M. Black, P. Georgiou, L. Goldstein and S. Narayanan,
    SailAlign: Robust long speech-text alignment,
    in Proc. of Workshop on New Tools and Methods for Very-Large Scale Phonetics Research, Jan. 2011.
    [pdf][ppt]
If you use SailAlign in your research, please cite this paper, which is the most up-to-date reference to SailAlign's functionality.

Related Work

  • Automatic alignment of captions in YouTube.
  • Kishore Prahallad, Alan W. Black, Segmentation of Monologues in Audio Books for Building Synthetic Voices,
    accepted for publication in IEEE Transactions on Audio, Speech and Language Processing.

Release History

The first open-source version of the toolkit was distributed in Jan. 2011.

Licensing

SailAlign is Copyright © 2011 by Nassos Katsamanis. SailAlign is distributed under the GNU General Public License (GPL). If you are interested in alternative licensing options (i.e. Dual Licensing) or consulting help, please contact Nassos Katsamanis by email.

The SRILM binaries, which are included in the distribution for convenience, are licensed under the SRILM Research Community License.

Acknowledgments

Financial support for this software has been partly provided by NSF. This is gratefully acknowledged.

Last Update: Jan 28, 2011
For comments or questions contact Nassos Katsamanis.