Articulatory Synthesis of French Connected Speech from EMA Data

This page contains audio files related to the paper: Asterios Toutios, Shrikanth Narayanan, Articulatory Synthesis of French Connected Speech from EMA Data, submitted to Interspeech 2013.

Abstract: This paper reports an experiment in synthesizing French connected speech using Maeda's digital simulation of the vocal-tract system. The dynamics of the vocal-tract shape are estimated from the dynamics of Electromagnetic Articulograph (EMA) sensors via Maeda's geometrical articulatory model. Time-varying characteristics of the glottis and the velopharyngeal port are set using empirical rules, while the fundamental frequency pattern is copied from the concurrently recorded audio signal. A subjective experiment was performed online to assess the perceived intelligibility and naturalness of the synthesized speech. Results indicate that a properly driven simulation of the vocal tract has the potential to provide a scientifically grounded alternative for the development of text-to-speech synthesis systems.
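The first step of the pipeline above, estimating articulatory-model parameters from EMA sensor positions, can be sketched as follows. This is a hypothetical, simplified illustration, not the paper's actual procedure: it assumes a purely linear articulatory model (mean shape plus basis vectors), whereas Maeda's geometrical model and the fitting used in the paper are more involved. All dimensions, names, and data below are made up for illustration.

```python
import numpy as np

# Hypothetical linear articulatory model: a sensor configuration x is
# generated as x = mean + basis @ p, where p holds a few articulatory
# parameters. Given observed EMA coordinates x_obs for one time frame,
# the parameters can be estimated by least squares.

rng = np.random.default_rng(0)

n_coords = 6   # e.g. x/y coordinates of 3 EMA sensors (made-up size)
n_params = 4   # number of articulatory model parameters (made-up size)

mean = rng.normal(size=n_coords)               # mean sensor configuration
basis = rng.normal(size=(n_coords, n_params))  # model basis vectors

def estimate_params(x_obs):
    """Least-squares estimate of articulatory parameters for one frame."""
    p, *_ = np.linalg.lstsq(basis, x_obs - mean, rcond=None)
    return p

# Simulate one frame: true parameters -> sensor positions -> estimate.
p_true = rng.normal(size=n_params)
x_obs = mean + basis @ p_true
p_hat = estimate_params(x_obs)
print(np.allclose(p_hat, p_true))  # noise-free case: parameters recovered
```

Applied frame by frame over an EMA recording, such a fit yields a parameter trajectory that can drive a vocal-tract simulation; in practice the estimation must also handle sensor noise and the nonlinearity of the real model.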

The following two audio files contain recorded and synthesized versions of the ten Strasbourg sentences: the recorded versions were captured concurrently with the EMA data (note that there is noise in the recording), and the synthesized versions were produced with the method presented in the paper.

Recorded sentences

Synthesized sentences