- Authors: Alan Wrench, Queen Margaret University College
- Funded by: Engineering and Physical Sciences Research Council:
- When created: November 1999
- Availability: Data for the English speakers are available here free for non-commercial use and may be distributed on CDROM for a fee.
- Purpose: Phonetically balanced dataset for training an automatic speech recognition system
- Description: Overview
- Microphone 16kHz sample rate (audio-technica ATM10a)
- Laryngograph 16kHz sample rate
- Electromagnetic Articulograph 500Hz sample rate (Carstens 10
- upper incisor
- lower incisor
- upper lip
- lower lip
- tongue tip
- tongue blade
- tongue dorsum
- EPG 200Hz sample rate
- SVHS video of front view of mouth area. (Available by special request)
- A set of 460 sentences designed to include the main connected speech processes in English (e.g. assimilations, weak forms, ...).
- Subjects: Two speakers (one male, one female) are currently available; a further 38 are planned for completion by May 2001. The subjects have a variety of accents of English.
- All recordings were made in the same sound-damped studio at the Edinburgh Speech Production Facility. All data were recorded directly to computer and carefully synchronised.
- Languages: English
- Platforms: The data files have headers which retain byte order information.
- Media: Internet (FTP) and possibly CDROM
- SVHS video available by special request.
- Audio and Laryngograph data are stored with 1024-byte ASCII NIST headers. EPG data remain in raw binary (8 bytes per sample). EMA data are stored in Edinburgh Speech Tools Trackfile format, consisting of a variable-length ASCII header followed by a 4-byte float representation per channel. The first channel is a time value in seconds; the second is always 1 (used to indicate whether the sample is present); the subsequent values are the coil 1-5 x-values, followed by the coil 1-5 y-values, the coil 6-10 x-values and finally the coil 6-10 y-values.
- Size: ~200 kBytes per speaker
- Edinburgh Speech Tools is free and contains the routines ch_wave and ch_track, which can be used to convert the waveform and EMA files into other formats such as ESPS waves format, HTK or raw binary, as well as other routines such as pitch tracking.
- MATLAB - a set of macros to read and write the stored data formats is available, along with supplementary routines for the EMATOOLS set.
- Software Documentation:
- Edinburgh Speech Tools
- Software Source/Executables:
- See above.
- Contact: A. Wrench
- Dept. of Speech and Language Sciences, Queen Margaret University College, Clerwood Terrace, Edinburgh. EH12 8TS.
- +44 131 317 3692
- +44 131 317 3689
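
The file layouts described under Media can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not part of the corpus software. All function names are hypothetical. The EMA decoder expects the binary data with the variable-length ASCII header already removed, and takes the byte order as a parameter rather than reading it from the header. The NIST parser assumes the usual "name -type value" header lines terminated by "end_head". The meaning of the individual EPG bits is not described above, so each frame is left as 8 uninterpreted bytes.

```python
import struct

EPG_BYTES_PER_FRAME = 8   # "8 bytes per sample" in the format description
EPG_RATE_HZ = 200         # EPG sample rate given in the overview
EMA_CHANNELS = 22         # time + present flag + 10 coils x 2 axes


def ema_channel_names():
    """Channel order as described: time, present flag, coils 1-5 x,
    coils 1-5 y, coils 6-10 x, coils 6-10 y."""
    names = ["time", "present"]
    for first, last in ((1, 5), (6, 10)):
        for axis in ("x", "y"):
            names += [f"coil{c}_{axis}" for c in range(first, last + 1)]
    return names


def decode_ema_frames(payload, little_endian=True):
    """Decode binary track data (header already removed) into one dict
    per frame. The byte order is an assumption supplied by the caller."""
    fmt = ("<" if little_endian else ">") + f"{EMA_CHANNELS}f"
    size = struct.calcsize(fmt)  # 22 channels * 4-byte floats
    names = ema_channel_names()
    return [dict(zip(names, struct.unpack_from(fmt, payload, off)))
            for off in range(0, len(payload) - size + 1, size)]


def split_epg_frames(raw):
    """Split raw EPG bytes into (time_in_seconds, 8-byte frame) pairs.
    Trailing bytes that do not fill a whole frame are ignored."""
    n = len(raw) // EPG_BYTES_PER_FRAME
    return [(i / EPG_RATE_HZ,
             raw[i * EPG_BYTES_PER_FRAME:(i + 1) * EPG_BYTES_PER_FRAME])
            for i in range(n)]


def parse_nist_header(header):
    """Parse 'name -type value' lines from a 1024-byte ASCII NIST header.
    Assumes -i marks integers, -r reals, anything else a string."""
    fields = {}
    text = header[:1024].decode("ascii", errors="replace")
    for line in text.splitlines():
        parts = line.split(None, 2)
        if parts and parts[0] == "end_head":
            break
        if len(parts) == 3 and parts[1].startswith("-"):
            name, kind, value = parts
            if kind == "-i":
                fields[name] = int(value)
            elif kind == "-r":
                fields[name] = float(value)
            else:
                fields[name] = value.strip()
    return fields
```

With these helpers, the first 1024 bytes of an audio or Laryngograph file would go to parse_nist_header, the whole content of an EPG file to split_epg_frames, and the binary part of an EMA track to decode_ema_frames. For format conversion, the ch_wave and ch_track routines remain the supported route.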