

22 Other synthesis methods

Festival supports a number of other synthesis systems.

22.1 LPC diphone synthesizer

A very simple and very efficient LPC diphone synthesizer using the "donovan" diphones is also supported. This synthesis method is primarily the work of Steve Isard and, later, Alistair Conkie. The synthesis quality is not as good as that of the residual excited LPC diphone synthesizer, but it has the advantage of being much smaller: the donovan diphone database is under 800k.

The diphones are loaded through the Donovan_Init function, which takes the names of the dictionary file and the diphone file as arguments. For details, see the following file:

lib/voices/english/don_diphone/festvox/don_diphone.scm

22.2 MBROLA

As an example of how Festival may use a completely external synthesis method, we support the free system MBROLA. MBROLA is both a diphone synthesis technique and an actual system that constructs waveforms from segment, duration and F0 target information. For details, see the MBROLA home page at http://tcts.fpms.ac.be/synthesis/mbrola.html. MBROLA already supports a number of diphone sets, including French, Spanish, German and Romanian.

Festival support for MBROLA is in the file `lib/mbrola.scm' and is written entirely in Scheme. The function MBROLA_Synth is called when the parameter Synth_Method is set to MBROLA. The function simply saves the segment, duration and target information from the utterance, calls the external `mbrola' program with the selected diphone database, and loads the generated waveform back into the utterance.

An MBROLA-ized version of the Roger diphone set is available from the MBROLA site. The Festival end of this voice is simple and is distributed as part of the system in `festvox_en1.tar.gz'. The following variables are used by the process:

mbrola_progname
the pathname of the mbrola executable.
mbrola_database
the name of the database to use. This variable is switched between different speakers.
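
As a minimal sketch of how these settings fit together, the two variables and the Synth_Method parameter described above might be set by hand as follows. Both pathnames are hypothetical placeholders chosen for illustration, not locations shipped with Festival, and a voice definition would normally make these settings itself when the voice is selected.

```scheme
;; Sketch only: both pathnames below are assumed, not Festival defaults.
;; Pathname of the external mbrola executable.
(set! mbrola_progname "/usr/local/bin/mbrola")

;; Diphone database to use; change this to switch between speakers.
(set! mbrola_database "/usr/local/share/mbrola/en1/en1")

;; Ask Festival to generate waveforms through MBROLA_Synth.
(Parameter.set 'Synth_Method 'MBROLA)
```

With settings of this kind in place, the external `mbrola' program is called for each utterance that is synthesized, as described above.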

22.3 Synthesizers in development

In addition to the above synthesizers, Festival also supports CSTR's older PSOLA synthesizer, written by Paul Taylor. However, because the newer diphone synthesizer produces output of similar quality and is a newer (and hence cleaner) implementation, further development of the older module is unlikely.

An experimental unit selection synthesis module is included in `modules/clunits/'; it is an implementation of black97c. It is included for people wishing to continue research in the area rather than as a fully usable waveform synthesis engine. Although it sometimes gives excellent results, it also sometimes gives amazingly bad ones. We include it as an example of one possible framework for selection-based synthesis.

As one of our funded projects is specifically to develop new selection-based synthesis algorithms, we expect to include more models in later versions of the system.

Also, now that Festival has been released, other groups are working on new synthesis techniques in the system. Many of these will become available, and where possible we will give pointers to them from the Festival home page. In particular, there is an alternative residual excited LPC module implemented at the Center for Spoken Language Understanding (CSLU) at the Oregon Graduate Institute (OGI).

