The Centre for Speech Technology Research, The University of Edinburgh

Unisyn Lexicon Release, version 1.3

The Unisyn lexicon is a master lexicon transcribed in keysymbols, a kind of metaphoneme which allows the encoding of multiple accents of English.

The lexicon is accompanied by a number of Perl scripts which transform the base lexicon, via phonological and allophonic rules and other symbol changes, to produce output transcriptions in different accents. The rules can be applied either to the whole lexicon, producing an accent-specific lexicon, or to running text. Output can be displayed in keysymbols, SAMPA, or IPA.
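
As a rough illustration of the idea of deriving an accent-specific lexicon from a base lexicon, the following is a minimal sketch in Python rather than the Perl of the release. Everything in it is an assumption made for illustration: the symbol set, the toy lexicon entries, the single rule, and the function names are hypothetical and are not the Unisyn keysymbols, rules, or scripts.

```python
# Hypothetical toy symbols; NOT the real Unisyn keysymbol set.
VOWELS = {"a", "e", "i", "o", "u", "@"}

# Hypothetical base lexicon: word -> list of symbols.
BASE_LEXICON = {
    "car": ["k", "a", "r"],
    "carry": ["k", "a", "r", "i"],
}


def drop_r_before_non_vowel(symbols):
    """Toy rule: delete 'r' unless the next symbol is a vowel."""
    out = []
    for i, sym in enumerate(symbols):
        nxt = symbols[i + 1] if i + 1 < len(symbols) else None
        if sym == "r" and nxt not in VOWELS:
            continue  # 'r' is word-final or precedes a consonant: drop it
        out.append(sym)
    return out


def apply_rules(symbols, rules):
    """Apply each rule in order; each rule maps a symbol list to a symbol list."""
    for rule in rules:
        symbols = rule(symbols)
    return symbols


def accent_lexicon(base, rules):
    """Build an accent-specific lexicon by applying the rules to every entry."""
    return {word: apply_rules(trans, rules) for word, trans in base.items()}


if __name__ == "__main__":
    for word, trans in accent_lexicon(BASE_LEXICON, [drop_r_before_non_vowel]).items():
        print(word, " ".join(trans))
```

In this sketch a rule is simply a function from one symbol list to another, so the same rule list could be applied to a whole lexicon or to the words of a running text one at a time.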

The system uses a geographically-based accent hierarchy, with a tree structure describing countries, regions, towns and speakers; this hierarchy is used to specify the application of rules and other pronunciation features.
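
One way to picture such a hierarchy is as a tree in which each node may override settings of its parent. The sketch below, in Python, assumes that a node without its own value for a setting takes it from the nearest ancestor that has one; that inheritance behaviour, the class name `AccentNode`, the node names, and the setting name `toy_rule` are all hypothetical and are not taken from the Unisyn scripts.

```python
class AccentNode:
    """A node in a hypothetical accent tree (country, region, town or speaker)."""

    def __init__(self, name, parent=None, settings=None):
        self.name = name
        self.parent = parent
        # Only the settings that this node overrides are stored here.
        self.settings = dict(settings or {})

    def lookup(self, key, default=None):
        """Return the value from the nearest node (self or ancestor) that sets key."""
        node = self
        while node is not None:
            if key in node.settings:
                return node.settings[key]
            node = node.parent
        return default

    def path(self):
        """Return the node names from the root of the tree down to this node."""
        names = []
        node = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))


# Placeholder names only; these do not correspond to accents in the release.
country = AccentNode("country", settings={"toy_rule": True})
region = AccentNode("region", parent=country)
town = AccentNode("town", parent=region, settings={"toy_rule": False})
speaker = AccentNode("speaker", parent=town)

if __name__ == "__main__":
    print(speaker.path())
    print(region.lookup("toy_rule"), speaker.lookup("toy_rule"))
```

Under these assumptions, switching a rule off for one town affects that town and its speakers while the rest of the region keeps the country-level setting, and a new accent or speaker is added by creating a new node under an existing parent.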

The lexicon system is customisable, and the documentation explains how to modify the output by switching rules on and off, adding new rules, or editing existing ones. The user can also add new nodes to the accent hierarchy (new accents, or new speakers within an accent) and add new symbols.

A number of UK, US, Australian and New Zealand accents are included in the release.

The scripts run under Unix or Windows 98 (DOS) and use Perl 5.6.0.


The latest release is version 1.3. The documentation has not been updated since version 1.1.

The Unisyn lexicon is currently released under a license for non-commercial use only. To download the lexicon, first read and accept the license.

Contact for more details.