WWW pages of 3rd European Master School on Language and Speech

Recognising Nominalisations

Yuk On Kong
(University of Edinburgh)

Nominalisation is the process of forming a noun from a word of another class. In English, nominalisation from verbs is highly productive and usually involves attaching a suffix. Nominalisation recognition is the identification of nominalisations together with their associated verbs (e.g. "statement" and "state"). It is important for a number of NLP tasks, such as machine translation and information retrieval.
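To make the idea of pairing a nominalisation with its verb concrete, the following is a minimal sketch of a naive suffix-stripping baseline. It is not the method proposed in this project: the suffix list, the function name candidate_verbs and the toy verb lexicon are all assumptions made for illustration.

```python
# Hypothetical list of nominalising suffixes (an assumption, not from the source).
SUFFIXES = ("ation", "ment", "ion", "ance", "ence", "al", "er")


def candidate_verbs(noun, verb_lexicon):
    """Return verbs from verb_lexicon that the noun could be derived from.

    A suffix is stripped from the noun, and both the bare stem and the
    stem with a restored final "e" are looked up in the verb lexicon.
    """
    candidates = []
    for suffix in SUFFIXES:
        # Require a stem of at least three letters to avoid trivial matches.
        if noun.endswith(suffix) and len(noun) > len(suffix) + 2:
            stem = noun[: -len(suffix)]
            for form in (stem, stem + "e"):
                if form in verb_lexicon and form not in candidates:
                    candidates.append(form)
    return candidates


# Toy verb lexicon, for illustration only.
verbs = {"state", "organise", "refuse"}
print(candidate_verbs("statement", verbs))
print(candidate_verbs("nation", verbs))
```

A baseline of this kind overgenerates and undergenerates (it knows nothing about irregular forms or about nouns that merely end in a suffix-like string), which is one reason to consider corpus-based evidence as well.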

Since nominalisation is a productive morphological phenomenon, it is impractical to list all acceptable nominalised forms exhaustively: new words, including new nominalisations, are continually added to any language. Several NLP techniques have been developed to analyse morphology in general. Many existing algorithms rely on manually built rules for morphological structure, while more recent work has focused on machine-learning approaches that induce morphological structure from large corpora, avoiding the labour-intensive process of rule-building. A further approach is the knowledge-free induction of inflectional morphologies (Schone and Jurafsky 2001).

The principal goal of this project is to develop a system that recognises nominalisations together with the verbs from which they are derived. The project will build on the work of Schone and Jurafsky (2001) on acquiring cognates and morphological variants. A number of techniques will be explored, based on morphology and on the syntactic context in which nominalisations appear. I will use the British National Corpus (BNC), with the CELEX lexicon as the gold standard for evaluation.
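Evaluation against a gold standard could be sketched as follows. The abstract does not state which metrics will be used, so precision, recall and F1 over (nominalisation, verb) pairs are an assumption here, and the pairs shown are toy data rather than actual CELEX entries.

```python
def evaluate(predicted_pairs, gold_pairs):
    """Score predicted (nominalisation, verb) pairs against gold pairs.

    Returns (precision, recall, f1). The choice of metrics is an
    assumption; the source only names the gold standard.
    """
    predicted = set(predicted_pairs)
    gold = set(gold_pairs)
    correct = predicted & gold
    precision = len(correct) / len(predicted) if predicted else 0.0
    recall = len(correct) / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data for illustration only (not drawn from CELEX).
gold = {("statement", "state"), ("refusal", "refuse")}
predicted = {("statement", "state"), ("nation", "nate")}
print(evaluate(predicted, gold))
```

Scoring whole pairs rather than nominalisations alone rewards a system only when it identifies both the noun and the correct underlying verb, which matches the task as defined above.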