Natural Language Processing

Outline

The Natural Language Processing (NLP) content covers models, formalisms and algorithms that can be used to develop systems for processing text, both for analysis and for generation. Techniques include traditional grammar-based methods and the more recent statistical/corpus-based methods.

Topics

1 Introduction

  • Applications of NLP techniques (MT, grammar checkers, dictation, document generation, NL interfaces)
  • The different analysis levels used for NLP (morpho-lexical, syntactic, semantic, pragmatic)
  • Markup (TEI, Unicode)
  • Finite state automata (a toy automaton is sketched just after this list)
  • Recursive and augmented transition networks
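
As a small illustration of the finite-state machinery listed above, the following sketch implements a deterministic automaton that accepts strings over {a, b} ending in "ab". The states, alphabet and accepting condition are invented for the example:

    # Illustrative deterministic finite state automaton: accepts strings
    # over {a, b} that end in "ab". States and alphabet are toy choices.
    TRANSITIONS = {
        ("q0", "a"): "q1",
        ("q0", "b"): "q0",
        ("q1", "a"): "q1",
        ("q1", "b"): "q2",
        ("q2", "a"): "q1",
        ("q2", "b"): "q0",
    }
    START, ACCEPTING = "q0", {"q2"}

    def accepts(string: str) -> bool:
        """Run the DFA over the input; reject on undefined transitions."""
        state = START
        for symbol in string:
            state = TRANSITIONS.get((state, symbol))
            if state is None:
                return False
        return state in ACCEPTING

    print(accepts("aab"))   # True
    print(accepts("aba"))   # False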

2 Lexical level

  • Error-tolerant lexical processing (spelling error correction)
  • Transducers for the design of morphological analyzers
  • Features
  • Towards syntax: Part-of-speech tagging (Brill, HMM); a toy HMM tagging sketch follows this list
  • Efficient representations for linguistic resources (lexica, grammars,...): tries and finite-state automata
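
To make the HMM tagging idea concrete, here is a toy sketch decoded with the Viterbi algorithm. The tag set, transition and emission probabilities, and the example sentence are invented for illustration; a real tagger estimates these quantities from an annotated corpus (the Brill tagger, by contrast, learns transformation rules):

    # Toy HMM part-of-speech tagger decoded with the Viterbi algorithm.
    # All probabilities below are invented for illustration.
    import math

    TAGS = ["DET", "NOUN", "VERB"]
    START_P = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
    TRANS_P = {"DET":  {"DET": 0.05, "NOUN": 0.90, "VERB": 0.05},
               "NOUN": {"DET": 0.10, "NOUN": 0.30, "VERB": 0.60},
               "VERB": {"DET": 0.50, "NOUN": 0.40, "VERB": 0.10}}
    EMIT_P = {"DET":  {"the": 0.9, "a": 0.1},
              "NOUN": {"dog": 0.5, "walk": 0.2, "park": 0.3},
              "VERB": {"walks": 0.6, "walk": 0.4}}

    def logp(p):
        return math.log(p) if p > 0 else float("-inf")

    def viterbi(words):
        """Return the most probable tag sequence under the toy HMM."""
        # scores[t]: best log-probability of a tag path ending in tag t
        scores = {t: logp(START_P[t]) + logp(EMIT_P[t].get(words[0], 0.0))
                  for t in TAGS}
        backptrs = []
        for word in words[1:]:
            new_scores, ptrs = {}, {}
            for t in TAGS:
                prev = max(TAGS, key=lambda s: scores[s] + logp(TRANS_P[s][t]))
                ptrs[t] = prev
                new_scores[t] = (scores[prev] + logp(TRANS_P[prev][t])
                                 + logp(EMIT_P[t].get(word, 0.0)))
            backptrs.append(ptrs)
            scores = new_scores
        # follow the back-pointers from the best final tag
        tag = max(TAGS, key=scores.get)
        path = [tag]
        for ptrs in reversed(backptrs):
            tag = ptrs[tag]
            path.append(tag)
        return list(reversed(path))

    print(viterbi(["the", "dog", "walks"]))   # ['DET', 'NOUN', 'VERB']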

3 Syntactic level

  • Grammars (e.g. Formal/Chomsky hierarchy, DCGs, systemic, case, unification, stochastic)
  • Parsing (top-down, bottom-up, chart (Earley algorithm), CYK algorithm); a CYK recognition sketch follows this list
  • Automated estimation of probabilistic model parameters (inside-outside algorithm)
  • Data Oriented Parsing
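
The CYK algorithm named above can be illustrated with a few lines of table-filling code. The grammar below is a toy grammar in Chomsky normal form chosen only for the example:

    # CYK recognition for a toy grammar in Chomsky normal form.
    # Rules A -> B C, indexed by the right-hand-side pair
    BINARY_RULES = {
        ("NP", "VP"): {"S"},
        ("DET", "N"): {"NP"},
        ("V", "NP"):  {"VP"},
    }
    # Rules A -> terminal
    LEXICAL_RULES = {
        "the": {"DET"},
        "dog": {"N"},
        "cat": {"N"},
        "sees": {"V"},
    }

    def cyk_recognise(words, start="S"):
        """Return True if the words can be derived from the start symbol."""
        n = len(words)
        # table[i][j]: set of non-terminals spanning words[i..j] inclusive
        table = [[set() for _ in range(n)] for _ in range(n)]
        for i, w in enumerate(words):
            table[i][i] = set(LEXICAL_RULES.get(w, set()))
        for span in range(2, n + 1):              # span length
            for i in range(n - span + 1):         # span start
                j = i + span - 1                  # span end (inclusive)
                for k in range(i, j):             # split point
                    for b in table[i][k]:
                        for c in table[k + 1][j]:
                            table[i][j] |= BINARY_RULES.get((b, c), set())
        return start in table[0][n - 1]

    print(cyk_recognise("the dog sees the cat".split()))  # True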

4 Semantic level

  • Logical forms
  • Ambiguity resolution
  • Semantic networks and parsers
  • Procedural semantics
  • Montague semantics
  • Vector space approaches
  • Distributional semantics (a small co-occurrence sketch follows this list)
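
As a minimal illustration of the vector-space and distributional ideas, the sketch below builds co-occurrence count vectors from a three-sentence toy corpus and compares words with cosine similarity; the corpus, window size and similarity measure are illustrative choices only:

    # Distributional sketch: co-occurrence count vectors + cosine similarity.
    import math
    from collections import Counter, defaultdict

    CORPUS = [
        "the dog barked at the cat",
        "the cat chased the dog",
        "the dog chased the ball",
    ]

    def cooccurrence_vectors(sentences, window=2):
        """For every word, count the words within +/- window positions."""
        vectors = defaultdict(Counter)
        for sentence in sentences:
            tokens = sentence.split()
            for i, word in enumerate(tokens):
                lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
                for j in range(lo, hi):
                    if j != i:
                        vectors[word][tokens[j]] += 1
        return vectors

    def cosine(u, v):
        """Cosine similarity between two sparse count vectors."""
        dot = sum(u[k] * v[k] for k in u if k in v)
        norm = (math.sqrt(sum(x * x for x in u.values()))
                * math.sqrt(sum(x * x for x in v.values())))
        return dot / norm if norm else 0.0

    vecs = cooccurrence_vectors(CORPUS)
    print(cosine(vecs["dog"], vecs["cat"]))    # words with similar contexts
    print(cosine(vecs["dog"], vecs["ball"]))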

5 Pragmatic level

  • Knowledge representation
  • Reasoning
  • Plan/goal recognition
  • Speech acts/intentions
  • Belief models
  • Discourse
  • Reference

6 Natural language generation

  • Content determination
  • Sentence planning
  • Surface realisation (the three stages are sketched below)
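
The three stages above can be caricatured as a template-based pipeline. The weather-report domain and the templates are invented purely to show how content determination, sentence planning and surface realisation hand data to one another:

    # Toy generation pipeline: content determination -> sentence planning
    # -> surface realisation. Domain and templates are invented examples.
    def determine_content(data):
        """Pick which facts to report (content determination)."""
        facts = []
        if data["temperature"] > 25:
            facts.append(("temperature", data["temperature"]))
        if data["rain_mm"] > 0:
            facts.append(("rain", data["rain_mm"]))
        return facts

    def plan_sentences(facts):
        """Group the selected facts into abstract sentence plans."""
        return [{"predicate": kind, "amount": amount} for kind, amount in facts]

    def realise(plan):
        """Map an abstract sentence plan to a surface string."""
        templates = {
            "temperature": "It will be hot, around {amount} degrees.",
            "rain": "Expect some rain ({amount} mm).",
        }
        return templates[plan["predicate"]].format(amount=plan["amount"])

    data = {"temperature": 29, "rain_mm": 3}
    report = " ".join(realise(p) for p in plan_sentences(determine_content(data)))
    print(report)  # It will be hot, around 29 degrees. Expect some rain (3 mm).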

7 Other approaches

  • Statistical/corpus-based NLP (a toy bigram estimate follows this list)
  • Connectionist NLP
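
As a tiny example of the corpus-based style, the following sketch estimates maximum-likelihood bigram probabilities from an invented miniature corpus; real systems use large corpora and smoothing:

    # Maximum-likelihood bigram probabilities from a toy corpus.
    from collections import Counter

    corpus = "the dog barks . the dog sleeps . the cat sleeps .".split()

    bigrams = Counter(zip(corpus, corpus[1:]))
    unigrams = Counter(corpus[:-1])

    def p(next_word, prev_word):
        """P(next_word | prev_word) by relative frequency."""
        return bigrams[(prev_word, next_word)] / unigrams[prev_word]

    print(p("dog", "the"))   # 2/3: "the" is followed by "dog" twice, "cat" once
    print(p("cat", "the"))   # 1/3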

Prerequisites

The course is designed to be self-sufficient. However, some previous exposure to probability and to programming concepts such as abstract data types or computational complexity can help in quickly understanding the formal parts.


Bibliography

In paper form

  • James Allen, Natural Language Understanding, Benjamin/Cummings, 2nd edition, 1995.
  • Gerald Gazdar and Chris Mellish, Natural Language Processing in Prolog, Addison-Wesley, 1989.
  • Eugene Charniak, Statistical Language Learning, MIT Press, 1993.
  • Harry Bunt and Masaru Tomita (eds.), Recent Advances in Parsing Technology, Kluwer, 1996.
  • Emmanuel Roche and Yves Schabes (eds.), Finite State Language Processing, MIT Press, 1997.
  • S. Young and G. Bloothooft, Corpus-based Methods in Language and Speech Processing, Kluwer, 1997.
  • Krenn and Samuelsson, compendium on statistical approaches in computational linguistics (distributed online as a PostScript file).

On the web
