Outline
This course on Natural Language Processing (NLP) covers the models, formalisms
and algorithms used to develop systems that process text, for both analysis
and generation. Techniques include traditional grammar-based methods and the
more recent statistical/corpus-based methods.
Topics
1 Introduction
- Applications of NLP techniques (machine translation, grammar checkers,
dictation, document generation, natural-language interfaces)
- The different analysis levels used for NLP (morpho-lexical, syntactic,
semantic, pragmatic)
- Markup (TEI, Unicode)
- Finite state automata
- Recursive and augmented transition networks
2 Lexical level
- Error-tolerant lexical processing (spelling error correction)
- Transducers for the design of morphological analyzers
- Features
- Towards syntax: Part-of-speech tagging (Brill, HMM)
- Efficient representations for linguistic resources (lexica, grammars, ...):
tries and finite-state automata
3 Syntactic level
- Grammars (e.g. Formal/Chomsky hierarchy, DCGs, systemic, case,
unification, stochastic)
- Parsing (top-down, bottom-up, chart (Earley algorithm), CYK algorithm)
- Automated estimation of probabilistic model parameters (inside-outside
algorithm)
- Data Oriented Parsing
4 Semantic level
- Logical forms
- Ambiguity resolution
- Semantic networks and parsers
- Procedural semantics
- Montague semantics
- Vector Space approaches
- Distributional Semantics
5 Pragmatic level
- Knowledge representation
- Reasoning
- Plan/goal recognition
- Speech acts/intentions
- Belief models
- Discourse
- Reference
6 Natural language generation
- Content determination
- Sentence planning
- Surface realisation
7 Other approaches
- Statistical/corpus-based NLP
- Connectionist NLP
Prerequisites
The course is designed to be self-contained. However, some prior experience
with probability and with programming concepts such as abstract data types
or computational complexity will help in quickly understanding the formal
parts.
Bibliography
In paper form
- James Allen, Natural Language Understanding, 2nd edition,
Benjamin/Cummings, 1995.
- G. Gazdar and C. Mellish, Natural Language Processing in Prolog,
Addison-Wesley, 1989.
- Eugene Charniak, Statistical Language Learning, MIT Press.
- Harry Bunt and Masaru Tomita (eds.), Recent Advances in Parsing
Technology, Kluwer, 1996.
- Emmanuel Roche and Yves Schabes (eds.), Finite-State Language
Processing, MIT Press, 1997.
- S. Young and G. Bloothooft, Corpus-based Methods in Language and
Speech Processing, Kluwer, 1997.
- Krenn & Samuelsson, compendium on statistical approaches in computational
linguistics. (Warning! This downloads a PostScript file.)
On the web