
Euromasters summer school 2005

Tutorial 4: Introduction to NLTK and Python

Trevor Cohn, University of Edinburgh, UK, assisted by Yves Peirsman, University of Edinburgh, UK

This tutorial has an optional introductory session followed by the main tutorial. Students have the choice of taking both sessions (1.5 days), or just the main session (1 day), depending on their computing skills.

Introductory session (half day)

This session is intended for students with limited programming experience and is an introduction to the powerful, high-level language Python.

Python is quickly taking over from languages like Perl for the rapid development of many types of systems, including those for speech and language processing. Its syntax is far clearer than Perl's, yet it is more powerful.

A huge range of extension modules is available for Python, including the toolkit NLTK for natural language processing (NLP).

Main tutorial (one day)

The Natural Language Toolkit (NLTK) is a set of libraries and tools written in Python. It provides classes and algorithms for many fundamental operations in NLP and can display linguistic structures using a graphical interface.

This tutorial will give an introduction to this excellent toolkit, which students may find many uses for in their own work.


Screenshots: Chart Parser, Bottom-Up Parser, Top-Down Parser, Plotting Tool



Location of tutorial files

On the Informatics Linux machines ("DICE" machines), for example:

export NLTK_CORPORA=/usr/share/nltk-data

python2.3 /group/cstr/projects/euromasters/tutorial4/nltk_course/demo_drawsr.py

Handouts
Day 1 is for tutorial 4a. Day 2 is for tutorial 4b.