INF 617 Natural Language Processing: Reading list

Department: PhD in Computer Science

Module Description: The main aim of the module is to present the newest developments in natural language processing (NLP) based on machine learning (ML) algorithms and techniques. The majority of human knowledge is currently stored as unstructured text. Abstracts, reviews, descriptions, posts, emails, and tweets together form a huge corpus of data that cannot be analyzed manually. Such textual corpora exist in almost all domains of science and technology. Computational methods for text analysis are collectively known as NLP. In recent years we have witnessed a true revolution in NLP, driven by machine learning methods designed specifically to tackle NLP challenges. During the lectures, students will learn basic NLP methods (such as tokenization, lemmatization, and stemming), basic representation methods (such as one-hot encoding and TF-IDF), and corpus-based techniques (such as word and sentence vectors and transformer language models). We will discuss methods and recent research directions in sentiment and emotion analysis in text, named entity recognition, machine translation, and sequence-to-sequence learning, among others.

Module texts

Indicative key readings

  • Clark, A., Fox, C. and Lappin, S. (eds.) (2012). The handbook of computational linguistics and natural language processing (Vol. 118). John Wiley & Sons.

  • Vajjala, S., Majumder, B., Gupta, A. and Surana, H. (2020). Practical natural language processing: a comprehensive guide to building real-world NLP systems. O'Reilly Media.

Recommended readings
