# nlp_course

**Repository Path**: SapientialM/nlp_course

## Basic Information

- **Project Name**: nlp_course
- **Description**: YSDA introductory NLP course, copied from https://github.com/yandexdataschool/nlp_course
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 2
- **Forks**: 0
- **Created**: 2022-01-09
- **Last Updated**: 2025-02-25

## Categories & Tags

**Categories**: Uncategorized

**Tags**: Nlp

## README

# YSDA Natural Language Processing course

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/yandexdataschool/nlp_course/master)

* Lecture and seminar materials for each week are in the ./week* folders.
* YSDA homework deadlines are listed on the [Anytask course page](https://anytask.org/course/384).
* For technical issues, bugs in course materials, or contribution ideas, open an [issue](https://github.com/yandexdataschool/nlp_course/issues).
* Installing libraries and troubleshooting: [this thread](https://github.com/yandexdataschool/nlp_course/issues/1).

# Syllabus

- [__week01__](./week01_embeddings) __Embeddings__
  - Lecture: Word embeddings. Distributional semantics, LSA, Word2Vec, GloVe. Why and when we need them.
  - Seminar: Playing with word and sentence embeddings.
- [__week02__](./week02_classification) __Text classification__
  - Lecture: Text classification. Classical approaches to text representation: BOW, TF-IDF. Neural approaches: embeddings, convolutions, RNNs.
  - Seminar: Salary prediction with convolutional neural networks; explaining network predictions.
- [__week03__](./week03_lm) __Language Models__
  - Lecture: Language models: N-gram and neural approaches; visualizing trained models.
  - Seminar: Generating ArXiv papers with language models.
- [__week04__](./week04_seq2seq) __Seq2seq/Attention__
  - Lecture: Seq2seq: encoder-decoder framework. Attention: Bahdanau model. Self-attention, Transformer. Pointer networks. Attention for analysis.
  - Seminar: Machine translation of hotel and hostel descriptions.
- [__week05__](./week05_structured) __Structured Learning__
  - Lecture: Structured learning: structured perceptron, structured prediction, dynamic oracles, RL basics.
  - Seminar: POS tagging.
- [__week06__](./week06_em) __Expectation-Maximization__
  - Lecture: Expectation-Maximization and word alignment models.
  - Seminar: Implementing expectation-maximization.
- [__week07__](./week07_mt) __Machine translation__
  - Lecture: Machine translation: a review of the key ideas from PBMT, the application-specific ideas that have developed in NMT over the past 3 years, and some of the open problems in this area.
  - Seminar: Presentations by students.
- [__week08__](./week08_multitask) __Transfer learning and Multi-task learning__
  - Lecture: What a network learns and why: a "model" is never just a "model"! Transfer learning in NLP. Multi-task learning in NLP. How to understand what kind of information the model representations contain.
  - Seminar: Improving named entity recognition by learning jointly with other tasks.
- [__week09__](./week09_da) __Domain Adaptation__
  - Lecture: General theory. Instance weighting. Proxy-label methods. Feature matching methods. Distillation-like methods.
  - Seminar: Adapting a general machine translation model to a specific domain.
- [__week10__](./week10_dialogue) __Dialogue Systems__
  - Lecture: Task-oriented vs general conversation systems. Overview of a framework for task-oriented systems. General conversation: retrieval and generative approaches. Generative models for general conversation. Retrieval-based models for general conversation.
  - Seminar: Simple retrieval-based question answering.
- [__week11__](./week11_gan) __Adversarial learning & Latent Variables for NLP__
  - Lecture: Generative models recap, generative adversarial networks, variational autoencoders, and why you should care about them.
  - Seminar: Semi-supervised dictionary learning with adversarial networks.
- [__week12__](./week12_summarization) __Text Summarization__
  - Lecture: Text summarization methods. Extractive vs abstractive approaches. Extractive text summarization. Abstractive text summarization.

# Contributors & course staff

Course materials and teaching by

- [Elena Voita](https://lena-voita.github.io) - course admin, lectures, seminars, homeworks
- [Boris Kovarsky](https://github.com/kovarsky) - lectures, seminars, homeworks
- [David Talbot](https://github.com/drt7) - lectures, seminars, homeworks
- [Sergey Gubanov](https://github.com/esgv) - lectures, seminars, homeworks
- [Just Heuristic](https://github.com/justheuristic) - lectures, seminars, homeworks
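
As a taste of the classical text representations covered in week02 (BOW and TF-IDF), here is a minimal from-scratch sketch — not taken from the course materials, and the toy corpus and `tfidf` helper are hypothetical; the seminars use full library implementations instead.

```python
import math
from collections import Counter

def tfidf(corpus):
    """Compute TF-IDF weights for a tokenized corpus (a list of token lists).

    TF is the raw term count within a document; IDF is log(N / df),
    where df is the number of documents containing the term.
    """
    n_docs = len(corpus)
    df = Counter()
    for doc in corpus:
        df.update(set(doc))  # count each term once per document
    weights = []
    for doc in corpus:
        tf = Counter(doc)
        weights.append({term: count * math.log(n_docs / df[term])
                        for term, count in tf.items()})
    return weights

# Toy corpus: a word that appears in every document ("the") gets zero
# weight, while rarer, more discriminative words get higher weight.
docs = [["the", "cat", "sat"], ["the", "dog", "ran"], ["the", "cat", "ran"]]
weights = tfidf(docs)
```

In this sketch `weights[1]["dog"]` exceeds `weights[1]["ran"]`: "dog" occurs in one document of three, "ran" in two, so the rarer word carries more information about its document.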