# nlp_course

**Repository Path**: SapientialM/nlp_course

## Basic Information

- **Project Name**: nlp_course
- **Description**: YSDA introductory NLP course, copied from https://github.com/yandexdataschool/nlp_course
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 2
- **Forks**: 0
- **Created**: 2022-01-09
- **Last Updated**: 2025-02-25

## Categories & Tags

**Categories**: Uncategorized

**Tags**: Nlp

## README

# YSDA Natural Language Processing course

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/yandexdataschool/nlp_course/master)

* Lecture and seminar materials for each week are in the ./week* folders.
* YSDA homework deadlines are listed on the [Anytask course page](https://anytask.org/course/384).
* For technical issues, bugs in course materials, or contribution ideas, open an [issue](https://github.com/yandexdataschool/nlp_course/issues).
* Installing libraries and troubleshooting: [this thread](https://github.com/yandexdataschool/nlp_course/issues/1).

# Syllabus

- [__week01__](./week01_embeddings) __Embeddings__
  - Lecture: Word embeddings. Distributional semantics, LSA, Word2Vec, GloVe. Why and when we need them.
  - Seminar: Playing with word and sentence embeddings.
- [__week02__](./week02_classification) __Text classification__
  - Lecture: Text classification. Classical approaches to text representation: BOW, TF-IDF. Neural approaches: embeddings, convolutions, RNNs.
  - Seminar: Salary prediction with convolutional neural networks; explaining network predictions.
- [__week03__](./week03_lm) __Language Models__
  - Lecture: Language models: N-gram and neural approaches; visualizing trained models.
  - Seminar: Generating ArXiv papers with language models.
- [__week04__](./week04_seq2seq) __Seq2seq/Attention__
  - Lecture: Seq2seq: encoder-decoder framework. Attention: Bahdanau model. Self-attention, Transformer. Pointer networks. Attention for analysis.
  - Seminar: Machine translation of hotel and hostel descriptions.
- [__week05__](./week05_structured) __Structured Learning__
  - Lecture: Structured learning: structured perceptron, structured prediction, dynamic oracles, RL basics.
  - Seminar: POS tagging.
- [__week06__](./week06_em) __Expectation-Maximization__
  - Lecture: Expectation-Maximization and word alignment models.
  - Seminar: Implementing expectation-maximization.
- [__week07__](./week07_mt) __Machine translation__
  - Lecture: Machine translation: a review of the key ideas from PBMT, the application-specific ideas that have developed in NMT over the past 3 years, and some of the open problems in this area.
  - Seminar: Presentations by students.
- [__week08__](./week08_multitask) __Transfer learning and Multi-task learning__
  - Lecture: What a network learns and why: a "model" is never just a "model"! Transfer learning in NLP. Multi-task learning in NLP. How to understand what kind of information the model representations contain.
  - Seminar: Improving named entity recognition by learning jointly with other tasks.
- [__week09__](./week09_da) __Domain Adaptation__
  - Lecture: General theory. Instance weighting. Proxy-label methods. Feature matching methods. Distillation-like methods.
  - Seminar: Adapting a general machine translation model to a specific domain.
- [__week10__](./week10_dialogue) __Dialogue Systems__
  - Lecture: Task-oriented vs general conversation systems. Overview of a framework for task-oriented systems. General conversation: retrieval and generative approaches. Generative models for general conversation. Retrieval-based models for general conversation.
  - Seminar: Simple retrieval-based question answering.
- [__week11__](./week11_gan) __Adversarial learning & Latent Variables for NLP__
  - Lecture: Generative models recap, generative adversarial networks, variational autoencoders, and why you should care about them.
  - Seminar: Semi-supervised dictionary learning with adversarial networks.
- [__week12__](./week12_summarization) __Text Summarization__
  - Lecture: Text summarization methods. Extractive vs abstractive approaches. Extractive text summarization. Abstractive text summarization.

# Contributors & course staff

Course materials and teaching by

- [Elena Voita](https://lena-voita.github.io) - course admin, lectures, seminars, homeworks
- [Boris Kovarsky](https://github.com/kovarsky) - lectures, seminars, homeworks
- [David Talbot](https://github.com/drt7) - lectures, seminars, homeworks
- [Sergey Gubanov](https://github.com/esgv) - lectures, seminars, homeworks
- [Just Heuristic](https://github.com/justheuristic) - lectures, seminars, homeworks
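
As a taste of the classical text representations covered in week02 (BOW and TF-IDF), here is a minimal from-scratch sketch — not taken from the course materials, and the toy corpus and `tfidf` helper are hypothetical; the seminars use full library implementations instead.

```python
import math
from collections import Counter

def tfidf(corpus):
    """Compute TF-IDF weights for a tokenized corpus (a list of token lists).

    TF is the raw term count within a document; IDF is log(N / df),
    where df is the number of documents containing the term.
    """
    n_docs = len(corpus)
    df = Counter()
    for doc in corpus:
        df.update(set(doc))  # count each term once per document
    weights = []
    for doc in corpus:
        tf = Counter(doc)
        weights.append({term: count * math.log(n_docs / df[term])
                        for term, count in tf.items()})
    return weights

# Toy corpus: a word that appears in every document ("the") gets zero
# weight, while rarer, more discriminative words get higher weight.
docs = [["the", "cat", "sat"], ["the", "dog", "ran"], ["the", "cat", "ran"]]
weights = tfidf(docs)
```

In this sketch `weights[1]["dog"]` exceeds `weights[1]["ran"]`: "dog" occurs in one document of three, "ran" in two, so the rarer word carries more information about its document.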