# mlcourse.ai **Repository Path**: zerogau/mlcourse.ai ## Basic Information - **Project Name**: mlcourse.ai - **Description**: No description available - **Primary Language**: Unknown - **License**: Not specified - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 1 - **Created**: 2022-05-05 - **Last Updated**: 2024-10-12 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README
![ODS stickers](https://github.com/Yorko/mlcourse.ai/blob/master/img/ods_stickers.jpg) **[mlcourse.ai](https://mlcourse.ai) – Open Machine Learning Course** [![License: CC BY-NC-SA 4.0](https://img.shields.io/badge/license-CC%20BY--NC--SA%204.0-green)](https://creativecommons.org/licenses/by-nc-sa/4.0/) [![Slack](https://img.shields.io/badge/slack-ods.ai-orange)](https://opendatascience.slack.com/archives/C91N8TL83/p1567408586359500) [![Donate](https://img.shields.io/badge/support-patreon-red)](https://www.patreon.com/ods_mlcourse) [![Donate](https://img.shields.io/badge/support-ko--fi-red)](https://ko-fi.com/mlcourse_ai)
[mlcourse.ai](https://mlcourse.ai) is an open Machine Learning course by [OpenDataScience](https://ods.ai). The course is designed to perfectly balance theory and practice. You can take part in several Kaggle Inclass competitions held during the course. From spring 2017 to fall 2019, 6 sessions of mlcourse.ai took place - 26k participants applied, 10k converted to passing the first assignment, about 1500 participants finished the course. Currently, the course is in self-paced mode. Check out a thorough [Roadmap](https://mlcourse.ai/roadmap) guiding you through the self-paced mlcourse.ai. Mirrors (:uk:-only): [mlcourse.ai](https://mlcourse.ai) (main site), [Kaggle Dataset](https://www.kaggle.com/kashnitsky/mlcourse) (same notebooks as Kaggle Notebooks) ### Self-paced passing The [Roadmap](https://mlcourse.ai/roadmap) will guide you through 11 weeks of mlcourse.ai. For each week, from Pandas to Gradient Boosting, instructions are given on what artciles to read, lectures to watch, what assignments to accomplish. ### Articles This is the list of published articles on medium.com [:uk:](https://medium.com/open-machine-learning-course), habr.com [:ru:](https://habr.com/company/ods/blog/344044/). Also notebooks in Chinese are mentioned :cn: and links to Kaggle Notebooks (in English) are given. Icons are clickable. 1. Exploratory Data Analysis with Pandas [:uk:](https://medium.com/open-machine-learning-course/open-machine-learning-course-topic-1-exploratory-data-analysis-with-pandas-de57880f1a68) [:ru:](https://habrahabr.ru/company/ods/blog/322626/) [:cn:](https://nbviewer.jupyter.org/github/Yorko/mlcourse.ai/blob/master/jupyter_chinese/topic01-%E4%BD%BF%E7%94%A8-Pandas-%E8%BF%9B%E8%A1%8C%E6%95%B0%E6%8D%AE%E6%8E%A2%E7%B4%A2.ipynb), [Kaggle Notebook](https://www.kaggle.com/kashnitsky/topic-1-exploratory-data-analysis-with-pandas) 2. Visual Data Analysis with Python [:uk:](https://medium.com/open-machine-learning-course/open-machine-learning-course-topic-2-visual-data-analysis-in-python-846b989675cd) [:ru:](https://habrahabr.ru/company/ods/blog/323210/) [:cn:](http://nbviewer.ipython.org/urls/raw.github.com/Yorko/mlcourse.ai/master/jupyter_chinese/topic02-Python-%E6%95%B0%E6%8D%AE%E5%8F%AF%E8%A7%86%E5%8C%96%E5%88%86%E6%9E%90.ipynb), Kaggle Notebooks: [part1](https://www.kaggle.com/kashnitsky/topic-2-visual-data-analysis-in-python), [part2](https://www.kaggle.com/kashnitsky/topic-2-part-2-seaborn-and-plotly) 3. Classification, Decision Trees and k Nearest Neighbors [:uk:](https://medium.com/open-machine-learning-course/open-machine-learning-course-topic-3-classification-decision-trees-and-k-nearest-neighbors-8613c6b6d2cd) [:ru:](https://habrahabr.ru/company/ods/blog/322534/) [:cn:](https://nbviewer.jupyter.org/github/Yorko/mlcourse.ai/blob/master/jupyter_chinese/topic03-%E5%86%B3%E7%AD%96%E6%A0%91%E5%92%8C-K-%E8%BF%91%E9%82%BB%E5%88%86%E7%B1%BB.ipynb), [Kaggle Notebook](https://www.kaggle.com/kashnitsky/topic-3-decision-trees-and-knn) 4. Linear Classification and Regression [:uk:](https://medium.com/open-machine-learning-course/open-machine-learning-course-topic-4-linear-classification-and-regression-44a41b9b5220) [:ru:](https://habrahabr.ru/company/ods/blog/323890/) [:cn:](http://nbviewer.ipython.org/urls/raw.github.com/Yorko/mlcourse.ai/master/jupyter_chinese/topic04-%E7%BA%BF%E6%80%A7%E5%9B%9E%E5%BD%92%E5%92%8C%E7%BA%BF%E6%80%A7%E5%88%86%E7%B1%BB%E5%99%A8.ipynb), Kaggle Notebooks: [part1](https://www.kaggle.com/kashnitsky/topic-4-linear-models-part-1-ols), [part2](https://www.kaggle.com/kashnitsky/topic-4-linear-models-part-2-classification), [part3](https://www.kaggle.com/kashnitsky/topic-4-linear-models-part-3-regularization), [part4](https://www.kaggle.com/kashnitsky/topic-4-linear-models-part-4-more-of-logit), [part5](https://www.kaggle.com/kashnitsky/topic-4-linear-models-part-5-validation) 5. Bagging and Random Forest [:uk:](https://medium.com/open-machine-learning-course/open-machine-learning-course-topic-5-ensembles-of-algorithms-and-random-forest-8e05246cbba7) [:ru:](https://habrahabr.ru/company/ods/blog/324402/) [:cn:](https://nbviewer.jupyter.org/github/Yorko/mlcourse.ai/blob/master/jupyter_chinese/topic05-%E9%9B%86%E6%88%90%E5%AD%A6%E4%B9%A0%E5%92%8C%E9%9A%8F%E6%9C%BA%E6%A3%AE%E6%9E%97%E6%96%B9%E6%B3%95.ipynb), Kaggle Notebooks: [part1](https://www.kaggle.com/kashnitsky/topic-5-ensembles-part-1-bagging), [part2](https://www.kaggle.com/kashnitsky/topic-5-ensembles-part-2-random-forest), [part3](https://www.kaggle.com/kashnitsky/topic-5-ensembles-part-3-feature-importance) 6. Feature Engineering and Feature Selection [:uk:](https://medium.com/open-machine-learning-course/open-machine-learning-course-topic-6-feature-engineering-and-feature-selection-8b94f870706a) [:ru:](https://habrahabr.ru/company/ods/blog/325422/) [:cn:](http://nbviewer.ipython.org/urls/raw.github.com/Yorko/mlcourse.ai/master/jupyter_chinese/topic06-%E7%89%B9%E5%BE%81%E5%B7%A5%E7%A8%8B%E5%92%8C%E7%89%B9%E5%BE%81%E9%80%89%E6%8B%A9.ipynb), [Kaggle Notebook](https://www.kaggle.com/kashnitsky/topic-6-feature-engineering-and-feature-selection) 7. Unsupervised Learning: Principal Component Analysis and Clustering [:uk:](https://medium.com/open-machine-learning-course/open-machine-learning-course-topic-7-unsupervised-learning-pca-and-clustering-db7879568417) [:ru:](https://habrahabr.ru/company/ods/blog/325654/) [:cn:](http://nbviewer.ipython.org/urls/raw.github.com/Yorko/mlcourse.ai/master/jupyter_chinese/topic07-%E4%B8%BB%E6%88%90%E5%88%86%E5%88%86%E6%9E%90%E5%92%8C%E8%81%9A%E7%B1%BB.ipynb), [Kaggle Notebook](https://www.kaggle.com/kashnitsky/topic-7-unsupervised-learning-pca-and-clustering) 8. Vowpal Wabbit: Learning with Gigabytes of Data [:uk:](https://medium.com/open-machine-learning-course/open-machine-learning-course-topic-8-vowpal-wabbit-fast-learning-with-gigabytes-of-data-60f750086237) [:ru:](https://habrahabr.ru/company/ods/blog/326418/) [:cn:](https://nbviewer.jupyter.org/github/Yorko/mlcourse.ai/blob/master/jupyter_chinese/topic08-%E9%9A%8F%E6%9C%BA%E6%A2%AF%E5%BA%A6%E4%B8%8B%E9%99%8D%E5%92%8C%E7%8B%AC%E7%83%AD%E7%BC%96%E7%A0%81.ipynb), [Kaggle Notebook](https://www.kaggle.com/kashnitsky/topic-8-online-learning-and-vowpal-wabbit) 9. Time Series Analysis with Python, part 1 [:uk:](https://medium.com/open-machine-learning-course/open-machine-learning-course-topic-9-time-series-analysis-in-python-a270cb05e0b3) [:ru:](https://habrahabr.ru/company/ods/blog/327242/) [:cn:](http://nbviewer.ipython.org/urls/raw.github.com/Yorko/mlcourse.ai/master/jupyter_chinese/topic09-%E6%97%B6%E9%97%B4%E5%BA%8F%E5%88%97%E5%A4%84%E7%90%86%E4%B8%8E%E5%BA%94%E7%94%A8.ipynb). Predicting future with Facebook Prophet, part 2 [:uk:](https://medium.com/open-machine-learning-course/open-machine-learning-course-topic-9-part-3-predicting-the-future-with-facebook-prophet-3f3af145cdc), [:cn:](http://nbviewer.ipython.org/urls/raw.github.com/Yorko/mlcourse.ai/master/jupyter_chinese/topic09-%E6%97%B6%E9%97%B4%E5%BA%8F%E5%88%97%E5%A4%84%E7%90%86%E4%B8%8E%E5%BA%94%E7%94%A8.ipynb) Kaggle Notebooks: [part1](https://www.kaggle.com/kashnitsky/topic-9-part-1-time-series-analysis-in-python), [part2](https://www.kaggle.com/kashnitsky/topic-9-part-2-time-series-with-facebook-prophet) 10. Gradient Boosting [:uk:](https://medium.com/open-machine-learning-course/open-machine-learning-course-topic-10-gradient-boosting-c751538131ac) [:ru:](https://habrahabr.ru/company/ods/blog/327250/), [:cn:](https://nbviewer.jupyter.org/github/Yorko/mlcourse.ai/blob/master/jupyter_chinese/topic05-%E9%9B%86%E6%88%90%E5%AD%A6%E4%B9%A0%E5%92%8C%E9%9A%8F%E6%9C%BA%E6%A3%AE%E6%9E%97%E6%96%B9%E6%B3%95.ipynb), [Kaggle Notebook](https://www.kaggle.com/kashnitsky/topic-10-gradient-boosting) ### Lectures Videolectures are uploaded to [this](https://bit.ly/2zY6Xe2) YouTube playlist. Introduction, [video](https://www.youtube.com/watch?v=DrohHdQa8u8), [slides](https://www.slideshare.net/festline/mlcourseai-fall2019-live-session-0) 1. Exploratory data analysis with Pandas, [video](https://youtu.be/fwWCw_cE5aI) 2. Visualization, main plots for EDA, [video](https://www.youtube.com/watch?v=WNoQTNOME5g) 3. Decision trees: [theory](https://youtu.be/H4XlBTPv5rQ) and [practical part](https://youtu.be/RrVYO6Td9Js) 4. Logistic regression: [theoretical foundations](https://www.youtube.com/watch?v=l3jiw-N544s), [practical part](https://www.youtube.com/watch?v=7o0SWgY89i8) (baselines in the "Alice" competition) 5. Ensembles and Random Forest – [part 1](https://www.youtube.com/watch?v=neXJL-AqI_c). Classification metrics – [part 2](https://www.youtube.com/watch?v=aBOMYqGUlWQ). Example of a business task, predicting a customer payment – [part 3](https://www.youtube.com/watch?v=FmKU-1LZGoE) 6. Linear regression and regularization - [theory](https://youtu.be/ne-MfRfYs_c), LASSO & Ridge, LTV prediction - [practice](https://youtu.be/B8yIaIEMyIc) 7. Unsupervised learning - [Principal Component Analysis](https://youtu.be/-AswHf7h0I4) and [Clustering](https://youtu.be/eVplCo-w4XE) 8. Stochastic Gradient Descent for classification and regression - [part 1](https://youtu.be/EUSXbdzaQE8), part 2 TBA 9. Time series analysis with Python (ARIMA, Prophet) - [video](https://youtu.be/_9lBwXnbOd8) 10. Gradient boosting: basic ideas - [part 1](https://youtu.be/g0ZOtzZqdqk), key ideas behind Xgboost, LightGBM, and CatBoost + practice - [part 2](https://youtu.be/V5158Oug4W8) ### Assignments 1. Exploratory data analysis with Pandas, [nbviewer](https://nbviewer.jupyter.org/github/Yorko/mlcourse.ai/blob/master/jupyter_english/assignments_demo/assignment01_pandas_uci_adult.ipynb?flush_cache=true), [Kaggle Notebook](https://www.kaggle.com/kashnitsky/assignment-1-pandas-and-uci-adult-dataset), [solution](https://www.kaggle.com/kashnitsky/a1-demo-pandas-and-uci-adult-dataset-solution) 2. Analyzing cardiovascular disease data, [nbviewer](https://nbviewer.jupyter.org/github/Yorko/mlcourse.ai/blob/master/jupyter_english/assignments_demo/assignment02_analyzing_cardiovascular_desease_data.ipynb?flush_cache=true), [Kaggle Notebook](https://www.kaggle.com/kashnitsky/assignment-2-analyzing-cardiovascular-data), [solution](https://www.kaggle.com/kashnitsky/a2-demo-analyzing-cardiovascular-data-solution) 3. Decision trees with a toy task and the UCI Adult dataset, [nbviewer](https://nbviewer.jupyter.org/github/Yorko/mlcourse.ai/blob/master/jupyter_english/assignments_demo/assignment03_decision_trees.ipynb?flush_cache=true), [Kaggle Notebook](https://www.kaggle.com/kashnitsky/assignment-3-decision-trees), [solution](https://www.kaggle.com/kashnitsky/a3-demo-decision-trees-solution) 4. Sarcasm detection, [Kaggle Notebook](https://www.kaggle.com/kashnitsky/a4-demo-sarcasm-detection-with-logit), [solution](https://www.kaggle.com/kashnitsky/a4-demo-sarcasm-detection-with-logit-solution). Linear Regression as an optimization problem, [nbviewer](https://nbviewer.jupyter.org/github/Yorko/mlcourse.ai/blob/master/jupyter_english/assignments_demo/assignment04_linreg_optimization.ipynb?flush_cache=true), [Kaggle Notebook](https://www.kaggle.com/kashnitsky/assignment-4-linear-regression-as-optimization) 5. Logistic Regression and Random Forest in the credit scoring problem, [nbviewer](https://nbviewer.jupyter.org/github/Yorko/mlcourse.ai/blob/master/jupyter_english/assignments_demo/assignment05_logit_rf_credit_scoring.ipynb?flush_cache=true), [Kaggle Notebook](https://www.kaggle.com/kashnitsky/assignment-5-logit-and-rf-for-credit-scoring), [solution](https://www.kaggle.com/kashnitsky/a5-demo-logit-and-rf-for-credit-scoring-sol) 6. Exploring OLS, Lasso and Random Forest in a regression task, [nbviewer](https://nbviewer.jupyter.org/github/Yorko/mlcourse.ai/blob/master/jupyter_english/assignments_demo/assignment06_regression_wine.ipynb?flush_cache=true), [Kaggle Notebook](https://www.kaggle.com/kashnitsky/assignment-6-linear-models-and-rf-for-regression), [solution](https://www.kaggle.com/kashnitsky/a6-demo-regression-solution) 7. Unsupervised learning, [nbviewer](https://nbviewer.jupyter.org/github/Yorko/mlcourse.ai/blob/master/jupyter_english/assignments_demo/assignment07_unsupervised_learning.ipynb?flush_cache=true), [Kaggle Notebook](https://www.kaggle.com/kashnitsky/assignment-7-unupervised-learning), [solution](https://www.kaggle.com/kashnitsky/a7-demo-unsupervised-learning-solution) 8. Implementing online regressor, [nbviewer](https://nbviewer.jupyter.org/github/Yorko/mlcourse.ai/blob/master/jupyter_english/assignments_demo/assignment08_implement_sgd_regressor.ipynb?flush_cache=true), [Kaggle Notebook](https://www.kaggle.com/kashnitsky/assignment-8-implementing-online-regressor), [solution](https://www.kaggle.com/kashnitsky/a8-demo-implementing-online-regressor-solution) 9. Time series analysis, [nbviewer](https://nbviewer.jupyter.org/github/Yorko/mlcourse.ai/blob/master/jupyter_english/assignments_demo/assignment09_time_series.ipynb?flush_cache=true), [Kaggle Notebook](https://www.kaggle.com/kashnitsky/assignment-9-time-series-analysis), [solution](https://www.kaggle.com/kashnitsky/a9-demo-time-series-analysis-solution) 10. Beating baseline in a competition, [Kaggle Notebook](https://www.kaggle.com/kashnitsky/assignment-10-gradient-boosting-and-flight-delays) ### Kaggle competitions 1. Catch Me If You Can: Intruder Detection through Webpage Session Tracking. [Kaggle Inclass](https://www.kaggle.com/c/catch-me-if-you-can-intruder-detection-through-webpage-session-tracking2) 2. DotA 2 winner prediction. [Kaggle Inclass](https://www.kaggle.com/c/mlcourse-dota2-win-prediction) ### Community Discussions are held in the **#mlcourse_ai** channel of the [OpenDataScience (ods.ai)](https://ods.ai) Slack team. *The course is free but you can support organizers by making a pledge on [Patreon](https://www.patreon.com/ods_mlcourse) (monthly support) or a one-time payment on [Ko-fi](https://ko-fi.com/mlcourse_ai). Thus you'll foster the spread of Machine Learning in the world!* [![Donate](https://img.shields.io/badge/support-patreon-red)](https://www.patreon.com/ods_mlcourse) [![Donate](https://img.shields.io/badge/support-ko--fi-red)](https://ko-fi.com/mlcourse_ai)