# kaggle-exercise

**Repository Path**: gethug/kaggle-exercise

## Basic Information

- **Project Name**: kaggle-exercise
- **Description**: Exercises from the ML and AI courses on https://www.kaggle.com
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2024-02-18
- **Last Updated**: 2024-03-03

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Kaggle Course Exercises

## Purpose

A record of the exercises completed while working through the ML and AI courses on [https://www.kaggle.com/](https://www.kaggle.com/), kept so they can be reviewed and used as a reference in later programming work.

## Reference Materials

+ [An explanation of the differences between LabelEncoder, OrdinalEncoder, and OneHotEncoder](https://www.zhihu.com/question/421194789)
+ How `fit_transform()` and `transform()` are used: [What and why behind fit_transform() and transform() in scikit-learn!](https://towardsdatascience.com/what-and-why-behind-fit-transform-vs-transform-in-scikit-learn-78f915cf96fe)
+ Entropy, cross-entropy, and KL divergence explained: [A Short Introduction to Entropy, Cross-Entropy and KL-Divergence](https://www.youtube.com/watch?v=ErfnhcEV1O8)
+ Using `SimpleImputer`: [How To Use Sklearn SimpleImputer for Filling Missing Values in Dataset](https://machinelearningknowledge.ai/how-to-use-sklearn-simple-imputer-simpleimputer-for-filling-missing-values-in-dataset/)
+ How to create a good validation set: [How (and why) to create a good validation set](https://www.fast.ai/posts/2017-11-13-validation-sets.html)
+ Why metrics are a big problem for AI: [The problem with metrics is a big problem for AI](https://www.fast.ai/posts/2019-09-24-metrics.html)
+ The Pearson correlation coefficient, a statistic that measures the strength of the linear correlation between two continuous variables: [Pearson correlation coefficient](https://en.wikipedia.org/wiki/Pearson_correlation_coefficient)

## Blogs

+ [Sukanya Bag](https://github.com/sukanyabag)
+ [Chetna Khanna's blog](https://chetnakhanna.medium.com/)
+ [scikit-learn Chinese community](https://scikit-learn.org.cn/)
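
## Example: fit_transform() vs transform()

Several of the references collected above deal with scikit-learn preprocessing: the three encoders, `SimpleImputer`, and the difference between `fit_transform()` and `transform()`. The following is a minimal sketch of how these pieces might fit together, not code taken from the course exercises. The toy arrays and every variable name (`train_price`, `valid_size`, and so on) are invented for illustration, and the category order passed to `OrdinalEncoder` is an assumption about the data.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, OrdinalEncoder

# Toy data, invented for illustration only (one feature per column).
train_price = np.array([[10.0], [np.nan], [30.0], [20.0]])
valid_price = np.array([[np.nan], [50.0]])
train_size = np.array([["small"], ["large"], ["medium"], ["small"]], dtype=object)
valid_size = np.array([["medium"], ["large"]], dtype=object)

# fit_transform() learns the statistic (the mean, 20.0) from the training
# data and applies it in the same call.
imputer = SimpleImputer(strategy="mean")
train_price_filled = imputer.fit_transform(train_price)
# transform() reuses the training mean, so nothing is learned from the
# validation data.
valid_price_filled = imputer.transform(valid_price)

# OrdinalEncoder maps each category to one integer; an explicit order is
# supplied here because the categories are assumed to be ranked.
ordinal = OrdinalEncoder(categories=[["small", "medium", "large"]])
ordinal_train = ordinal.fit_transform(train_size)
ordinal_valid = ordinal.transform(valid_size)

# OneHotEncoder creates one binary column per category (sorted
# alphabetically by default: large, medium, small).
onehot = OneHotEncoder(handle_unknown="ignore")
onehot_train = onehot.fit_transform(train_size).toarray()
onehot_valid = onehot.transform(valid_size).toarray()

# LabelEncoder is intended for 1-D target labels rather than features.
labels = LabelEncoder().fit_transform(["cat", "dog", "cat"])

print(train_price_filled.ravel())
print(valid_price_filled.ravel())
print(ordinal_train.ravel(), ordinal_valid.ravel())
print(onehot_valid)
print(labels)
```

Fitting only on the training split and calling `transform()` on the validation split keeps validation statistics out of the preprocessing step, which is the main point of the `fit_transform()` vs `transform()` article listed above.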