# Datasets

**Repository Path**: Sheng-12138/Datasets

## Basic Information

- **Project Name**: Datasets
- **Description**: This repository contains a copy of the machine learning datasets used in tutorials on MachineLearningMastery.com. It was created to ensure that the datasets used in the tutorials remain available and do not depend on unreliable third parties. All regression and classification CSV files have no header line and no whitespace between columns; the target is the last column, and missing values are marked with a question mark character ('?'). In many cases, tutorials link directly to the raw dataset URL, so dataset filenames should not be changed once added to the repository.
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 2
- **Created**: 2023-01-18
- **Last Updated**: 2023-01-18

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

Machine Learning Datasets
=========================

This repository contains a copy of the machine learning datasets used in tutorials on MachineLearningMastery.com.

This repository was created to ensure that the datasets used in tutorials remain available and do not depend on unreliable third parties.

All regression and classification CSV files have no header line and no whitespace between columns; the target is the last column, and missing values are marked with a question mark character ('?').

In many cases, tutorials link directly to the raw dataset URL; therefore, dataset filenames should not be changed once added to the repository.

Datasets
========

This section provides a summary of the datasets in this repository.

## Binary Classification Datasets

* Breast Cancer (Wisconsin) (breast-cancer-wisconsin.csv)
* Breast Cancer (Yugoslavia) (breast-cancer.csv)
* Breast Cancer (Haberman's) (haberman.csv)
* Bank Note Authentication (banknote_authentication.csv)
* Horse Colic (horse-colic.csv)
* Ionosphere (ionosphere.csv)
* Pima Indians Diabetes (pima-indians-diabetes.csv)
* Sonar Returns (sonar.csv)
* German Credit (german.csv)
* Credit Card Fraud (creditcard.csv.zip)
* Adult Income (adult-all.csv)
* Mammography (mammography.csv)
* Oil Spill (oil-spill.csv)
* Phoneme (phoneme.csv)

## Multiclass Classification Datasets

* Glass Identification (glass.csv)
* Iris Flower Species (iris.csv)
* Wheat Seeds (wheat-seeds.csv)
* Wine (wine.csv)
* Ecoli (ecoli.csv)
* Thyroid Gland (new-thyroid.csv)

## Regression Datasets

* Boston Housing (housing.csv)
* Auto Insurance Total Claims (auto-insurance.csv)
* Auto Imports Prices (auto_imports.csv)
* Abalone Age (abalone.csv)
* Wine Quality Red (winequality-red.csv)
* Wine Quality White (winequality-white.csv)

## Univariate Time Series Datasets

* Daily Minimum Temperatures in Melbourne (daily-min-temperatures.csv)
* Daily Maximum Temperatures in Melbourne (daily-max-temperatures.csv)
* Daily Female Births in California (daily-total-female-births.csv)
* Monthly International Airline Passengers (monthly-airline-passengers.csv)
* Monthly Armed Robberies in Boston (monthly-robberies.csv)
* Monthly Sunspots (monthly-sunspots.csv)
* Monthly Champagne Sales (monthly_champagne_sales.csv)
* Monthly Shampoo Sales (monthly-shampoo-sales.csv)
* Monthly Car Sales (monthly-car-sales.csv)
* Monthly Mean Temperatures in Nottingham Castle (monthly-mean-temp.csv)
* Monthly Specialty Writing Paper Sales (monthly-writing-paper-sales.csv)
* Yearly Water Usage in Baltimore (yearly-water-usage.csv)

## Multivariate Time Series Datasets

* Hourly Pollution Levels in Beijing (pollution.csv)
* Minutely Individual Household Electric Power Consumption (household_power_consumption.zip)
* Human Activity Recognition Using Smartphones (HAR_Smartphones.zip)
* Indoor Movement Prediction (IndoorMovement.zip)
* Yearly Longley Economic Employment (longley.csv)

## Natural Language Processing

* Flickr 8k Photo Caption Dataset (Flickr8k_Dataset.zip, Flickr8k_text.zip)
* Movie Review Polarity (review_polarity.tar.gz)
* German to English Translation (deu-eng.txt)
* The Republic, by Plato (republic.txt)

## ARFF Datasets

* Weka UCI Datasets (weka-datasets.zip)
* Weka Numeric Datasets (weka-datasets-numeric.zip)
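
## Example: Reading a Classification or Regression CSV

The CSV conventions described in the README (no header line, target in the last column, missing values marked with '?') can be handled with a short loader. The following is a minimal sketch using only the Python standard library. The names `parse_value` and `load_dataset` are hypothetical helpers introduced for illustration, and the sample rows are invented to match the described format rather than copied from any file in this repository.

```python
import csv
import io

MISSING = "?"  # missing-value marker described in the README


def parse_value(cell):
    """Return None for '?', a float for numeric text, else the text itself."""
    cell = cell.strip()
    if cell == MISSING:
        return None
    try:
        return float(cell)
    except ValueError:
        return cell  # e.g. a class label such as a species name


def load_dataset(handle):
    """Read a headerless CSV and split each row into inputs (X) and target (y)."""
    X, y = [], []
    for row in csv.reader(handle):
        if not row:  # skip blank lines
            continue
        values = [parse_value(cell) for cell in row]
        X.append(values[:-1])  # every column except the last is an input
        y.append(values[-1])   # the last column is the target
    return X, y


# Hypothetical sample rows in the described format (not real repository data).
sample = io.StringIO("5.1,3.5,?,0.2,setosa\n6.2,?,4.3,1.3,versicolor\n")
X, y = load_dataset(sample)
print(X)
print(y)
```

To read one of the CSV files listed above, pass an open file handle instead of the in-memory sample, for example a handle opened with `open("iris.csv", newline="")`. This sketch applies only to the classification and regression files; the time series, text, and archive files follow other layouts.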