# BigData
**Repository Path**: www_321cto/BigData
## Basic Information
- **Project Name**: BigData
- **Description**: Focused on big data technologies, algorithms, interviews, projects, and related knowledge (continuously updated in 2020)
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-06-22
- **Last Updated**: 2020-12-19
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
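# BigData - Systematic Big Data Study (Being Improved in 2020)
> BigData - Systematic Big Data Study (continuously updated in 2020)
# :black_nib: Preface
1. [Big Data Learning Roadmap](#)
2. [Installation and Environment Setup Guide for Common Big Data Software](https://github.com/bigdata2018/BigData/blob/master/notes/%E5%A4%A7%E6%95%B0%E6%8D%AE%E8%BD%AF%E4%BB%B6%E5%AE%89%E8%A3%85%E5%8F%8A%E7%8E%AF%E5%A2%83%E6%90%AD%E5%BB%BA.md)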
# I. Hadoop Ecosystem
1. [Distributed File Storage System: HDFS](#)
2. [Distributed Computing Framework: MapReduce](#)
3. [Cluster Resource Manager: YARN](#)
4. [ZooKeeper](#)
5. [Hive](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive常用DDL操作.md)
6. [Flume](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive分区表和分桶表.md)
7. [Kafka](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive视图和索引.md)
8. [HBase](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive常用DML操作.md)
9. [Sqoop](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive数据查询详解.md)
10. [Oozie](https://github.com/321cto/Java-for-Algorithms/blob/master/note/%E7%AE%97%E6%B3%95001.md#01%E5%AD%97%E7%AC%A6%E7%BB%9F%E8%AE%A1)
11. [Azkaban](https://github.com/321cto/Java-for-Algorithms/blob/master/note/%E7%AE%97%E6%B3%95001.md#02%E5%86%92%E6%B3%A1%E6%8E%92%E5%BA%8F)
12. [Kettle](https://github.com/heibaiying/BigData-Notes/blob/master/notes/HiveCLI和Beeline命令行的基本使用.md)
13. [ClickHouse](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive常用DDL操作.md)
14. [DataX](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive分区表和分桶表.md)
15. [Impala](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive视图和索引.md)
16. [Atlas](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive常用DML操作.md)
17. [ELK](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive数据查询详解.md)
18. [Redis](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive数据查询详解.md)
19. [Hue](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive数据查询详解.md)
# II. Spark
**Spark Basics:**
1. [Spark Concepts](https://github.com/bigdata2018/BigData/blob/master/notes/Spark%20%E6%A6%82%E5%BF%B5.md)
2. [Spark Cluster Setup](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Spark开发环境搭建.md)
3. [Spark Getting-Started Example](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Spark_RDD.md)
**Spark Core:**
1. [Resilient Distributed Datasets (RDD)](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Spark_RDD.md)
2. [Common RDD Operators in Detail](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Spark_Transformation和Action算子.md)
3. [Spark Run Modes and Job Submission](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Spark部署模式与作业提交.md)
4. [Spark Accumulators and Broadcast Variables](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Spark累加器与广播变量.md)
5. [Building a High-Availability Spark Cluster with ZooKeeper](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Spark集群环境搭建.md)
**Spark SQL:**
1. [DataFrame and Dataset](https://github.com/heibaiying/BigData-Notes/blob/master/notes/SparkSQL_Dataset和DataFrame简介.md)
2. [Basic Usage of the Structured API](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Spark_Structured_API的基本使用.md)
3. [Spark SQL External Data Sources](https://github.com/heibaiying/BigData-Notes/blob/master/notes/SparkSQL外部数据源.md)
4. [Common Spark SQL Aggregate Functions](https://github.com/heibaiying/BigData-Notes/blob/master/notes/SparkSQL常用聚合函数.md)
5. [Spark SQL JOIN Operations](https://github.com/heibaiying/BigData-Notes/blob/master/notes/SparkSQL联结操作.md)
**Spark Streaming:**
1. [Introduction to Spark Streaming](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Spark_Streaming与流处理.md)
2. [Basic Spark Streaming Operations](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Spark_Streaming基本操作.md)
3. [Integrating Spark Streaming with Flume](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Spark_Streaming整合Flume.md)
4. [Integrating Spark Streaming with Kafka](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Spark_Streaming整合Kafka.md)
**Structured Streaming:**
# III. Flink
1. [Introduction to Storm and Stream Processing](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Storm和流处理简介.md)
2. [Storm Core Concepts in Detail](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Storm核心概念详解.md)
3. [Storm Standalone Environment Setup](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Storm单机环境搭建.md)
4. [Storm Cluster Environment Setup](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Storm集群环境搭建.md)
5. [Storm Programming Model in Detail](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Storm编程模型详解.md)
6. [Comparison of Three Packaging Methods for Storm Projects](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Storm三种打包方式对比分析.md)
7. [Integrating Storm with Redis in Detail](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Storm集成Redis详解.md)
8. [Integrating Storm with HDFS/HBase](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Storm集成HBase和HDFS.md)
9. [Integrating Storm with Kafka](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Storm集成Kakfa.md)
# IV. Algorithms
1. [Common Algorithms](https://github.com/bigdata2018/BigData/blob/master/Algorithm-notes/%E5%B8%B8%E8%A7%81%E7%AE%97%E6%B3%95.md)
2. [LeetCode](https://github.com/heibaiying/BigData-Notes/blob/master/notes/大数据应用常用打包方式.md)
3. [Enterprise Algorithm Problems](https://github.com/heibaiying/BigData-Notes/blob/master/notes/大数据应用常用打包方式.md)
# V. Interviews
1. [Common Packaging Methods for Big Data Applications](https://github.com/heibaiying/BigData-Notes/blob/master/notes/大数据应用常用打包方式.md)
# VI. Miscellaneous
1. [Common Packaging Methods for Big Data Applications](https://github.com/heibaiying/BigData-Notes/blob/master/notes/大数据应用常用打包方式.md)
# :bookmark_tabs: Afterword
[Resource Sharing and Recommended Development Tools](https://github.com/heibaiying/BigData-Notes/blob/master/notes/资料分享与工具推荐.md)