# BigData

**Repository Path**: www_321cto/BigData

## Basic Information

- **Project Name**: BigData
- **Description**: Focused on big data technologies, algorithms, interviews, projects, and related topics (continuously updated throughout 2020)
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-06-22
- **Last Updated**: 2020-12-19

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# BigData: Systematic Big Data Study (being refined in 2020)

> BigData: Systematic Big Data Study (continuously updated in 2020)

Hadoop Ecosystem | Spark | Flink | Algorithms | Interviews | Other

# :black_nib: Preface

1. [Big Data Learning Roadmap](#)
2. [Installation and Environment Setup Guide for Common Big Data Software](https://github.com/bigdata2018/BigData/blob/master/notes/%E5%A4%A7%E6%95%B0%E6%8D%AE%E8%BD%AF%E4%BB%B6%E5%AE%89%E8%A3%85%E5%8F%8A%E7%8E%AF%E5%A2%83%E6%90%AD%E5%BB%BA.md)

# I. Hadoop Ecosystem

1. [Distributed File Storage System: HDFS](#)
2. [Distributed Computing Framework: MapReduce](#)
3. [Cluster Resource Manager: YARN](#)
4. [ZooKeeper](#)
5. [Hive](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive常用DDL操作.md)
6. [Flume](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive分区表和分桶表.md)
7. [Kafka](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive视图和索引.md)
8. [HBase](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive常用DML操作.md)
9. [Sqoop](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive数据查询详解.md)
10. [Oozie](https://github.com/321cto/Java-for-Algorithms/blob/master/note/%E7%AE%97%E6%B3%95001.md#01%E5%AD%97%E7%AC%A6%E7%BB%9F%E8%AE%A1)
11. [Azkaban](https://github.com/321cto/Java-for-Algorithms/blob/master/note/%E7%AE%97%E6%B3%95001.md#02%E5%86%92%E6%B3%A1%E6%8E%92%E5%BA%8F)
12. [Kettle](https://github.com/heibaiying/BigData-Notes/blob/master/notes/HiveCLI和Beeline命令行的基本使用.md)
13. [ClickHouse](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive常用DDL操作.md)
14. [DataX](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive分区表和分桶表.md)
15. [Impala](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive视图和索引.md)
16. [Atlas](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive常用DML操作.md)
17. [ELK](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive数据查询详解.md)
18. [Redis](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive数据查询详解.md)
19. [Hue](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive数据查询详解.md)

# II. Spark

**Spark Basics:**

1. [Spark Concepts](https://github.com/bigdata2018/BigData/blob/master/notes/Spark%20%E6%A6%82%E5%BF%B5.md)
2. [Spark Cluster Setup](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Spark开发环境搭建.md)
3. [Spark Introductory Example](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Spark_RDD.md)

**Spark Core:**

1. [Resilient Distributed Datasets (RDD)](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Spark_RDD.md)
2. [Common RDD Operators in Detail](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Spark_Transformation和Action算子.md)
3. [Spark Run Modes and Job Submission](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Spark部署模式与作业提交.md)
4. [Spark Accumulators and Broadcast Variables](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Spark累加器与广播变量.md)
5. [Building a Highly Available Spark Cluster with ZooKeeper](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Spark集群环境搭建.md)

**Spark SQL:**

1. [DataFrame and Dataset](https://github.com/heibaiying/BigData-Notes/blob/master/notes/SparkSQL_Dataset和DataFrame简介.md)
2. [Basic Usage of the Structured API](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Spark_Structured_API的基本使用.md)
3. [Spark SQL External Data Sources](https://github.com/heibaiying/BigData-Notes/blob/master/notes/SparkSQL外部数据源.md)
4. [Spark SQL Common Aggregate Functions](https://github.com/heibaiying/BigData-Notes/blob/master/notes/SparkSQL常用聚合函数.md)
5. [Spark SQL JOIN Operations](https://github.com/heibaiying/BigData-Notes/blob/master/notes/SparkSQL联结操作.md)

**Spark Streaming:**

1. [Introduction to Spark Streaming](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Spark_Streaming与流处理.md)
2. [Spark Streaming Basic Operations](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Spark_Streaming基本操作.md)
3. [Spark Streaming Integration with Flume](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Spark_Streaming整合Flume.md)
4. [Spark Streaming Integration with Kafka](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Spark_Streaming整合Kafka.md)

**Structured Streaming:**

# III. Flink

1. [Introduction to Storm and Stream Processing](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Storm和流处理简介.md)
2. [Storm Core Concepts in Detail](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Storm核心概念详解.md)
3. [Storm Single-Node Environment Setup](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Storm单机环境搭建.md)
4. [Storm Cluster Environment Setup](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Storm集群环境搭建.md)
5. [Storm Programming Model in Detail](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Storm编程模型详解.md)
6. [Comparison of Three Packaging Approaches for Storm Projects](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Storm三种打包方式对比分析.md)
7. [Storm Integration with Redis in Detail](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Storm集成Redis详解.md)
8. [Storm Integration with HDFS/HBase](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Storm集成HBase和HDFS.md)
9. [Storm Integration with Kafka](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Storm集成Kakfa.md)

# IV. Algorithms

1. [Common Algorithms](https://github.com/bigdata2018/BigData/blob/master/Algorithm-notes/%E5%B8%B8%E8%A7%81%E7%AE%97%E6%B3%95.md)
2. [LeetCode](https://github.com/heibaiying/BigData-Notes/blob/master/notes/大数据应用常用打包方式.md)
3. [Company Algorithm Problems](https://github.com/heibaiying/BigData-Notes/blob/master/notes/大数据应用常用打包方式.md)

# V. Interviews

1. [Common Packaging Approaches for Big Data Applications](https://github.com/heibaiying/BigData-Notes/blob/master/notes/大数据应用常用打包方式.md)

# VI. Other

1. [Common Packaging Approaches for Big Data Applications](https://github.com/heibaiying/BigData-Notes/blob/master/notes/大数据应用常用打包方式.md)

# :bookmark_tabs: Afterword

[Shared Resources and Recommended Development Tools](https://github.com/heibaiying/BigData-Notes/blob/master/notes/资料分享与工具推荐.md)