# jinken-bigdata-train

**Repository Path**: opennlp/jinken-bigdata-train

## Basic Information

- **Project Name**: jinken-bigdata-train
- **Description**: Jinken big data training, a five-day course
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2023-01-12
- **Last Updated**: 2023-01-22

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# jinken-bigdata-train

## Day 1: Software Installation

### Morning

#### 1. Software to install

```
VMware® Workstation 16 Pro
CentOS 7.7
Java
Hadoop
```

#### 2. VMware

* Basic information

```
version: VMware-workstation-full-16.2.3-19376536.exe
license key: ZF71R-DMX85-08DQY-8YMNC-PPHV8
download link: https://pan.baidu.com/s/1iF3BsPfbsz8cqmGdw3qvIA
extraction code: eknw
```

* NAT network settings

```
Edit -> Virtual Network Editor
```

![image-20230112193554087](./data/virtual_net_edit.png)

![image-20230112193916899](./data/virtual_net_edit_dhcp.png)

* Corresponding network settings on the host machine

![image-20230112194416264](./data/virtual_net_edit_dhcp_local.png)

* Network setup during installation

![image-20230116140745812](C:\Users\Administrator\AppData\Roaming\Typora\typora-user-images\image-20230116140745812.png)

#### 3. CentOS

* yum repository configuration

```bash
[root@node102 ~]# cat /etc/centos-release
CentOS Linux release 7.7.1908 (Core)
[root@node102 ~]# curl https://mirrors.aliyun.com/repo/Centos-7.repo -o /etc/yum.repos.d/CentOS-Base.repo
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2523  100  2523    0     0  16251      0 --:--:-- --:--:-- --:--:-- 16277
```

* Network

```bash
[root@node101 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:41:63:ef brd ff:ff:ff:ff:ff:ff
    inet 192.168.17.101/24 brd 192.168.17.255 scope global noprefixroute ens33
       valid_lft forever preferred_lft forever
    inet6 fe80::8a33:bb73:fdb7:20c/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
[root@node101 ~]# cat /etc/sysconfig/network-scripts/ifcfg-ens33
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=static
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=ens33
UUID=9bd7be3b-b735-4f50-b15c-54ec3ac1db50
DEVICE=ens33
ONBOOT=yes
IPADDR=192.168.17.101
PREFIX=24
GATEWAY=192.168.17.2
DNS1=192.168.17.2
DOMAIN=localdomain
IPV6_PRIVACY=no
[root@node101 ~]# cat /etc/hostname
node101
```

* Firewall

```bash
[root@node101 ~]# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2023-01-12 20:59:56 CST; 3h 34min ago
     Docs: man:firewalld(1)
 Main PID: 777 (firewalld)
   CGroup: /system.slice/firewalld.service
           └─777 /usr/bin/python2 -Es /usr/sbin/firewalld --nofork --nopid

Jan 12 20:59:54 node102 systemd[1]: Starting firewalld - dynamic firewall daemon...
Jan 12 20:59:56 node102 systemd[1]: Started firewalld - dynamic firewall daemon.
```
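The transcript jumps from firewalld being active to the disabled state shown next; the step in between is not captured. A minimal sketch of the commands that produce that transition on node101 (standard systemd usage, mirroring what is done remotely for node102/node103 below):

```bash
# Not captured above: stop firewalld for the current boot and keep it off after reboots.
systemctl stop firewalld
systemctl disable firewalld
```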
```bash
[root@node101 ~]# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:firewalld(1)
```

* After passwordless SSH has been set up (see the node configuration below)

```
[root@node101 ~]# ssh node102 "systemctl disable firewalld"
[root@node101 ~]# ssh node102 "reboot"
Connection to node102 closed by remote host.
[root@node101 ~]# ssh node102 "systemctl status firewalld"
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:firewalld(1)
[root@node101 ~]# ssh node103 "systemctl disable firewalld"
[root@node101 ~]# ssh node103 "reboot"
Connection to node103 closed by remote host.
[root@node101 ~]# ssh node103 "systemctl status firewalld"
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:firewalld(1)
```

#### 4. Hadoop

* Official downloads and documentation

```
https://archive.apache.org/dist/hadoop/common/hadoop-3.2.1/
https://hadoop.apache.org/docs/r3.2.1/index.html
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Java+Versions
Apache Hadoop from 3.0.x to 3.2.x now supports only Java 8
```

* Install Java

```bash
[root@node101 software]# tar zxvf jdk-8u321-linux-x64.tar.gz -C /usr/local/
[root@node101 software]# vi /etc/profile
export JAVA_HOME=/usr/local/jdk1.8.0_321
export PATH=$JAVA_HOME/bin:$PATH
[root@node101 software]# java -version
java version "1.8.0_321"
Java(TM) SE Runtime Environment (build 1.8.0_321-b07)
Java HotSpot(TM) 64-Bit Server VM (build 25.321-b07, mixed mode)
```

* Node configuration

```bash
[root@node101 software]# cat /etc/host
host.conf    hostname     hosts        hosts.allow  hosts.deny
[root@node101 software]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.17.101 node101
192.168.17.102 node102
192.168.17.103 node103
[root@node101 ~]# ssh-copy-id node101
[root@node101 ~]# ssh-copy-id node102
[root@node101 ~]# ssh-copy-id node103
[root@node101 ~]# cat ~/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC0Vuvx3gkxvx0E3V48yOgZvgxJwwyNYS1/oQgFjkAcpez4u9Pt7FnjjMdsaVXbt2rNxs5w+6wS4ES+aXrawqtI6juFWVd9jIaBDYISfluixBqj1G16qG6sfU4LIsCLxevtgk1/ibDbq/yx2LxuVXwPVOm7AutP2rUf2ejzPFk5qyxHK7CVQ6i/nyV/HnMLQDGRBELVu9iA/DSq0xBcGBdlGdEOFMhfUz9oZ8Pad/g/Eo7B4yFnJy7lip9zoLSJZLRuzPRDe9yZSoCdL8IfgXXBPfXc5gXH52XFLEQpP974bEIEUPdrWJXnTtPy/w4kzSxAj4PkVEDv3SOJ926hVJzn root@node101
[root@node102 ~]# cat ~/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC0Vuvx3gkxvx0E3V48yOgZvgxJwwyNYS1/oQgFjkAcpez4u9Pt7FnjjMdsaVXbt2rNxs5w+6wS4ES+aXrawqtI6juFWVd9jIaBDYISfluixBqj1G16qG6sfU4LIsCLxevtgk1/ibDbq/yx2LxuVXwPVOm7AutP2rUf2ejzPFk5qyxHK7CVQ6i/nyV/HnMLQDGRBELVu9iA/DSq0xBcGBdlGdEOFMhfUz9oZ8Pad/g/Eo7B4yFnJy7lip9zoLSJZLRuzPRDe9yZSoCdL8IfgXXBPfXc5gXH52XFLEQpP974bEIEUPdrWJXnTtPy/w4kzSxAj4PkVEDv3SOJ926hVJzn root@node101
[root@node103 ~]# cat ~/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC0Vuvx3gkxvx0E3V48yOgZvgxJwwyNYS1/oQgFjkAcpez4u9Pt7FnjjMdsaVXbt2rNxs5w+6wS4ES+aXrawqtI6juFWVd9jIaBDYISfluixBqj1G16qG6sfU4LIsCLxevtgk1/ibDbq/yx2LxuVXwPVOm7AutP2rUf2ejzPFk5qyxHK7CVQ6i/nyV/HnMLQDGRBELVu9iA/DSq0xBcGBdlGdEOFMhfUz9oZ8Pad/g/Eo7B4yFnJy7lip9zoLSJZLRuzPRDe9yZSoCdL8IfgXXBPfXc5gXH52XFLEQpP974bEIEUPdrWJXnTtPy/w4kzSxAj4PkVEDv3SOJ926hVJzn root@node101
```
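`ssh-copy-id` distributes an existing public key; generating the key pair on node101 is not shown in the transcript. A minimal sketch of that prerequisite step (standard OpenSSH usage, default key location):

```bash
# Run once on node101 before ssh-copy-id; -N '' sets an empty passphrase so the
# later ssh/scp commands stay non-interactive.
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
```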
* Install Hadoop

```bash
[root@node101 software]# tar zxvf hadoop-3.2.1.tar.gz -C /usr/local/
[root@node101 software]# vi /etc/profile
export JAVA_HOME=/usr/local/jdk1.8.0_321
export HADOOP_HOME=/usr/local/hadoop-3.2.1
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
[root@node101 software]# source /etc/profile
[root@node101 software]# hadoop version
Hadoop 3.2.1
Source code repository https://gitbox.apache.org/repos/asf/hadoop.git -r b3cbbb467e22ea829b3808f4b7b01d07e0bf3842
Compiled by rohithsharmaks on 2019-09-10T15:56Z
Compiled with protoc 2.5.0
From source with checksum 776eaf9eee9c0ffc370bcbc1888737
This command was run using /usr/local/hadoop-3.2.1/share/hadoop/common/hadoop-common-3.2.1.jar
```

* Configure Hadoop

```bash
[root@node101 hadoop]# cd $HADOOP_HOME/etc/hadoop
[root@node101 hadoop]# ls
capacity-scheduler.xml  core-site.xml  hadoop-metrics2.properties  hdfs-site.xml  httpfs-signature.secret  kms-env.sh  log4j.properties  mapred-queues.xml.template  ssl-client.xml.example  workers  yarnservice-log4j.properties
configuration.xsl  hadoop-env.cmd  hadoop-policy.xml  httpfs-env.sh  httpfs-site.xml  kms-log4j.properties  mapred-env.cmd  mapred-site.xml  ssl-server.xml.example  yarn-env.cmd  yarn-site.xml
container-executor.cfg  hadoop-env.sh  hadoop-user-functions.sh.example  httpfs-log4j.properties  kms-acls.xml  kms-site.xml  mapred-env.sh  shellprofile.d  user_ec_policies.xml.template  yarn-env.sh
```

* etc/hadoop/core-site.xml

```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://node101:9820</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
</configuration>
```

* etc/hadoop/hdfs-site.xml

```xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>node102:9868</value>
  </property>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>node101:9870</value>
  </property>
</configuration>
```

* etc/hadoop/hadoop-env.sh

```bash
export JAVA_HOME=/usr/local/jdk1.8.0_321
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
```

* etc/hadoop/workers

```bash
[root@node101 hadoop-3.2.1]# cat etc/hadoop/workers
node101
node102
node103
```

* Distribute to the other nodes

```bash
[root@node101 software]# scp -r ~/software/ node102:~
hadoop-3.2.1.tar.gz                           100%  343MB 143.9MB/s   00:02
jdk-8u321-linux-x64.tar.gz                    100%  140MB 146.8MB/s   00:00
[root@node101 software]# scp -r ~/software/ node103:~
hadoop-3.2.1.tar.gz                           100%  343MB  45.0MB/s   00:07
jdk-8u321-linux-x64.tar.gz                    100%  140MB  41.6MB/s   00:03
[root@node101 software]# ssh node102 "ls ~/software"
hadoop-3.2.1.tar.gz
jdk-8u321-linux-x64.tar.gz
[root@node101 software]# ssh node103 "ls ~/software"
hadoop-3.2.1.tar.gz
jdk-8u321-linux-x64.tar.gz
[root@node101 software]# ssh node102 "tar zxvf ~/software/jdk-8u321-linux-x64.tar.gz -C /usr/local/"
[root@node101 software]# ssh node103 "tar zxvf ~/software/jdk-8u321-linux-x64.tar.gz -C /usr/local/"
[root@node101 software]# ssh node102 "tar zxvf ~/software/hadoop-3.2.1.tar.gz -C /usr/local/"
[root@node101 software]# ssh node103 "tar zxvf ~/software/hadoop-3.2.1.tar.gz -C /usr/local/"
[root@node101 software]# scp -r /usr/local/hadoop-3.2.1/etc/hadoop/ node102:/usr/local/hadoop-3.2.1/etc/
[root@node101 software]# scp -r /usr/local/hadoop-3.2.1/etc/hadoop/ node103:/usr/local/hadoop-3.2.1/etc/
[root@node101 software]# scp /etc/profile node102:/etc/
profile                                       100% 1970     1.7MB/s   00:00
[root@node101 software]# ssh node102 "source /etc/profile"
[root@node101 software]# scp /etc/profile node103:/etc/
profile                                       100% 1970     1.5MB/s   00:00
[root@node101 software]# ssh node103 "source /etc/profile"
[root@node101 ~]# scp /etc/hosts node102:/etc
hosts                                         100%  237   280.4KB/s   00:00
[root@node101 ~]# scp /etc/hosts node103:/etc
hosts                                         100%  237   286.3KB/s   00:00
```
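After the archives, configuration and /etc/profile have been distributed, a quick loop confirms that Java and Hadoop resolve on every node (a sketch, reusing the node names above):

```bash
# Print the Java and Hadoop versions seen by a login-like shell on each node.
for host in node101 node102 node103; do
  echo "== $host =="
  ssh "$host" "source /etc/profile; java -version 2>&1 | head -1; hadoop version | head -1"
done
```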
* Start

```bash
[root@node101 software]# hdfs namenode -format
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = node101/192.168.17.101
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 3.2.1
----
2023-01-13 00:23:36,589 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at node101/192.168.17.101
************************************************************/
[root@node101 sbin]# start-all.sh
Starting namenodes on [node101]
Last login: Fri Jan 13 01:59:21 CST 2023 on pts/0
Starting datanodes
Last login: Fri Jan 13 01:59:31 CST 2023 on pts/0
Starting secondary namenodes [node102]
Last login: Fri Jan 13 01:59:33 CST 2023 on pts/0
Starting resourcemanager
Last login: Fri Jan 13 01:59:36 CST 2023 on pts/0
Starting nodemanagers
Last login: Fri Jan 13 01:59:40 CST 2023 on pts/0
[root@node101 sbin]# jps
55552 Jps
54419 DataNode
54196 NameNode
55191 NodeManager
54922 ResourceManager
```

* Web UI: http://192.168.17.101:9870/dfshealth.html#tab-overview

![image-20230113004132178](./data/hdfs_start.png)

* Web UI: http://192.168.17.102:9868/status.html

![image-20230113171422318](./data/hdfs_start_secondary.png)

### Afternoon

#### 1. Software to install

```bash
mysql-community-server-8.0.26-1.el7.x86_64
hive
zookeeper-3.6.3
apache-flume-1.9.0
kafka
```

#### 2. MySQL

* Check for previously installed conflicting packages

```bash
rpm -qa | grep mariadb
rpm -e mariadb-libs-5.5.64-1.el7.x86_64 --nodeps
rpm -qa | grep mysql
```

* Upload the RPM bundle and unpack it

```bash
[root@node101 software]# mkdir mysql-8.0.26-1.el7.x86_64
[root@node101 software]# tar xvf mysql-8.0.26-1.el7.x86_64.rpm-bundle.tar -C mysql-8.0.26-1.el7.x86_64
```

* Install

```bash
rpm -ivh mysql-community-common-8.0.26-1.el7.x86_64.rpm
rpm -ivh mysql-community-client-plugins-8.0.26-1.el7.x86_64.rpm
rpm -ivh mysql-community-libs-8.0.26-1.el7.x86_64.rpm
rpm -ivh mysql-community-client-8.0.26-1.el7.x86_64.rpm
yum install -y net-tools
yum install -y perl
yum -y install numactl
rpm -ivh mysql-community-server-8.0.26-1.el7.x86_64.rpm
```

* Change the root password

```bash
[root@node101 mysql-8.0.26-1.el7.x86_64]# systemctl start mysqld
[root@node101 mysql-8.0.26-1.el7.x86_64]# systemctl status mysqld
● mysqld.service - MySQL Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2023-01-13 01:08:19 CST; 7s ago
     Docs: man:mysqld(8)
           http://dev.mysql.com/doc/refman/en/using-systemd.html
  Process: 51229 ExecStartPre=/usr/bin/mysqld_pre_systemd (code=exited, status=0/SUCCESS)
 Main PID: 51362 (mysqld)
   Status: "Server is operational"
   CGroup: /system.slice/mysqld.service
           └─51362 /usr/sbin/mysqld
[root@node101 mysql-8.0.26-1.el7.x86_64]# grep password /var/log/mysqld.log
2023-01-12T17:08:16.563355Z 6 [Note] [MY-010454] [Server] A temporary password is generated for root@localhost: MTd1=lbLkv#7
```
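The transcript stops at retrieving the temporary root password; the actual password change is not captured. A minimal sketch of that remaining step (standard MySQL 8.0 client usage; the new password is only a placeholder and must satisfy the default validate_password policy):

```bash
# Log in with the temporary password from /var/log/mysqld.log and set a new one.
# 'MyNewPassw0rd!' is a placeholder, not a value from the original course material.
mysql -uroot -p'MTd1=lbLkv#7' --connect-expired-password \
  -e "ALTER USER 'root'@'localhost' IDENTIFIED BY 'MyNewPassw0rd!';"
```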
#### 3. Hive

Refer to the accompanying PDF document.

#### 4. zookeeper-3.6.3

* Install on the first node

```bash
[root@node101 software]# wget -c https://archive.apache.org/dist/zookeeper/zookeeper-3.6.3/apache-zookeeper-3.6.3-bin.tar.gz
[root@node101 software]# tar zxvf apache-zookeeper-3.6.3-bin.tar.gz -C /usr/local/
[root@node101 software]# vi /etc/profile
export ZOOKEEPER_HOME=/usr/local/apache-zookeeper-3.6.3-bin
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin:$ZOOKEEPER_HOME/bin:$PATH
[root@node101 software]# source /etc/profile
[root@node101 software]# cd $ZOOKEEPER_HOME/conf
[root@node101 conf]# vi zoo.cfg
# tickTime: basic time unit in milliseconds; initLimit and syncLimit are multiples of it
tickTime=2000
# initLimit: time allowed for followers to connect to and sync with the leader at startup
initLimit=10
# syncLimit: time allowed for a request/response (heartbeat) exchange in normal operation
syncLimit=5
# dataDir: ZooKeeper storage path; the zkData directory is created below
dataDir=/usr/local/apache-zookeeper-3.6.3-bin/zkData
# clientPort: the port clients connect to
clientPort=2181
# the three server entries; no trailing space after 3888, otherwise startup fails
server.1=node101:2888:3888
server.2=node102:2888:3888
server.3=node103:2888:3888
[root@node101 conf]# mkdir /usr/local/apache-zookeeper-3.6.3-bin/zkData
[root@node101 conf]# echo 1 > /usr/local/apache-zookeeper-3.6.3-bin/zkData/myid
[root@node101 software]# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/apache-zookeeper-3.6.3-bin/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@node101 software]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/apache-zookeeper-3.6.3-bin/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Error contacting service. It is probably not running.
```

* Distribute

```bash
[root@node101 software]# scp apache-zookeeper-3.6.3-bin.tar.gz node102:~/software
apache-zookeeper-3.6.3-bin.tar.gz             100%   12MB  98.6MB/s   00:00
[root@node101 software]# scp apache-zookeeper-3.6.3-bin.tar.gz node103:~/software
apache-zookeeper-3.6.3-bin.tar.gz             100%   12MB  87.9MB/s   00:00
[root@node101 software]# ssh node102 "tar zxvf ~/software/apache-zookeeper-3.6.3-bin.tar.gz -C /usr/local/"
[root@node101 software]# ssh node103 "tar zxvf ~/software/apache-zookeeper-3.6.3-bin.tar.gz -C /usr/local/"
[root@node101 software]# ssh node102 "mkdir /usr/local/apache-zookeeper-3.6.3-bin/zkData"
[root@node101 software]# ssh node103 "mkdir /usr/local/apache-zookeeper-3.6.3-bin/zkData"
[root@node101 software]# ssh node102 "echo 2 > /usr/local/apache-zookeeper-3.6.3-bin/zkData/myid"
[root@node101 software]# ssh node103 "echo 3 > /usr/local/apache-zookeeper-3.6.3-bin/zkData/myid"
[root@node101 software]# scp $ZOOKEEPER_HOME/conf/zoo.cfg node102:/usr/local/apache-zookeeper-3.6.3-bin/conf
zoo.cfg                                       100% 1259     1.2MB/s   00:00
[root@node101 software]# scp $ZOOKEEPER_HOME/conf/zoo.cfg node103:/usr/local/apache-zookeeper-3.6.3-bin/conf
zoo.cfg                                       100% 1259     1.1MB/s   00:00
[root@node101 software]# scp /etc/profile node102:/etc/
profile                                       100% 2115     1.7MB/s   00:00
[root@node101 software]# scp /etc/profile node103:/etc/
profile                                       100% 2115     2.0MB/s   00:00
[root@node101 software]# ssh node102 "source /etc/profile"
[root@node101 software]# ssh node103 "source /etc/profile"
[root@node101 software]# ssh node102 "source /etc/profile;zkServer.sh start"
ZooKeeper JMX enabled by default
Using config: /usr/local/apache-zookeeper-3.6.3-bin/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@node101 software]# ssh node103 "source /etc/profile;zkServer.sh start"
ZooKeeper JMX enabled by default
Using config: /usr/local/apache-zookeeper-3.6.3-bin/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@node101 conf]# ssh node101 "source /etc/profile;jps"
102611 QuorumPeerMain
76227 Kafka
54419 DataNode
54196 NameNode
55191 NodeManager
54922 ResourceManager
60606 Jps
[root@node101 conf]# ssh node102 "source /etc/profile;jps"
51544 SecondaryNameNode
48345 DataNode
48857 QuorumPeerMain
49754 Kafka
51711 Jps
[root@node101 conf]# ssh node103 "source /etc/profile;jps"
48451 QuorumPeerMain
49879 Jps
48076 DataNode
```
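The "Error contacting service. It is probably not running." message right after starting only node101 is expected: a three-server ensemble has no quorum until at least two servers are up, so the status command cannot reach a serving instance yet. Once all three are started, each server can also be probed directly with the `srvr` four-letter command, which is on ZooKeeper's default whitelist (a sketch, assuming `nc` from the nmap-ncat package is installed):

```bash
# Ask each server for its version, connection stats and role (Mode: leader/follower).
for host in node101 node102 node103; do
  echo "== $host =="
  echo srvr | nc "$host" 2181
done
```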
* Check each server's role

```bash
[root@node101 conf]# ssh node101 "source /etc/profile;zkServer.sh status"
ZooKeeper JMX enabled by default
Using config: /usr/local/apache-zookeeper-3.6.3-bin/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower
[root@node101 conf]# ssh node102 "source /etc/profile;zkServer.sh status"
ZooKeeper JMX enabled by default
Using config: /usr/local/apache-zookeeper-3.6.3-bin/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: leader
[root@node101 conf]# ssh node103 "source /etc/profile;zkServer.sh status"
ZooKeeper JMX enabled by default
Using config: /usr/local/apache-zookeeper-3.6.3-bin/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower
```

#### 5. apache-flume-1.9.0

* Install

```bash
[root@node101 software]# tar zxvf ~/software/apache-flume-1.9.0-bin.tar.gz -C /usr/local/
[root@node101 software]# vi /etc/profile
export FLUME_HOME=/usr/local/apache-flume-1.9.0-bin
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin:$ZOOKEEPER_HOME/bin:$FLUME_HOME/bin:$PATH
[root@node101 software]# source /etc/profile
```

* Configure

```bash
[root@node101 software]# cd $FLUME_HOME/conf
[root@node101 conf]# cp flume-env.sh.template flume-env.sh
[root@node101 conf]# vi flume-env.sh
export JAVA_HOME=/usr/local/jdk1.8.0_321
```

* Verify

```bash
[root@node101 conf]# flume-ng version
Flume 1.9.0
Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
Revision: d4fcab4f501d41597bc616921329a4339f73585e
Compiled by fszabo on Mon Dec 17 20:45:25 CET 2018
From source with checksum 35db629a3bda49d23e9b3690c80737f9
```
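`flume-ng version` only proves the binary starts. For a quick functional check, the standard netcat-to-logger example from the Flume user guide can be run as-is (a sketch; the file name `example.conf` and agent name `a1` are illustrative, not part of the original course material):

```bash
# Define a minimal agent: netcat source -> memory channel -> logger sink.
cat > $FLUME_HOME/conf/example.conf <<'EOF'
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

a1.channels.c1.type = memory

a1.sinks.k1.type = logger
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
EOF

# Run the agent in the foreground; lines sent to localhost:44444 (e.g. via telnet or nc)
# are printed to the console by the logger sink.
flume-ng agent --conf $FLUME_HOME/conf --conf-file $FLUME_HOME/conf/example.conf \
  --name a1 -Dflume.root.logger=INFO,console
```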
#### 6. Kafka

* Install: https://kafka.apache.org/downloads

```bash
[root@node101 software]# wget -c https://archive.apache.org/dist/kafka/2.8.0/kafka_2.12-2.8.0.tgz
[root@node101 software]# tar zxvf ~/software/kafka_2.12-2.8.0.tgz -C /usr/local/
```

* Configure

```bash
[root@node101 software]# vi /etc/profile
export KAFKA_HOME=/usr/local/kafka_2.12-2.8.0
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin:$ZOOKEEPER_HOME/bin:$FLUME_HOME/bin:$KAFKA_HOME/bin:$PATH
[root@node101 software]# source /etc/profile
[root@node101 software]# cd $KAFKA_HOME/config
[root@node101 config]# cp server.properties server.properties.bak
[root@node101 config]# diff -u1 server.properties server.properties.bak
--- server.properties       2023-01-13 03:47:20.013889361 +0800
+++ server.properties.bak   2023-01-13 03:44:20.797547580 +0800
@@ -20,3 +20,3 @@
 # The id of the broker. This must be set to a unique integer for each broker.
-broker.id=1
+broker.id=0
@@ -59,3 +59,3 @@
 # A comma separated list of directories under which to store log files
-log.dirs=/usr/local/kafka_2.12-2.8.0/logs
+log.dirs=/tmp/kafka-logs
@@ -122,3 +122,3 @@
 # root directory for all kafka znodes.
-zookeeper.connect=node101:2181,node102:2181,node103:2181
+zookeeper.connect=localhost:2181
```

* Distribute

```bash
[root@node101 software]# scp kafka_2.12-2.8.0.tgz node102:~/software/
kafka_2.12-2.8.0.tgz                          100%   68MB 109.7MB/s   00:00
[root@node101 software]# scp kafka_2.12-2.8.0.tgz node103:~/software/
kafka_2.12-2.8.0.tgz                          100%   68MB 122.1MB/s   00:00
[root@node101 software]# ssh node102 "tar zxvf ~/software/kafka_2.12-2.8.0.tgz -C /usr/local/"
[root@node101 software]# ssh node103 "tar zxvf ~/software/kafka_2.12-2.8.0.tgz -C /usr/local/"
[root@node101 software]# scp /etc/profile node102:/etc/
profile                                       100% 2245     2.1MB/s   00:00
[root@node101 software]# scp /etc/profile node103:/etc/
profile                                       100% 2245     2.3MB/s   00:00
### set broker.id=2 before copying
[root@node101 config]# scp server.properties.dist node102:/usr/local/kafka_2.12-2.8.0/config/server.properties
### set broker.id=3 before copying
[root@node101 config]# scp server.properties.dist node103:/usr/local/kafka_2.12-2.8.0/config/server.properties
[root@node101 config]# ssh node102 "grep broker /usr/local/kafka_2.12-2.8.0/config/server.properties"
# The id of the broker. This must be set to a unique integer for each broker.
broker.id=2
# Hostname and port the broker will advertise to producers and consumers. If not set,
# the brokers.
[root@node101 config]# ssh node103 "grep broker /usr/local/kafka_2.12-2.8.0/config/server.properties"
# The id of the broker. This must be set to a unique integer for each broker.
broker.id=3
# Hostname and port the broker will advertise to producers and consumers. If not set,
# the brokers.
[root@node101 kafka_2.12-2.8.0]# kafka-server-start.sh -daemon $KAFKA_HOME/config/server.properties
```

* Run

```bash
[root@node101 kafka_2.12-2.8.0]# ssh node101 "source /etc/profile;kafka-server-start.sh -daemon $KAFKA_HOME/config/server.properties"
[root@node101 kafka_2.12-2.8.0]# ssh node102 "source /etc/profile;kafka-server-start.sh -daemon $KAFKA_HOME/config/server.properties"
[root@node101 kafka_2.12-2.8.0]# ssh node103 "source /etc/profile;kafka-server-start.sh -daemon $KAFKA_HOME/config/server.properties"
[root@node101 kafka_2.12-2.8.0]# ssh node101 "source /etc/profile;jps"
102611 QuorumPeerMain
76227 Kafka
54419 DataNode
54196 NameNode
55191 NodeManager
54922 ResourceManager
81438 Jps
[root@node101 kafka_2.12-2.8.0]# ssh node102 "source /etc/profile;jps"
48345 DataNode
48857 QuorumPeerMain
49754 Kafka
49789 Jps
[root@node101 kafka_2.12-2.8.0]# ssh node103 "source /etc/profile;jps"
48451 QuorumPeerMain
49237 Kafka
49273 Jps
48076 DataNode
```
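With all three brokers running, a short smoke test confirms the cluster accepts and serves messages (a sketch; the topic name `test` is arbitrary, and the brokers are assumed to listen on the default port 9092):

```bash
# Create a topic replicated across the three brokers and inspect its partition layout.
kafka-topics.sh --create --bootstrap-server node101:9092 \
  --replication-factor 3 --partitions 3 --topic test
kafka-topics.sh --describe --bootstrap-server node101:9092 --topic test

# In one shell, type a few messages (Ctrl-C to quit):
kafka-console-producer.sh --bootstrap-server node101:9092 --topic test

# In another shell, read them back from the beginning:
kafka-console-consumer.sh --bootstrap-server node101:9092 --topic test --from-beginning
```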
## Day 2: Scripts

### Afternoon

* hadoop

```bash
#!/bin/bash
case $1 in
"start"){
    source /etc/profile
    $HADOOP_HOME/sbin/start-dfs.sh
    $HADOOP_HOME/sbin/start-yarn.sh
};;
"stop"){
    source /etc/profile
    $HADOOP_HOME/sbin/stop-dfs.sh
    $HADOOP_HOME/sbin/stop-yarn.sh
};;
esac
```

* zookeeper

```bash
#!/bin/bash
case $1 in
"start"){
    for i in node101 node102 node103
    do
        echo "------- $i zookeeper -------"
        ssh $i "source /etc/profile; $ZOOKEEPER_HOME/bin/zkServer.sh start"
    done
};;
"stop"){
    for i in node101 node102 node103
    do
        echo "------- $i zookeeper -------"
        ssh $i "source /etc/profile; $ZOOKEEPER_HOME/bin/zkServer.sh stop"
    done
};;
"status"){
    for i in node101 node102 node103
    do
        echo "------- $i zookeeper -------"
        ssh $i "source /etc/profile; $ZOOKEEPER_HOME/bin/zkServer.sh status"
    done
};;
esac
```

* kafka

```bash
#!/bin/bash
case $1 in
"start"){
    for i in node101 node102 node103
    do
        echo "-------- starting Kafka on $i --------"
        ssh $i "source /etc/profile; $KAFKA_HOME/bin/kafka-server-start.sh -daemon $KAFKA_HOME/config/server.properties"
    done
};;
"stop"){
    for i in node101 node102 node103
    do
        echo "-------- stopping Kafka on $i --------"
        ssh $i "source /etc/profile; $KAFKA_HOME/bin/kafka-server-stop.sh"
    done
};;
esac
```

## Miscellaneous

### Commands

```bash
[root@192 ~]# yum install git -y
[root@192 software]# yum install wget -y
[root@192 software]# yum install net-tools -y
[root@192 gitee]# git clone https://gitee.com/opennlp/jinken-bigdata-train.git
[root@192 software]# wget -c https://mirrors.bfsu.edu.cn/anaconda/miniconda/Miniconda3-py37_22.11.1-1-Linux-x86_64.sh --no-check-certificate
[root@192 jinken-bigdata-train]# conda env remove -n hadoop_py3.7
[root@192 jinken-bigdata-train]# conda create -n hadoop_py3.7 python=3.7
(base) [hadoop@centos101 ~]$ conda config --set auto_activate_base false
[root@192 jinken-bigdata-train]# conda activate hadoop_py3.7
(hadoop_py3.7) [root@192 software]# cat ~/.config/pip/pip.conf
[global]
trusted-host=mirrors.aliyun.com
index-url=https://mirrors.aliyun.com/pypi/simple
(hadoop_py3.7) [root@192 jinken-bigdata-train]# jupyter-lab --allow-root --no-browser --ip 0.0.0.0
[root@192 ~]# firewall-cmd --zone=public --add-port=8888/tcp --permanent
success
[root@192 ~]# firewall-cmd --reload
success
[hadoop@centos101 hadoop-3.2.1]$ sudo firewall-cmd --list-all
[sudo] password for hadoop:
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: ens33
  sources:
  services: dhcpv6-client ssh
  ports: 8888/tcp
  protocols:
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:
```
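The pip mirror configuration above is only shown being read with `cat`; creating it is not captured. A minimal sketch that writes the same file for the current user:

```bash
# Write the pip mirror settings shown above.
mkdir -p ~/.config/pip
cat > ~/.config/pip/pip.conf <<'EOF'
[global]
trusted-host=mirrors.aliyun.com
index-url=https://mirrors.aliyun.com/pypi/simple
EOF
```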