Section 1: Setup

1. Install the prerequisite packages:
   yum install -y epel-release
   yum install -y psmisc nc net-tools rsync vim lrzsz ntp libzstd openssl-static tree iotop git

2. Stop and disable the firewall:
   systemctl stop firewalld
   systemctl disable firewalld

3. Create the bigdata user:
   useradd bigdata
   passwd bigdata

4. Grant the user sudo privileges:
   vim /etc/sudoers
   ## Allow root to run any commands anywhere
   root      ALL=(ALL)     ALL
   bigdata   ALL=(ALL)     NOPASSWD:ALL

5. Create directories under /opt and change their owner and group:
   mkdir /opt/module
   mkdir /opt/software
   chown bigdata:bigdata /opt/module
   chown bigdata:bigdata /opt/software

6. Remove the OpenJDK bundled with the VM image:
   rpm -qa | grep -i java | xargs -n1 rpm -e --nodeps
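The uninstall one-liner in step 6 works by piping the matched package list through xargs -n1, which invokes rpm -e once per package name. The mechanics can be sketched with a harmless stand-in for rpm (echo here; the package names are only illustrative):

```shell
# Stand-in for: rpm -qa | grep -i java | xargs -n1 rpm -e --nodeps
# 'echo removing' plays the role of 'rpm -e --nodeps'; one call per package.
printf 'java-1.8.0-openjdk\ntzdata-java\n' | xargs -n1 echo removing
# -> removing java-1.8.0-openjdk
# -> removing tzdata-java
```

Without -n1, xargs would pass all names to a single invocation; -n1 guarantees one rpm call per package, so one broken package does not abort the rest.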
7. Give the cloned VM a static IP:
   vim /etc/sysconfig/network-scripts/ifcfg-ens33
   Change the file to:
   DEVICE=ens33
   TYPE=Ethernet
   ONBOOT=yes
   BOOTPROTO=static
   NAME="ens33"
   IPADDR=192.168.1.102
   PREFIX=24
   GATEWAY=192.168.1.2
   DNS1=192.168.1.2

8. Check the VM's virtual network settings: Edit -> Virtual Network Editor -> VMnet8.

9. Set the clone's hostname:
   hostnamectl --static set-hostname bigdata
   or edit it directly:
   vim /etc/hostname
   bigdata

10. Map the hostname to the IP in the hosts file:
   vim /etc/hosts
   192.168.1.102 bigdata

11. On Windows 10 the hosts file cannot be edited in place; copy it out, edit and save it, then copy it back:
   (a) Go to C:\Windows\System32\drivers\etc
   (b) Copy the hosts file to the desktop
   (c) Open the copy and add:
       192.168.1.102 bigdata

12. Install the JDK.
   1) Check that the packages were uploaded to the opt directory:
      ls /opt/software/
      hadoop-3.1.3.tar.gz  jdk-8u212-linux-x64.tar.gz
   2) Extract the JDK:
      tar -zxvf jdk-8u212-linux-x64.tar.gz -C /opt/module/
   3) Configure the JDK environment variables:
      vim /etc/profile.d/my_env.sh
      #JAVA_HOME
      export JAVA_HOME=/opt/module/jdk1.8.0_212
      export PATH=$PATH:$JAVA_HOME/bin
   4) Reload the environment:
      source /etc/profile
   5) Verify the installation:
      java -version

13. Install Hadoop.
   1) Extract it:
      tar -zxvf hadoop-3.1.3.tar.gz -C /opt/module/
      ls /opt/module/
      hadoop-3.1.3
   2) Configure the environment:
      vim /etc/profile.d/my_env.sh
      #HADOOP_HOME
      export HADOOP_HOME=/opt/module/hadoop-3.1.3
      export PATH=$PATH:$HADOOP_HOME/bin
      export PATH=$PATH:$HADOOP_HOME/sbin
      source /etc/profile
   3) Verify the installation:
      hadoop version
      Hadoop 3.1.3

14. Local (standalone) mode.
   1) Create a wcinput directory under hadoop-3.1.3:
      mkdir wcinput
   2) Create a word.txt file inside wcinput:
      cd wcinput
      vim word.txt
      hadoop yarn
      hadoop mapreduce
   3) Go back to the Hadoop directory /opt/module/hadoop-3.1.3 and run the example job:
      hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount wcinput wcoutput
   4) Check the result:
      cat wcoutput/part-r-00000
      hadoop     2
      mapreduce  1
      yarn       1

15. Fully distributed mode.
   1) Distribute the JDK and Hadoop to bigdata1 and bigdata2:
      scp -r /opt/module/jdk1.8.0_212 bigdata@bigdata1:/opt/module
      scp -r /opt/module/hadoop-3.1.3 bigdata@bigdata1:/opt/module
   2) Write an xsync cluster-distribution script:
      cd /home/bigdata
      mkdir bin
      cd bin
      vim xsync

      #!/bin/bash
      #1. Check the argument count
      if [ $# -lt 1 ]
      then
          echo Not Enough Arguments!
          exit
      fi
      #2. Loop over every machine in the cluster
      for host in bigdata bigdata1 bigdata2
      do
          echo ==================== $host ====================
          #3. Loop over every argument and send each one
          for file in $@
          do
              #4. Check that the file exists
              if [ -e $file ]
              then
                  #5. Get the parent directory
                  pdir=$(cd -P $(dirname $file); pwd)
                  #6. Get the file name
                  fname=$(basename $file)
                  ssh $host "mkdir -p $pdir"
                  rsync -av $pdir/$fname $host:$pdir
              else
                  echo $file does not exist!
              fi
          done
      done

      Make it executable and copy it onto the PATH:
      chmod +x xsync
      cp xsync /bin/
   3) Passwordless SSH login:
      ssh-keygen -t rsa
      (press Enter through all the prompts)
      Then copy the public key to every machine you want to log in to without a password:
      ssh-copy-id bigdata
      ssh-copy-id bigdata1
      ssh-copy-id bigdata2
   4) Configure the cluster.
      (1) Configure core-site.xml:
          cd $HADOOP_HOME/etc/hadoop
          vim core-site.xml
          <configuration>
              <property>
                  <name>fs.defaultFS</name>
                  <value>hdfs://bigdata:9820</value>
              </property>
              <property>
                  <name>hadoop.tmp.dir</name>
                  <value>/opt/module/hadoop-3.1.3/data</value>
              </property>
              <property>
                  <name>hadoop.http.staticuser.user</name>
                  <value>bigdata</value>
              </property>
              <property>
                  <name>hadoop.proxyuser.bigdata.hosts</name>
                  <value>*</value>
              </property>
              <property>
                  <name>hadoop.proxyuser.bigdata.groups</name>
                  <value>*</value>
              </property>
          </configuration>
      (2) Configure hdfs-site.xml:
          vim hdfs-site.xml
          <configuration>
              <property>
                  <name>dfs.namenode.http-address</name>
                  <value>bigdata:9870</value>
              </property>
              <property>
                  <name>dfs.namenode.secondary.http-address</name>
                  <value>bigdata1:9868</value>
              </property>
          </configuration>
      (3) Configure yarn-site.xml:
          vim yarn-site.xml
          <configuration>
              <property>
                  <name>yarn.nodemanager.aux-services</name>
                  <value>mapreduce_shuffle</value>
              </property>
              <property>
                  <name>yarn.resourcemanager.hostname</name>
                  <value>bigdata2</value>
              </property>
              <property>
                  <name>yarn.nodemanager.env-whitelist</name>
                  <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
              </property>
              <property>
                  <name>yarn.scheduler.minimum-allocation-mb</name>
                  <value>512</value>
              </property>
              <property>
                  <name>yarn.scheduler.maximum-allocation-mb</name>
                  <value>4096</value>
              </property>
          </configuration>
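The last two properties bound YARN container sizes. Under the default scheduler, a container memory request is rounded up to a multiple of yarn.scheduler.minimum-allocation-mb and capped at yarn.scheduler.maximum-allocation-mb. A minimal sketch of that rounding, assuming the 512/4096 values configured above:

```shell
# Approximate how the scheduler normalizes a requested MB figure:
# round up to the next multiple of the minimum, then cap at the maximum.
min=512
max=4096
normalize() {
  req=$1
  alloc=$(( (req + min - 1) / min * min ))   # round up to a multiple of min
  if [ "$alloc" -gt "$max" ]; then alloc=$max; fi
  echo "$alloc"
}
normalize 1000   # -> 1024
normalize 5000   # -> 4096 (capped at the maximum)
```

So a 1000 MB request actually reserves a 1024 MB container, and anything above 4096 MB is either capped or rejected depending on the scheduler; keep these bounds in sync with the memory you give each NodeManager.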