1: Cluster hostname configuration
Add entries to the /etc/hosts file:
vim /etc/hosts
192.168.93.201 master201
192.168.93.202 slave202
192.168.93.203 slave203
192.168.93.204 slave204
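A quick way to confirm that the names resolve correctly is to ping each node by name (the hostnames are the ones defined above), for example:
ping -c 1 master201
ping -c 1 slave202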
2: Create a hadoop user
Installing and configuring Hadoop and related software directly as root is a security risk. It is better to create a dedicated hadoop user and hand all the work to that user, although things will also work without one.
(1) Add the hadoop user
sudo adduser hadoop
passwd hadoop
Use hadoop as both the account name and the password so it is easy to remember.
Creating the hadoop user also creates a hadoop group; next, add the hadoop user to the hadoop group by entering:
sudo usermod -a -G hadoop hadoop
Then grant the hadoop user root privileges so that it can use the sudo command.
Switch to a user that can become root and enter:
sudo gedit /etc/sudoers
(2) Grant the hadoop user sudo privileges
sudo vi /etc/sudoers
In a graphical session you can use the first command (gedit, a text editor bundled with Ubuntu); in a terminal-only session use the second command. Look up how to use the vi editor yourself if you are not familiar with it.
Edit the file as follows:
# User privilege specification
root ALL=(ALL) ALL
hadoop ALL=(ALL) ALL
Save and exit; the hadoop user now has root privileges (its line mirrors the root ALL=(ALL) ALL entry).
Another way to edit the file:
visudo
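To check that the new account and its sudo rights work, for example:
su - hadoop
sudo whoami
The second command should print root if the sudoers entry took effect.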
3: Network setup (NAT mode); the machines must be able to reach the network
(1) Configure the network interface: vim /etc/sysconfig/network-scripts/ifcfg-eno16777736
TYPE=Ethernet
BOOTPROTO=none
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=no
IPV6_AUTOCONF=no
IPV6_DEFROUTE=no
IPV6_FAILURE_FATAL=no
NAME=eno16777736
UUID=17a315a8-df51-4de0-a587-d1655acc8187
DEVICE=eno16777736
PEERDNS=yes
PEERROUTES=yes
IPV6_PEERDNS=no
IPV6_PEERROUTES=no
ONBOOT=yes
IPADDR=192.168.93.202
PREFIX=24
GATEWAY=192.168.93.2
DNS=192.168.93.2
(2) Set the DNS server
vim /etc/resolv.conf
nameserver 192.168.93.2
(3) Test connectivity
service network restart
ping www.baidu.com
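If the ping by domain name fails, pinging the NAT gateway configured above (192.168.93.2) first helps separate a DNS problem from a basic connectivity problem:
ping -c 3 192.168.93.2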
(4) If the network is not reachable
Check whether VMware's DHCP and NAT services are running properly.
On Windows, VMware provides the VMware DHCP Service and the VMware NAT Service.
(5) Disable the firewall
sudo systemctl disable firewalld.service   // disable start at boot
sudo systemctl stop firewalld.service   // stop the firewall
[CentOS 7] firewall commands:
systemctl enable firewalld.service   // enable start at boot
sudo systemctl disable firewalld.service   // disable start at boot
sudo systemctl start firewalld.service   // start the firewall
sudo systemctl stop firewalld.service   // stop the firewall
sudo systemctl status firewalld.service   // check firewall status
4: Download and install the JDK
Upload or download the software packages onto CentOS, either with a download command or with a tool such as WinSCP.
(1) Remove the bundled JDK
CentOS 7 ships with an OpenJDK. You may remove it or leave it in place.
rpm -qa | grep java
sudo rpm -e --nodeps java-1.8.0-openjdk
sudo rpm -e --nodeps java-1.8.0-openjdk-headless
sudo rpm -e --nodeps java-1.7.0-openjdk
sudo rpm -e --nodeps java-1.7.0-openjdk-headless
A sample session:
[hadoop@localhost local]$ sudo rpm -e --nodeps java-1.8.0-openjdk
[hadoop@localhost local]$ sudo rpm -e --nodeps java-1.8.0-openjdk-headless
[hadoop@localhost local]$ rpm -qa | grep java
javapackages-tools-3.4.1-11.el7.noarch
tzdata-java-2015g-1.el7.noarch
java-1.7.0-openjdk-1.7.0.91-2.6.2.3.el7.x86_64
java-1.7.0-openjdk-headless-1.7.0.91-2.6.2.3.el7.x86_64
python-javapackages-3.4.1-11.el7.noarch
[hadoop@localhost local]$ java -version
java version "1.7.0_91"
OpenJDK Runtime Environment (rhel-2.6.2.3.el7-x86_64 u91-b00)
OpenJDK 64-Bit Server VM (build 24.91-b01, mixed mode)
[hadoop@localhost local]$ sudo rpm -e --nodeps java-1.7.0-openjdk
[hadoop@localhost local]$ sudo rpm -e --nodeps java-1.7.0-openjdk-headless
[hadoop@localhost local]$ java -version
bash: java: command not found...
(2) Install the JDK
Extract the JDK archive into /usr/local and give the hadoop user ownership of it (the archive name depends on your download; the 8u151 archive extracts to jdk1.8.0_151):
tar -zxf jdk-8u151-linux-x64.tar.gz -C /usr/local/
sudo chown -R hadoop:hadoop /usr/local/jdk1.8.0_151
There are three files in which environment variables can be configured:
/etc/profile;
~/.bashrc;
~/.bash_profile
Each of these files has its own purpose; configuring one of them is enough.
vim ~/.bashrc
export JAVA_HOME=/usr/local/jdk1.8.0_151
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
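Reload the file so the variables take effect in the current shell, then verify:
source ~/.bashrc
java -version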
5: Download and install Hadoop
(1) Which user to use
If you install under /usr/local, you need to do it as root, because that directory is owned by root.
If you install under your own home directory, there is no need to switch to root.
tar -xzf hadoop-2.7.3.tar.gz -C /usr/local
It does not matter whether Hadoop is installed by root or by a regular user;
what matters is that the user's environment variables are configured correctly.
(2) Configure environment variables
For system-wide environment variables, edit the following as root:
vim /etc/profile
For per-user environment variables there is no need to switch to root; just edit your own file:
vim ~/.bashrc
The environment variable settings are as follows:
export JAVA_HOME=/usr/local/jdk1.8.0_151
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
export HADOOP_HOME=/usr/local/hadoop-2.7.3
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
(3) Verify the installation
The configuration is complete when all of the following commands produce normal output:
echo $JAVA_HOME
java -version
echo $HADOOP_HOME
hadoop version
(4) If you configured per-user environment variables
Change the ownership of the two installation directories to that user; otherwise the user will not have permission to start the services.
chown -R hadoop /usr/local/hadoop
chown -R hadoop /usr/local/hadoop-2.7.3/
6: Configure Hadoop
All of the files below are under the ${HADOOP_HOME}/etc/hadoop/ directory.
(1) Configure core-site.xml
[hadoop@slave204 hadoop]$ vim core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master201:9000</value>
    <description>Default name of HDFS (older name for fs.defaultFS)</description>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master201:9000</value>
    <description>URI of HDFS</description>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
    <description>Local Hadoop temporary directory on each node</description>
  </property>
</configuration>
(2) Configure hdfs-site.xml
[hadoop@slave204 hadoop]$ vim hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop/hdfs/name</value>
    <description>Where the namenode stores the HDFS namespace metadata</description>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop/hdfs/data</value>
    <description>Physical location of the data blocks on each datanode</description>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
    <description>Number of replicas; the default is 3, and it should not exceed the number of datanodes</description>
  </property>
</configuration>
(3) Configure yarn-site.xml
[hadoop@slave204 hadoop]$ vim yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master201</value>
    <description>Hostname of the resourcemanager</description>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
    <description>Auxiliary service run on the NodeManager; must be set to mapreduce_shuffle for MapReduce jobs to run</description>
  </property>
</configuration>
(4) Configure mapred-site.xml
[hadoop@slave204 hadoop]$ vim mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    <description>Run MapReduce on the YARN framework</description>
  </property>
</configuration>
(5) Configure slaves
[hadoop@slave204 hadoop]$ vim slaves
slave202
slave203
slave204
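Two housekeeping steps are commonly needed with a stock hadoop-2.7.3 tree but are not shown above; a minimal sketch, assuming the same paths as in the configuration files:
cd /usr/local/hadoop-2.7.3/etc/hadoop
cp mapred-site.xml.template mapred-site.xml                        # 2.7.3 ships only the template
echo "export JAVA_HOME=/usr/local/jdk1.8.0_151" >> hadoop-env.sh   # make JAVA_HOME explicit for the daemons
mkdir -p /usr/local/hadoop/tmp /usr/local/hadoop/hdfs/name /usr/local/hadoop/hdfs/data   # directories referenced above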
7: Copy to the other nodes
Copy the files to the other machines. Once the JDK and Hadoop are installed on one CentOS machine, that installation can simply be replicated.
(1) Clone the virtual machine directly in VMware
(2) Create new CentOS virtual machines yourself and copy the files over
scp -r hadoop-2.7.3 root@slave202:/usr/local
Enter the password when prompted.
scp -r /usr/local/jdk1.8.0_151/ root@slave204:/usr/local
and so on for the other nodes.
After the copying is done, remember to change the ownership back to the right user (see the example below).
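For example, to hand the copied directories on a slave back to the hadoop user (hostname and paths follow this article's layout; adjust to your own):
ssh root@slave202 "chown -R hadoop:hadoop /usr/local/hadoop-2.7.3 /usr/local/jdk1.8.0_151"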
8: Configure passwordless SSH login
Passwordless SSH here means that the master and every slave can SSH into each other without a password; the slaves also need passwordless login among themselves.
If you work as root, configure it as root; if you use a non-root user, configure it as that user (e.g. hadoop).
(1) Edit the /etc/hostname file
vim /etc/hostname
Set it to the corresponding hostname defined at the beginning of this article.
(2) Four steps for a regular user to log in to localhost without a password:
ssh-keygen -t rsa
cd .ssh
cat id_rsa.pub >> authorized_keys
chmod 600 authorized_keys
If a problem occurs when testing with
ssh localhost
the usual cause is that these files must not be writable by anyone else; whether the execute permission also matters is not clear.
Another problem you may run into:
Agent admitted failure to sign using the key
Solution: use the ssh-add command to load the private key (change id_rsa if your key file has a different name):
# ssh-add ~/.ssh/id_rsa
(3) Passwordless login between different machines
scp id_rsa.pub hadoop@master201:/home/hadoop/.ssh/id_rsa.pub.slave202
cat id_rsa.pub.slave202 >> authorized_keys
Or use:
ssh-copy-id servername
This has to be done in both directions, for every pair of machines.
// Even without first running ssh localhost, this still succeeds.
scp id_rsa.pub hadoop@master201:/home/hadoop/.ssh/id_rsa.pub.slave203
cat id_rsa.pub.slave203>> authorized_keys
(4) Two steps that give any machine passwordless login (a simplification of the commands above):
ssh-keygen -t rsa
ssh-copy-id master201
ssh-copy-id slave204
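A quick way to confirm the keys work in every direction is to run the following on each node (hostnames taken from this setup); every ssh should print the remote hostname without asking for a password:
for h in master201 slave202 slave203 slave204; do ssh $h hostname; done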
9: Start Hadoop
(1) Format the namenode
hdfs namenode -format
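Format the namenode only once, on the master. If you ever need to reformat, it is common practice to first clear the storage directories configured earlier on every node, otherwise the datanodes may refuse to start because of mismatched cluster IDs; a sketch assuming the paths from core-site.xml and hdfs-site.xml:
rm -rf /usr/local/hadoop/tmp/* /usr/local/hadoop/hdfs/name/* /usr/local/hadoop/hdfs/data/*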
(2) Start HDFS
start-dfs.sh
[hadoop@master201 local]$ start-dfs.sh
18/06/04 01:44:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [master201]
master201: starting namenode, logging to /usr/local/hadoop-2.7.3/logs/hadoop-hadoop-namenode-master201.out
slave204: starting datanode, logging to /usr/local/hadoop-2.7.3/logs/hadoop-hadoop-datanode-slave204.out
slave203: starting datanode, logging to /usr/local/hadoop-2.7.3/logs/hadoop-hadoop-datanode-slave203.out
slave202: starting datanode, logging to /usr/local/hadoop-2.7.3/logs/hadoop-hadoop-datanode-slave202.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop-2.7.3/logs/hadoop-hadoop-secondarynamenode-master201.out
18/06/04 01:44:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
(3) Start YARN
start-yarn.sh
[hadoop@master201 local]$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop-2.7.3/logs/yarn-hadoop-resourcemanager-master201.out
slave202: starting nodemanager, logging to /usr/local/hadoop-2.7.3/logs/yarn-hadoop-nodemanager-slave202.out
slave203: starting nodemanager, logging to /usr/local/hadoop-2.7.3/logs/yarn-hadoop-nodemanager-slave203.out
slave204: starting nodemanager, logging to /usr/local/hadoop-2.7.3/logs/yarn-hadoop-nodemanager-slave204.out
(4) Start the historyserver
[hadoop@master201 local]$ mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /usr/local/hadoop-2.7.3/logs/mapred-hadoop-historyserver-master201.out
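To shut the cluster down later, the matching stop scripts are run on the master in the reverse order of starting (standard Hadoop 2.x scripts):
mr-jobhistory-daemon.sh stop historyserver
stop-yarn.sh
stop-dfs.sh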
10: View Hadoop status
(1) HDFS web UI
http://192.168.93.201:50070/dfshealth.html#tab-overview
(2) YARN web UI
http://192.168.93.201:8088/cluster
(3) Check processes with jps
(3.1) Processes on the master node
[hadoop@master201 local]$ jps
19616 SecondaryNameNode
20341 ResourceManager
20600 Jps
19150 NameNode
With the historyserver running:
[hadoop@master201 local]$ jps
19616 SecondaryNameNode
20705 JobHistoryServer
20341 ResourceManager
20744 Jps
19150 NameNode
(3.2) Processes on a slave node
[root@slave204 hadoop]# jps
34148 NodeManager
33643 DataNode
34477 Jps
(4) HDFS status report
hdfs dfsadmin -report
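As a final smoke test, the example job shipped with Hadoop can be run on the cluster (the jar path below assumes a stock hadoop-2.7.3 layout; adjust if yours differs):
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar pi 2 10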
11: Appendix: the complete set of environment variable additions
export JAVA_HOME=/usr/local/jdk1.8.0_151
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
export HADOOP_HOME=/usr/local/hadoop-2.7.3
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin