Hadoop 3.1.0 Fully Distributed Cluster Deployment: A Detailed Record

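This is a fully distributed Hadoop 3.1.0 cluster deployed on three servers, laid out as follows. The configuration file sources are published on GitHub.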

# After deployment

root@servera:/opt/hadoop/hadoop-3.1.0# jps

14056 SecondaryNameNode

14633 Jps

13706 NameNode

14317 ResourceManager

root@serverb:~# jps

5288 NodeManager

5162 DataNode

5421 Jps

root@serverc:~# jps

4545 NodeManager

4371 DataNode

4678 Jps


As shown above, the cluster consists of three machines: servera acts as the master and the other two act as workers.

2. Deployment: preparation (perform the following on all three machines)

  • 2.1. Configure the hosts file [all three machines]

vim /etc/hosts

10.80.80.110    servera

10.80.80.111    serverb

10.80.80.112    serverc
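
Before moving on, it can help to confirm that every node name really has an entry in the hosts file. The following is a minimal sketch, not part of the original article: `missing_hosts` is a hypothetical helper that takes a hosts-format file and a list of names, and prints the names that have no entry.

```shell
#!/bin/sh
# Hypothetical helper (not from the original article): print each given
# hostname that has no entry in a hosts-format file.
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Skip comment lines; compare the name against every field after the IP.
        if ! awk -v n="$name" '
            /^[[:space:]]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit(found ? 0 : 1) }' "$hosts_file"; then
            echo "$name"
        fi
    done
}

# Example: missing_hosts /etc/hosts servera serverb serverc
```

If the function prints nothing, all three names are present; any printed name still needs a line in /etc/hosts on that machine.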


  • 2.2. Install the JDK [all three machines]

      Check whether Java is already installed: java -version

    • Download the JDK

    wget --no-check-certificate --no-cookies --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u172-b11/a58eab1ec242421181065cdc37240b08/jdk-8u172-linux-x64.tar.gz


    • Extract

      mkdir /opt/java
      tar -zxf jdk-8u172-linux-x64.tar.gz
      mv jdk1.8.0_172/ /opt/java/

    • Configure the Java environment variables

      vim /etc/profile.d/jdk-1.8.sh

      #!/bin/sh
      # Author:wangxiaolei 王小雷
      # Blog: http://blog.csdn.net/dream_an
      # Github: https://github.com/wangxiaoleiai
      # web: www.xiaolei.wang
      # Date: 2018.05
      # Path: /etc/profile.d/
      export JAVA_HOME=/opt/java/jdk1.8.0_172
      export JRE_HOME=${JAVA_HOME}/jre
      export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
      export PATH=${JAVA_HOME}/bin:$PATH

      # Apply the environment variables
      source /etc/profile
      # Check the Java version (JDK 8 accepts -version, not --version)
      java -version

  • 2.3. Install pdsh and ssh [all three machines]

root@servera:~# apt install ssh pdsh


echo ssh>/etc/pdsh/rcmd_default


  • 2.4. Set up passwordless login to the machine itself [all three machines]

$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

$ chmod 0600 ~/.ssh/authorized_keys

$ ssh localhost    # type "yes" on the first connection


  • 2.5. Set up passwordless login from servera to the other machines (master to workers) [one machine only: run on servera]

ssh-copy-id -i ~/.ssh/id_rsa.pub servera

 ssh-copy-id -i ~/.ssh/id_rsa.pub serverb

 ssh-copy-id -i ~/.ssh/id_rsa.pub serverc
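
Since start-dfs.sh and start-yarn.sh later rely on these logins, a quick check that none of them still asks for a password may save some debugging. This is a sketch under assumptions: `unreachable_hosts` is a hypothetical helper, and the 5-second timeout is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: print every host that does NOT accept key-based login.
# BatchMode=yes makes ssh fail instead of prompting for a password or passphrase.
unreachable_hosts() {
    for host in "$@"; do
        ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null || echo "$host"
    done
}

# Example (run on servera): unreachable_hosts servera serverb serverc
```

An empty result suggests that servera can reach all three machines without a password.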


3. Hadoop 3+ configuration files

Six files under /opt/hadoop/hadoop-3.1.0/etc/hadoop/ need to be configured:

hadoop-env.sh, core-site.xml, hdfs-site.xml, yarn-site.xml, mapred-site.xml, workers
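
Once the files have been edited, a small check like the one below can confirm that all six are present in the configuration directory. It is only a sketch: `missing_conf_files` is a hypothetical helper name, and it checks for existence only, not for valid contents.

```shell
#!/bin/sh
# Hypothetical helper: print each of the six configuration files that is
# missing from the given directory.
missing_conf_files() {
    conf_dir=$1
    for f in hadoop-env.sh core-site.xml hdfs-site.xml \
             yarn-site.xml mapred-site.xml workers; do
        [ -f "$conf_dir/$f" ] || echo "$f"
    done
}

# Example: missing_conf_files /opt/hadoop/hadoop-3.1.0/etc/hadoop
```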

  • 3.1. hadoop-env.sh: add the following

export JAVA_HOME=/opt/java/jdk1.8.0_172/

export HDFS_NAMENODE_USER="root"

export HDFS_DATANODE_USER="root"

export HDFS_SECONDARYNAMENODE_USER="root"

export YARN_RESOURCEMANAGER_USER="root"

export YARN_NODEMANAGER_USER="root"


  • 3.2. core-site.xml

<configuration>

  <!-- Default file system URI: the NameNode runs on servera, port 9000 -->

  <property>

      <name>fs.defaultFS</name>

      <value>hdfs://servera:9000</value>

  </property>

  <property>

      <name>io.file.buffer.size</name>

      <value>131072</value>

  </property>

</configuration>


  • 3.3. hdfs-site.xml

<configuration>

<!-- Configurations for NameNode: -->

<property>

  <name>dfs.namenode.name.dir</name>

  <value>/var/lib/hadoop/hdfs/name/</value>

</property>

<property>

  <name>dfs.blocksize</name>

  <value>268435456</value>

</property>

<property>

  <name>dfs.namenode.handler.count</name>

  <value>100</value>

</property>

<!-- Configurations for DataNode: -->

<property>

  <name>dfs.datanode.data.dir</name>

  <value>/var/lib/hadoop/hdfs/data/</value>

</property>

<property>

    <name>dfs.replication</name>

    <value>1</value>

</property>

</configuration>
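
The size values in these files are plain byte counts, which are hard to read at a glance. As a small worked example (the helper names are made up for illustration), shell arithmetic converts them to more familiar units:

```shell
#!/bin/sh
# Convert byte counts from the configuration files into friendlier units.
bytes_to_mib() { echo $(( $1 / 1024 / 1024 )); }
bytes_to_kib() { echo $(( $1 / 1024 )); }

bytes_to_mib 268435456   # dfs.blocksize: prints 256
bytes_to_kib 131072      # io.file.buffer.size (core-site.xml): prints 128
```

So the block size configured here is 256 MiB and the I/O buffer size is 128 KiB.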


  • 3.4. yarn-site.xml

<configuration>

<!-- Site specific YARN configuration properties -->

<!-- Configurations for ResourceManager and NodeManager: -->

<!-- Configurations for ResourceManager: -->

  <property>

          <name>yarn.resourcemanager.hostname</name>

          <value>servera</value>

  </property>

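  <!-- To expose the web UI on an external address, replace EXTERNAL_IP below with the real IP; otherwise the default address on port 8088 is used -->

  <!-- <property>

          <name>yarn.resourcemanager.webapp.address</name>

          <value>EXTERNAL_IP:8088</value>

  </property> -->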

<!-- Configurations for NodeManager: -->

  <property>

          <name>yarn.nodemanager.aux-services</name>

          <value>mapreduce_shuffle</value>

  </property>

<!-- Configurations for History Server (Needs to be moved elsewhere): -->

</configuration>


  • 3.5. mapred-site.xml

<configuration>

  <!-- Configurations for MapReduce Applications: -->

  <property>

       <name>mapreduce.framework.name</name>

       <value>yarn</value>

   </property>

</configuration>


  • 3.6. workers

serverb

 serverc


4. Copy the Hadoop files to the other nodes, configure the Hadoop environment variables, format HDFS, then start, check, stop, and reset the cluster

  • 4.1. Copy the Hadoop directory configured in step 3 to the same location on the other machines:
    /opt/hadoop/hadoop-3.1.0
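
The article does not say which tool to use for the copy. One possible approach, sketched here with scp over the passwordless ssh set up in step 2.5, is shown below. `copy_hadoop` and the `DRY_RUN` switch are hypothetical names introduced for this example; with `DRY_RUN=1` the function only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical helper: copy the configured Hadoop directory to each worker.
# Set DRY_RUN=1 to print the commands instead of running them.
copy_hadoop() {
    src=/opt/hadoop/hadoop-3.1.0
    for host in "$@"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "ssh $host mkdir -p /opt/hadoop"
            echo "scp -rq $src $host:/opt/hadoop/"
        else
            # Create the parent directory first so the copy lands at the same path.
            ssh "$host" mkdir -p /opt/hadoop &&
            scp -rq "$src" "$host:/opt/hadoop/"
        fi
    done
}

# Example (run on servera): DRY_RUN=1 copy_hadoop serverb serverc
```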
  • 4.2. Configure the Hadoop environment variables [all three machines]

vim /etc/profile.d/hadoop-3.1.0.sh


#!/bin/sh

# Author:wangxiaolei 王小雷

# Blog: http://blog.csdn.net/dream_an

# Github: https://github.com/wangxiaoleiai

# Date: 201805

# web: www.xiaolei.wang

# Path: /etc/profile.d/


export HADOOP_HOME="/opt/hadoop/hadoop-3.1.0"

export PATH="$HADOOP_HOME/bin:$PATH"

export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop


source /etc/profile


  • 4.3. Format HDFS [first deployment only] [use with caution; run on servera only]

/opt/hadoop/hadoop-3.1.0/bin/hdfs namenode -format myClusterName


  • 4.4. Start the cluster [run on servera only]

/opt/hadoop/hadoop-3.1.0/sbin/start-dfs.sh

 /opt/hadoop/hadoop-3.1.0/sbin/start-yarn.sh


  • 4.5. Check the running processes [all three machines]

jps
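
Comparing the jps output by eye against the listing at the top of this article works, but the comparison can also be scripted. The sketch below assumes the two-column "PID Name" format that jps prints above; `missing_daemons` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: given jps output and a list of expected daemon names,
# print the daemons that are not running.
missing_daemons() {
    jps_output=$1
    shift
    for daemon in "$@"; do
        # Exact match on the second column, so NameNode does not match SecondaryNameNode.
        printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { ok = 1 } END { exit(ok ? 0 : 1) }' ||
            echo "$daemon"
    done
}

# On servera:          missing_daemons "$(jps)" NameNode SecondaryNameNode ResourceManager
# On serverb, serverc: missing_daemons "$(jps)" DataNode NodeManager
```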


  • 4.6. Check the web UI at localhost:8088 [localhost here means servera's localhost; an external IP can be used instead, see step 3.4, yarn-site.xml]

  • 4.7. Stop the cluster [run on servera only]

/opt/hadoop/hadoop-3.1.0/sbin/stop-dfs.sh

 /opt/hadoop/hadoop-3.1.0/sbin/stop-yarn.sh


  • 4.8. Reset the Hadoop environment [removes the Hadoop HDFS and log files] [use with caution; run on servera only]

rm -rf /opt/hadoop/hadoop-3.1.0/logs/*

 rm -rf /var/lib/hadoop/


5. Pitfall encountered: pdsh@servera: servera: connect: Connection refused

root@servera:/opt/hadoop/hadoop-3.1.0# sbin/start-dfs.sh 
Starting namenodes on [servera] 
pdsh@servera: servera: connect: Connection refused 
Starting datanodes 
pdsh@servera: serverc: connect: Connection refused 
pdsh@servera: serverb: connect: Connection refused 
Starting secondary namenodes [servera] 
pdsh@servera: servera: connect: Connection refused

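  • Fix: make ssh the default remote command for pdsh, as in step 2.3. The error typically means pdsh is trying a different remote shell (often rsh) that the target machines do not accept.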

echo ssh>/etc/pdsh/rcmd_default
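
To confirm that the setting is in place on a machine, the file content can be checked directly. This is a minimal sketch: `rcmd_default_is_ssh` is a hypothetical helper, and it only inspects the file written in step 2.3 rather than querying pdsh itself.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the pdsh default rcmd file contains "ssh".
rcmd_default_is_ssh() {
    file=${1:-/etc/pdsh/rcmd_default}
    # Strip whitespace so a trailing newline does not affect the comparison.
    [ -r "$file" ] && [ "$(tr -d '[:space:]' < "$file")" = "ssh" ]
}

# Example: rcmd_default_is_ssh && echo "pdsh will use ssh"
```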

