Hadoop3 SingleNode Install in Docker
I recently had to install Hadoop 3 in Docker, so here are my notes.
Environment
I install Hadoop as a single node inside a Docker container.
The OS is CentOS 7.9. The following prerequisites must already be in place:
- Java 1.8 installed (with JAVA_HOME and PATH configured)
- docker-compose installed on the local machine
Notes on installing inside Docker
If you simply create a centos container and run systemctl in it, you get the following:
[root@52ef2bb43881 ~]# systemctl
Failed to get D-Bus connection: Operation not permitted
To use systemctl, the container has to be started with the --privileged and -d options and /sbin/init as its command; after that, open a shell in it with docker exec and /bin/bash:
$ docker run --privileged -d --name mycentos centos:7 /sbin/init
$ docker exec -it mycentos /bin/bash
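To confirm that the container really booted with an init system, one option is to look at the name of PID 1. This is only a sketch: the helper names `pid1_name` and `has_systemd_init` are made up for this post, and it assumes systemd reports itself as `systemd`, which is what I would expect in a centos:7 container started with /sbin/init.

```shell
# Print the command name of PID 1 (hypothetical helper, not part of Docker or systemd).
pid1_name() {
  cat /proc/1/comm 2>/dev/null || ps -p 1 -o comm=
}

# Succeeds only when PID 1 is systemd (assumption: it reports the name "systemd").
has_systemd_init() {
  [ "$(pid1_name)" = "systemd" ]
}
```

Inside the container, `has_systemd_init && echo "systemctl should work"` gives a quick answer before trying systemctl itself.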
Installing Hadoop
Now install Hadoop. The version used here is 3.3.1.
[root@52ef2bb43881 ~]# wget https://mirrors.sonic.net/apache/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz
[root@52ef2bb43881 ~]# tar xvzf hadoop-3.3.1.tar.gz
Add the Hadoop environment variables to ~/.bashrc as shown below. Set HADOOP_HOME to wherever you extracted Hadoop. HADOOP_CONFIG_HOME is only a shorthand used with cd later in this post.
export HADOOP_HOME=/root/hadoop-3.3.1
export HADOOP_CONFIG_HOME=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Then reload ~/.bashrc and check the Hadoop version to confirm that the environment variables and the installation are correct.
[root@52ef2bb43881 ~]# source ~/.bashrc
[root@52ef2bb43881 ~]# hadoop version
Hadoop 3.3.1
Source code repository https://github.com/apache/hadoop.git -r a3b9c37a397ad4188041dd80621bdeefc46885f2
Compiled by ubuntu on 2021-06-15T05:13Z
Compiled with protoc 3.7.1
From source with checksum 88a4ddb2299aca054416d6b7f81ca55
This command was run using /home/hadoop-3.3.1/share/hadoop/common/hadoop-common-3.3.1.jar
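If `hadoop version` is not found instead, the usual cause is a wrong HADOOP_HOME or a missing PATH entry. A minimal sketch of a check for that, assuming the variables from ~/.bashrc above; `check_hadoop_env` is a name I made up, not a Hadoop command.

```shell
# Hypothetical helper: report problems with the variables set in ~/.bashrc.
# Prints one line per problem and returns non-zero if any were found.
check_hadoop_env() {
  rc=0
  if [ -z "${HADOOP_HOME:-}" ]; then
    echo "HADOOP_HOME is not set"
    return 1
  fi
  if [ ! -d "$HADOOP_HOME" ]; then
    echo "HADOOP_HOME is not a directory: $HADOOP_HOME"
    rc=1
  fi
  # Both bin and sbin must be on PATH for hadoop and start-all.sh to resolve.
  for dir in "$HADOOP_HOME/bin" "$HADOOP_HOME/sbin"; do
    case ":$PATH:" in
      *":$dir:"*) ;;
      *) echo "not on PATH: $dir"; rc=1 ;;
    esac
  done
  return "$rc"
}
```

Running `check_hadoop_env` after `source ~/.bashrc` prints nothing when everything lines up.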
Up to this point we have a standalone Hadoop install. Let's continue and turn it into a single-node cluster.
Installing required packages
Hadoop's start and stop scripts use ssh to launch the daemons on each node, localhost included, so ssh must be installed.
[root@52ef2bb43881 ~]# yum install openssh-server openssh-clients openssh-askpass -y
SSH Keygen
Generate a key pair with ssh-keygen so that ssh can connect without a password.
[root@52ef2bb43881 ~]# ssh-keygen -t rsa
[root@52ef2bb43881 ~]# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[root@52ef2bb43881 ~]# systemctl start sshd
[root@52ef2bb43881 ~]# ssh localhost
Last login: Tue Apr 19 08:41:05 2022 from localhost
[root@52ef2bb43881 ~]# exit
logout
Connection to localhost closed.
[root@52ef2bb43881 ~]#
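One thing to watch with the `cat ... >> authorized_keys` step above is that it appends the key again every time it is re-run, and sshd can refuse keys when the file permissions are too loose. The sketch below handles both; `authorize_key` is a hypothetical helper name, and 700/600 are simply the conventional safe modes.

```shell
# Hypothetical helper: add a public key to authorized_keys without duplicating it,
# then tighten permissions on the directory and the file.
# $1 = public key file, $2 = ssh directory (defaults to ~/.ssh)
authorize_key() {
  pub=$1
  ssh_dir=${2:-"$HOME/.ssh"}
  auth="$ssh_dir/authorized_keys"

  mkdir -p "$ssh_dir"
  touch "$auth"
  # -x matches whole lines, -F treats the key as a fixed string.
  if ! grep -qxF -- "$(cat "$pub")" "$auth"; then
    cat "$pub" >> "$auth"
  fi
  chmod 700 "$ssh_dir"
  chmod 600 "$auth"
}
```

With the key generated above, the call would be `authorize_key ~/.ssh/id_rsa.pub`.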
Configuring Hadoop
Open hadoop-env.sh and add the following.
[root@52ef2bb43881 ~]# cd $HADOOP_CONFIG_HOME
[root@52ef2bb43881 hadoop]# vim hadoop-env.sh
# hadoop-env.sh
...
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.322.b06-1.el7_9.x86_64
export HDFS_NAMENODE_USER="root"
export HDFS_DATANODE_USER="root"
export HDFS_SECONDARYNAMENODE_USER="root"
export YARN_RESOURCEMANAGER_USER="root"
export YARN_NODEMANAGER_USER="root"
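The JAVA_HOME path above is tied to one exact OpenJDK build, so it will differ on other machines. A small sketch for deriving it from the resolved java binary instead; `java_home_from_bin` is a made-up helper, and it assumes the usual layouts where the binary lives under `bin/java` or, for JDK 8, `jre/bin/java`.

```shell
# Hypothetical helper: turn the resolved path of the java binary into a JAVA_HOME value.
# $1 = absolute path to java, e.g. the output of: readlink -f "$(command -v java)"
java_home_from_bin() {
  home=${1%/bin/java}   # drop the trailing /bin/java
  echo "${home%/jre}"   # drop /jre as well when the JDK 8 layout is used
}
```

Usage would be `java_home_from_bin "$(readlink -f "$(command -v java)")"`, with the printed value pasted into hadoop-env.sh.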
Then create the directories that the daemons will use as their home: one for temporary files, one for the NameNode, and one for the DataNode.
[root@52ef2bb43881 ~]# mkdir $HADOOP_HOME/temp
[root@52ef2bb43881 ~]# mkdir $HADOOP_HOME/namenode
[root@52ef2bb43881 ~]# mkdir $HADOOP_HOME/datanode
Next, edit each of the configuration files below. First move to the directory that contains them.
[root@52ef2bb43881 ~]# cd $HADOOP_CONFIG_HOME
core-site.xml
Settings shared by HDFS and MapReduce.
[root@52ef2bb43881 hadoop]# vim core-site.xml
# core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/root/hadoop-3.3.1/temp</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
<final>true</final>
</property>
<!-- Trash -->
<property>
<name>fs.trash.interval</name>
<value>1440</value>
</property>
<property>
<name>fs.trash.checkpoint.interval</name>
<value>120</value>
</property>
</configuration>
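Both trash settings are in minutes, so 1440 keeps deleted files for one day and 120 creates a trash checkpoint every two hours. To double-check what actually ended up in a file after editing, a rough lookup can be done with awk. This is a sketch, not a real XML parser: `get_prop` is a hypothetical helper, and it assumes `<name>` and `<value>` sit on separate lines as in the files in this post.

```shell
# Hypothetical helper: print the value of one property from a Hadoop *-site.xml file.
# $1 = file, $2 = property name
get_prop() {
  awk -v name="$2" '
    # Remember that the wanted <name> line was seen, then move to the next line.
    index($0, "<name>" name "</name>") { found = 1; next }
    # The first <value> after it belongs to that property.
    found && match($0, /<value>[^<]*<\/value>/) {
      # 7 = length of "<value>", 15 = length of "<value>" plus "</value>"
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$1"
}
```

For example, `get_prop core-site.xml fs.trash.interval` should print 1440 for the file above. Once Hadoop is on the PATH, `hdfs getconf -confKey fs.trash.interval` asks Hadoop itself for the effective value.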
hdfs-site.xml
Settings used by HDFS.
[root@52ef2bb43881 hadoop]# vim hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
<final>true</final>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/root/hadoop-3.3.1/namenode</value>
<final>true</final>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/root/hadoop-3.3.1/datanode</value>
<final>true</final>
</property>
</configuration>
mapred-site.xml
Settings used by MapReduce.
[root@52ef2bb43881 hadoop]# vim mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
</configuration>
Running Hadoop
First, format the NameNode.
[root@52ef2bb43881 hadoop]# hadoop namenode -format
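In Hadoop 3 the command above still works but prints a deprecation warning; `hdfs namenode -format` is the current spelling. Formatting is a one-time step, because reformatting discards the existing HDFS namespace. A minimal sketch of a guard against doing it twice, assuming a formatted name directory contains `current/VERSION`; both function names are made up for this post.

```shell
# Hypothetical helper: succeeds if the given NameNode directory already holds metadata.
# Assumption: a formatted name directory contains current/VERSION.
namenode_is_formatted() {
  [ -f "$1/current/VERSION" ]
}

# Run the format command only when the directory is not formatted yet.
# $1 = name directory, remaining arguments = the format command to run
format_if_needed() {
  name_dir=$1
  shift
  if namenode_is_formatted "$name_dir"; then
    echo "already formatted: $name_dir"
  else
    "$@"
  fi
}
```

With the paths from this post, the call would be `format_if_needed /root/hadoop-3.3.1/namenode hdfs namenode -format`.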
Hadoop is now set up. On the host, run docker commit to save the container as an image.
[root@kt1201 ~]# docker commit 52ef2bb43881 kt1201/hadoop:3.3.1
Back in the container, start Hadoop and check the daemons with jps.
[root@52ef2bb43881 hadoop]# start-all.sh
Starting namenodes on [localhost]
Last login: Wed Apr 20 02:12:34 UTC 2022 from localhost on pts/1
Starting datanodes
Last login: Wed Apr 20 03:56:37 UTC 2022 on pts/0
Starting secondary namenodes [52ef2bb43881]
Last login: Wed Apr 20 03:56:39 UTC 2022 on pts/0
52ef2bb43881: Warning: Permanently added '52ef2bb43881,172.17.0.2' (ECDSA) to the list of known hosts.
Starting resourcemanager
Last login: Wed Apr 20 03:56:42 UTC 2022 on pts/0
Starting nodemanagers
Last login: Wed Apr 20 03:56:46 UTC 2022 on pts/0
[root@52ef2bb43881 hadoop]#
[root@52ef2bb43881 hadoop]#
[root@52ef2bb43881 hadoop]# jps
2145 Jps
1369 SecondaryNameNode
1145 DataNode
971 NameNode
1819 NodeManager
1661 ResourceManager
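Five daemons should be listed besides Jps itself. Rather than eyeballing the list every time, the jps output can be piped through a small filter. A sketch, with `missing_daemons` being a hypothetical helper name; it compares the second column exactly, so SecondaryNameNode is not mistaken for NameNode.

```shell
# Hypothetical helper: read `jps` output on stdin and print every expected daemon that is absent.
missing_daemons() {
  awk '
    BEGIN {
      n = split("NameNode DataNode SecondaryNameNode ResourceManager NodeManager", want, " ")
    }
    # Column 2 of each jps line is the process name.
    { seen[$2] = 1 }
    END {
      for (i = 1; i <= n; i++)
        if (!(want[i] in seen)) print want[i]
    }
  '
}
```

`jps | missing_daemons` prints nothing when all five daemons are up.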
The Hadoop installation is complete. To make it easier to connect other ecosystem components later, I prepared a docker-compose.yml in advance, as shown below.
# docker-compose.yml
version: '3'
services:
  hdfs:
    image: kt1201/hadoop:3.3.1
    container_name: hadoop
    ports:
      - 9870:9870
    volumes:
      - hadoop_namenode:/root/hadoop-3.3.1/namenode
      - hadoop_datanode:/root/hadoop-3.3.1/datanode
volumes:
  hadoop_namenode:
  hadoop_datanode:
The port mapping above exposes the web UI that Hadoop provides. In Hadoop 3.x the NameNode web UI listens on port 9870 by default (50070 was the Hadoop 2.x default), so that is the port published here. After creating the container from this yml file and starting Hadoop, open port 9870 in a browser to see the NameNode web UI.
The next post covers installing Hive: Hive Install in Docker