Hive Install in Docker
Let's install Hive. It goes into the same Hadoop container that was set up earlier.
For the Hadoop installation, see the link below.
Hadoop3 SingleNode Install in Docker
metastore DB
First, create a postgres container for the metastore DB. So that the postgres and hadoop containers can talk to each other, I wrote the docker-compose.yml file below.
The images used are the official postgresql 11 image and the hadoop image built earlier.
# docker-compose.yml
version: '3'
services:
  postgres:
    image: kt1201/postgres
    container_name: postgres
    ports:
      - "5432:5432"
    environment:
      - POSTGRES_PASSWORD=postgres
    volumes:
      - postgres-data:/var/lib/postgresql/data
  hdfs:
    image: kt1201/hadoop:3.3.1
    container_name: hadoop
    ports:
      - 50070:50070
      - 10000:10000
    volumes:
      - hadoop_namenode:/root/hadoop-3.3.1/namenode
      - hadoop_datanode:/root/hadoop-3.3.1/datanode
volumes:
  postgres-data:
  hadoop_namenode:
  hadoop_datanode:
Then start the containers with the command below.
[root@kt1201 bigdata-docker-compose]# docker-compose up -d
Enter the postgres container, connect with psql as the postgres account, then create the hive user and set its password.
[root@kt1201 bigdata-docker-compose]# docker exec -it 25ddd9f892a4 /bin/bash
root@25ddd9f892a4:/# su - postgres
postgres@25ddd9f892a4:~$ psql
psql (14.2 (Debian 14.2-1.pgdg110+1))
Type "help" for help.
postgres=# create user hive superuser;
CREATE ROLE
postgres=# ALTER USER hive WITH PASSWORD 'hive';
ALTER ROLE
Create a database named metastore, make the hive user its owner, and set the encoding to 'UTF8'.
postgres=# CREATE DATABASE metastore WITH OWNER hive ENCODING 'UTF8' template template0;
CREATE DATABASE
postgres=# \l
                                  List of databases
   Name    |  Owner   | Encoding |  Collate   |   Ctype    |   Access privileges
-----------+----------+----------+------------+------------+-----------------------
 metastore | hive     | UTF8     | en_US.utf8 | en_US.utf8 |
 postgres  | postgres | UTF8     | en_US.utf8 | en_US.utf8 |
 template0 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
           |          |          |            |            | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
           |          |          |            |            | postgres=CTc/postgres
(4 rows)
postgres=# \quit
postgres@25ddd9f892a4:~$
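The same setup can also be scripted instead of typed into psql by hand. Below is a minimal sketch: `metastore_sql` is a hypothetical helper of my own (not part of Hive or PostgreSQL) that just prints the three statements used above, so they can be piped into psql. The usage line assumes the container is named `postgres` as in docker-compose.yml and that the image lets the `postgres` user connect locally without a password, which I have not verified for this particular image.

```shell
#!/bin/bash
# metastore_sql USER PASSWORD DB
# Hypothetical helper: prints the SQL from the psql session above.
metastore_sql() {
    local user="$1" password="$2" db="$3"
    cat <<EOF
CREATE USER ${user} SUPERUSER;
ALTER USER ${user} WITH PASSWORD '${password}';
CREATE DATABASE ${db} WITH OWNER ${user} ENCODING 'UTF8' TEMPLATE template0;
EOF
}

# Usage (assumption: container name "postgres", local access as user postgres):
# metastore_sql hive hive metastore | docker exec -i postgres psql -U postgres -v ON_ERROR_STOP=1
```

Printing the SQL first makes it easy to review what will run before sending it to the database.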
Hive
Download the Hive release archive and extract it.
[root@b171915f8f63 ~]# wget 'https://mirror.navercorp.com/apache/hive/hive-3.1.2/apache-hive-3.1.2-bin.tar.gz'
[root@b171915f8f63 ~]# tar zxvf apache-hive-3.1.2-bin.tar.gz
Set the Hive-related environment variables and PATH.
# ~/.bashrc
# JAVA
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.322.b06-1.el7_9.x86_64
export JAVA_OPTS="-Dfile.encoding=UTF-8"
export CLASSPATH="."
# HADOOP
export HADOOP_HOME=/root/hadoop-3.3.1
export HADOOP_CONFIG_HOME=$HADOOP_HOME/etc/hadoop
# Hive
export HIVE_HOME=/root/apache-hive-3.1.2-bin
# PATH
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin
Download the postgresql jdbc driver and add it to Hive's library path.
[root@b171915f8f63 ~]# wget https://repo1.maven.org/maven2/org/postgresql/postgresql/42.2.20/postgresql-42.2.20.jar
[root@b171915f8f63 ~]# mv postgresql-42.2.20.jar $HIVE_HOME/lib
hive-site.xml
Configure apache-hive-3.1.2-bin/conf/hive-site.xml as follows.
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hive.metastore.local</name>
    <value>false</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:postgresql://postgres:5432/metastore</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.postgresql.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>hive.server2.enable.doAs</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.server2.authentication</name>
    <value>NONE</value>
  </property>
</configuration>
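Since a typo in a property name only shows up later as a confusing runtime error, a quick check of the file can help. The sketch below is my own addition: `check_props` is a hypothetical helper that greps for `<name>` entries. It assumes each `<name>...</name>` element sits on a single line, as in the file above, and it does not validate the XML or the values.

```shell
#!/bin/bash
# check_props FILE NAME...
# Hypothetical helper: prints "missing: NAME" for every property name
# not found in FILE and returns 1 if any are missing, 0 otherwise.
check_props() {
    local file="$1" name missing=0
    shift
    for name in "$@"; do
        if ! grep -qF "<name>${name}</name>" "$file"; then
            echo "missing: ${name}"
            missing=1
        fi
    done
    return "$missing"
}

# Usage (path taken from the HIVE_HOME setting above):
# check_props "$HIVE_HOME/conf/hive-site.xml" \
#     javax.jdo.option.ConnectionURL \
#     javax.jdo.option.ConnectionDriverName \
#     javax.jdo.option.ConnectionUserName \
#     javax.jdo.option.ConnectionPassword
```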
Next, create the Hive warehouse directory in HDFS and grant group write permission on it.
[root@b171915f8f63 conf]# hadoop fs -mkdir -p /user/hive/warehouse
[root@b171915f8f63 conf]# hadoop fs -ls /user/hive
Found 1 items
drwxr-xr-x - root supergroup 0 2022-04-20 08:07 /user/hive/warehouse
[root@b171915f8f63 conf]# hadoop fs -chmod g+w /user/hive/warehouse
[root@b171915f8f63 conf]# hdfs dfs -ls /user/hive
Found 1 items
drwxrwxr-x - root supergroup 0 2022-04-20 08:07 /user/hive/warehouse
Initialize the metastore schema, start hadoop, and launch hive.
[root@b171915f8f63 conf]# schematool -initSchema -dbType postgres
[root@b171915f8f63 conf]# start-all.sh
[root@b171915f8f63 conf]# hive
Connecting with Beeline (external access setup)
To connect with an external client such as beeline instead of the hive CLI, hiveserver2 must be running.
Below are scripts that start and stop hiveserver2 (along with the metastore service).
# /root/apache-hive-3.1.2-bin/bin/start-hive.sh
#!/bin/bash
nohup hive --service metastore > /dev/null 2>&1 &
nohup hive --service hiveserver2 > /dev/null 2>&1 &
# /root/apache-hive-3.1.2-bin/bin/stop-hive.sh
#!/bin/bash
# Kill hiveserver2 if it is running
PID=$(ps -eaf | grep hiveserver2 | grep -v grep | awk '{print $2}')
if [[ -n "$PID" ]]; then
    echo "killing $PID"
    kill -9 $PID
fi
# Kill the metastore service if it is running
PID=$(ps -eaf | grep metastore | grep -v grep | awk '{print $2}')
if [[ -n "$PID" ]]; then
    echo "killing $PID"
    kill -9 $PID
fi
#kill -9 $(lsof -t -i:10000)
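One thing to keep in mind: start-hive.sh launches both services in the background and returns immediately, so hiveserver2 may not be listening yet when the next command runs. A minimal sketch of a wait loop is below. `wait_for_port` is a hypothetical helper of my own, it relies on bash's `/dev/tcp` feature (so it needs bash, not plain sh), and the retry count and delay are arbitrary defaults, not values from the original setup.

```shell
#!/bin/bash
# wait_for_port HOST PORT [RETRIES] [DELAY_SECONDS]
# Hypothetical helper: returns 0 once a TCP connection to HOST:PORT
# succeeds, or 1 after RETRIES failed attempts.
wait_for_port() {
    local host="$1" port="$2" retries="${3:-30}" delay="${4:-2}" i=0
    while [ "$i" -lt "$retries" ]; do
        # The subshell opens a TCP connection via bash's /dev/tcp and
        # closes it again when the subshell exits.
        if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Usage (10000 is the hiveserver2 port published in docker-compose.yml):
# start-hive.sh && wait_for_port localhost 10000 && echo "hiveserver2 is up"
```

An open port only shows that something is listening; it does not prove that a session can actually be opened.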
Start hiveserver2 and connect with beeline.
[root@b171915f8f63 bin]# start-hive.sh
[root@3f1e571ef5ad logs]# beeline -u 'jdbc:hive2://'
...
Connected to: Apache Hive (version 3.1.2)
Driver: Hive JDBC (version 3.1.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 3.1.2 by Apache Hive
0: jdbc:hive2://>
beeline connects as shown above. However, an external client has to specify the ip and port, as below. Trying that with beeline produces the account-related error shown here.
[root@3f1e571ef5ad conf]# beeline -n hive -p hive -u jdbc:hive2://localhost:10000
...
Connecting to jdbc:hive2://localhost:10000/
22/04/22 06:48:39 [main]: WARN jdbc.HiveConnection: Failed to connect to localhost:10000
Error: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:10000/: Failed to open y.authorize.AuthorizationException): User: root is not allowed to impersonate anonymous (state=08S01
Beeline version 3.1.2 by Apache Hive
Add proxy-user access settings for the account's hosts and groups to core-site.xml, one of the hadoop configuration files.
<property>
  <name>hadoop.proxyuser.{account name}.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.{account name}.groups</name>
  <value>*</value>
</property>
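To make the placeholder concrete, here is a small sketch that prints the two properties for a given account. `proxyuser_props` is a hypothetical helper of my own. Using `root` in the usage line is an assumption based on the error message above ("User: root is not allowed to impersonate"); substitute whichever account actually runs hiveserver2.

```shell
#!/bin/bash
# proxyuser_props USER
# Hypothetical helper: prints the two core-site.xml proxy-user
# properties for USER, allowing all hosts and all groups.
proxyuser_props() {
    local user="$1"
    cat <<EOF
<property>
  <name>hadoop.proxyuser.${user}.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.${user}.groups</name>
  <value>*</value>
</property>
EOF
}

# Usage (assumption: hiveserver2 runs as root, as the error message suggests):
# proxyuser_props root
```

The printed block still has to be pasted inside the `<configuration>` element of core-site.xml by hand.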
Then restart hiveserver2 and hadoop, and connect with beeline again.
[root@b171915f8f63 ~]# stop-hive.sh
[root@b171915f8f63 ~]# stop-all.sh
[root@b171915f8f63 ~]# start-all.sh
[root@b171915f8f63 ~]# start-hive.sh
[root@b171915f8f63 conf]# beeline -n hive -p hive -u jdbc:hive2://localhost:10000
Connecting from DBeaver should work fine too.
Tez Install
The current hive engine can be checked as follows.
0: jdbc:hive2://localhost:10000/> set hive.execution.engine;
+----------------------------+
| set |
+----------------------------+
| hive.execution.engine=mr |
+----------------------------+
1 row selected (0.026 seconds)
The current engine is MapReduce. Let's switch it to the Tez engine. The Tez version to install is 0.9.2. First, download and extract it with the commands below.
[root@b171915f8f63 ~]# wget https://downloads.apache.org/tez/0.9.2/apache-tez-0.9.2-bin.tar.gz
[root@b171915f8f63 ~]# tar -zxvf apache-tez-0.9.2-bin.tar.gz
hive-site.xml additions
Add the settings below to hive-site.xml.
# hive-site.xml
<configuration>
  ...
  <property>
    <name>tez.lib.uris</name>
    <value>/root/apache-tez-0.9.2-bin</value>
  </property>
  <property>
    <name>hive.execution.engine</name>
    <value>tez</value>
  </property>
</configuration>
tez-site.xml
Create a new tez-site.xml and write it as below. Here tez.lib.uris is the location of the tez library on hdfs. The archive is uploaded to that path once the file is written.
# /root/apache-tez-0.9.2-bin/conf/tez-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>tez.lib.uris</name>
    <value>${fs.defaultFS}/user/tez/tez.tar.gz</value>
  </property>
  <property>
    <name>tez.use.cluster.hadoop-libs</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.tez.container.size</name>
    <value>3020</value>
  </property>
</configuration>
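The `${fs.defaultFS}` part is filled in from the hadoop configuration, so the value ends up as a full hdfs URI. As a rough sketch of what it resolves to, the hypothetical helper `tez_lib_uri` below joins a given default filesystem with the path from tez-site.xml. The `hdfs://localhost:9000` address in the usage line is only an assumed example; the real value on this container can be read with `hdfs getconf -confKey fs.defaultFS`.

```shell
#!/bin/bash
# tez_lib_uri DEFAULT_FS
# Hypothetical helper: prints the URI that tez.lib.uris resolves to
# for the given fs.defaultFS value (a trailing slash is dropped).
tez_lib_uri() {
    local default_fs="${1%/}"
    echo "${default_fs}/user/tez/tez.tar.gz"
}

# Usage (hdfs://localhost:9000 is an assumed example value):
# tez_lib_uri hdfs://localhost:9000
# On the container itself:
# tez_lib_uri "$(hdfs getconf -confKey fs.defaultFS)"
```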
Next, add the environment variable settings.
# ~/.bashrc
#Tez
export TEZ_HOME=/root/apache-tez-0.9.2-bin
export TEZ_CONF_DIR=$TEZ_HOME/conf
export TEZ_JARS=$TEZ_HOME
export HADOOP_CLASSPATH=$CLASSPATH:${TEZ_CONF_DIR}:${TEZ_JARS}/*:${TEZ_JARS}/lib/*
Apply the environment variable settings.
[root@b171915f8f63 ~]# source ~/.bashrc
Now upload the tez library to the hdfs path configured in tez-site.xml.
[root@b171915f8f63 ~]# hdfs dfs -mkdir -p /user/tez
[root@b171915f8f63 ~]# hdfs dfs -put /root/apache-tez-0.9.2-bin/share/tez.tar.gz /user/tez
Now run hive and check the engine.
[root@b171915f8f63 conf]# hive
hive> set hive.execution.engine;
hive.execution.engine=tez
Finally, commit the container.
[root@b171915f8f63 conf]# exit
[root@kt1201 ~]# docker commit hadoop kt1201/hadoop:3.3.1