Short Description
Docker container for Apache Hive with hiveserver2
Full Description

docker-hive

This is a Docker container for Apache Hive 2.1.1. It is based on https://github.com/big-data-europe/docker-hadoop, so check there for the Hadoop configuration.
It deploys Hive and starts a hiveserver2 instance on port 10000.
The metastore runs against a PostgreSQL database.
Hive is configured through HIVE_SITE_CONF_* environment variables (see hadoop-hive.env for an example).
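
As a rough illustration of how such a variable could map to a hive-site.xml property, the sketch below assumes the prefix is spelled HIVE_SITE_CONF_ and that the suffix follows the naming convention used by the docker-hadoop base image (`___` becomes `-`, `__` becomes `_`, `_` becomes `.`). The helper name `to_hive_property` is hypothetical and not part of the image; check the image's entrypoint script for the actual rules.

```shell
# Hypothetical helper: convert the suffix of a HIVE_SITE_CONF_ variable name
# into a Hive property name, using the ASSUMED convention
#   ___ -> -     __ -> _     _ -> .
to_hive_property() {
  # '@' is a temporary placeholder so double underscores survive the
  # single-underscore substitution.
  printf '%s\n' "$1" | sed -e 's/___/-/g' -e 's/__/@/g' -e 's/_/./g' -e 's/@/_/g'
}

to_hive_property javax_jdo_option_ConnectionURL   # prints javax.jdo.option.ConnectionURL
to_hive_property hive_metastore_uris              # prints hive.metastore.uris
```

Under that assumption, an entry such as `HIVE_SITE_CONF_hive_metastore_uris=...` in the env file would set the `hive.metastore.uris` property.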

To build and run Hive with postgresql metastore:

    docker-compose build
    docker-compose up -d namenode hive-metastore-postgresql
    docker-compose up -d datanode hive-metastore
    docker-compose up -d hive-server

The hive-metastore service depends on hive-metastore-postgresql, which must be up and running before you start hive-metastore.
The hive-server service depends on the hive-metastore service.
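
One way to respect this start-up order in a script is to poll until the dependency answers before starting the next service. The following is a minimal sketch, not part of this repository: `wait_until` is a hypothetical helper, and the readiness check shown in the comments assumes the PostgreSQL image ships `pg_isready`.

```shell
# Hypothetical helper: retry a command once per second until it succeeds
# or the timeout (in seconds) has elapsed.
wait_until() {
  timeout="$1"; shift
  waited=0
  until "$@" >/dev/null 2>&1; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
}

# Example usage (requires the compose project from this repository;
# pg_isready inside the container is an assumption):
#   docker-compose up -d namenode hive-metastore-postgresql
#   wait_until 60 docker-compose exec -T hive-metastore-postgresql pg_isready
#   docker-compose up -d datanode hive-metastore
```

Polling a readiness command is more robust than a fixed `sleep`, because the database may take a different amount of time to initialize on each machine.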

To run PrestoDB 0.181 with the Hive connector:

    docker-compose up -d presto-coordinator

This deploys a Presto server that listens on port 8080.

Testing

Load data into Hive:

    $ docker exec -it hive-server bash
    # /opt/hive/bin/beeline -u jdbc:hive2://localhost:10000
    > CREATE TABLE pokes (foo INT, bar STRING);
    > LOAD DATA LOCAL INPATH '/opt/hive/examples/files/kv1.txt' OVERWRITE INTO TABLE pokes;

Then query it from PrestoDB. You can get presto.jar, the Presto command-line client, from the PrestoDB website:

    $ wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.183/presto-cli-0.183-executable.jar
    $ mv presto-cli-0.183-executable.jar presto.jar
    $ chmod +x presto.jar
    $ ./presto.jar --server localhost:8080 --catalog hive --schema default
    presto> select * from pokes;
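
The same checks can also be run non-interactively, which is convenient for scripted smoke tests. The sketch below wraps beeline's `-e` option and the Presto CLI's `--execute` option in two hypothetical functions, `hive_query` and `presto_query`. The `DOCKER` and `PRESTO_CLI` variables are illustrative overrides introduced here, and presto.jar is assumed to be in the current directory.

```shell
# Hypothetical overrides so the commands can be swapped out if needed.
DOCKER="${DOCKER:-docker}"
PRESTO_CLI="${PRESTO_CLI:-./presto.jar}"

# Run one HiveQL statement through beeline inside the hive-server container.
hive_query() {
  "$DOCKER" exec hive-server /opt/hive/bin/beeline \
    -u jdbc:hive2://localhost:10000 -e "$1"
}

# Run one SQL statement through the Presto CLI against the hive catalog.
presto_query() {
  "$PRESTO_CLI" --server localhost:8080 --catalog hive --schema default \
    --execute "$1"
}

# Example usage (requires the running stack and the pokes table loaded above):
#   hive_query 'SELECT COUNT(*) FROM pokes;'
#   presto_query 'SELECT COUNT(*) FROM pokes;'
```

If both queries return the same row count, Hive and the Presto Hive connector are reading the same table.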

Owner
bde2020