Public | Automated Build

Last pushed: 6 days ago
Short Description
Impala development environment
Full Description

Dockerfiles for an Impala development environment

Images

Several images are available, but only two are intended for use:

Complete

The complete image has Impala built and test data loaded. Because of the size of this image, the time required to build it, and the limitations of Docker Hub, the image is not available for download, but it can be built locally. The built image is about 50 GB, and the build takes 1-4 hours depending on your system.
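Because the build needs roughly 50 GB, it may be worth checking free disk space before starting. The following is a minimal sketch, not part of this repository: has_free_gb is a hypothetical helper, and /var/lib/docker is only assumed to be where Docker stores its images (adjust the path if your installation differs).

```shell
# Hypothetical helper (not part of this repo): succeeds only if the
# filesystem holding DIR has at least NEED_GB gigabytes available.
has_free_gb() {
  dir=$1
  need_gb=$2
  # df -P prints one line per filesystem; -k reports 1024-byte blocks.
  # Column 4 of the second line is the available space.
  avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
  [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}

# Assumption: Docker keeps its images under /var/lib/docker.
docker_dir=/var/lib/docker
if [ -d "$docker_dir" ] && ! has_free_gb "$docker_dir" 50; then
  echo "Less than 50 GB free under $docker_dir" >&2
fi
```

The same check with a smaller threshold (about 5 GB) would apply before pulling the minimal image.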

To build the complete image:

# Get the latest version of the "minimal" image. Only really needed
# if you already have a minimal image from before Apr 10, 2016. That
# image would be incompatible.
$ docker pull cloudera/impala-dev:minimal

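# Build the image. The -t option tags the result so that the
# "docker run" commands below can find it by name.
$ docker build -t cloudera/impala-dev:complete complete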

Using the complete image:

$ docker run -i -t cloudera/impala-dev:complete /bin/bash
[container]$ docker-boot   # starts Postgres and SSH, both needed to run Impala
[container]$ . bin/impala-config.sh   # sets the Impala environment variables
[container]$ run-all.sh   # starts dependent services -- HDFS, Hive metastore, etc
[container]$ start-impala-cluster.py
[container]$ impala-shell.sh
[localhost:21000] > select count(*) from tpch.lineitem;

The image can also be run in the background and logged into over SSH:

(Note the "-d" and the lack of the trailing "/bin/bash".)

$ docker run -d -t cloudera/impala-dev:complete
<some hash>
$ docker inspect <some hash> | grep IPAddress
<output showing the IP address>
$ ssh dev@<IP address>   # password is cloudera
[container]$ . bin/impala-config.sh   # sets the Impala environment variables
[container]$ run-all.sh   # starts dependent services -- HDFS, Hive metastore, etc
[container]$ start-impala-cluster.py
[container]$ impala-shell.sh
[localhost:21000] > select count(*) from tpch.lineitem;

This works because the default command for the image runs "docker-boot", which starts an SSH service.
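The grep step above prints every line that mentions IPAddress, so the address still has to be picked out by eye. As a small sketch, the hypothetical first_ip helper below (not part of the image or of Docker) extracts the first non-empty address from the JSON that docker inspect writes to standard output.

```shell
# Hypothetical helper: print the first non-empty "IPAddress" value
# found in `docker inspect` JSON read from standard input.
first_ip() {
  sed -n 's/.*"IPAddress": *"\([^"][^"]*\)".*/\1/p' | head -n 1
}

# Intended usage, where <some hash> is the ID printed by "docker run -d":
#   docker inspect <some hash> | first_ip
```

The result can then be passed to ssh, for example ssh dev@<IP address> as shown above.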

Minimal

The minimal image has Impala built, but the test data is not loaded. The image is about 5 GB and can be downloaded from Docker Hub or built locally.

Using the minimal image:

$ docker run -i -t cloudera/impala-dev:minimal /bin/bash
[container]$ docker-boot   # starts Postgres and SSH, both needed to run Impala
[container]$ cd Impala
[container]$ . bin/impala-config.sh   # sets the Impala environment variables
[container]$ ./buildall.sh -format -skiptests
[container]$ run-all.sh   # starts dependent services -- HDFS, Hive metastore, etc
[container]$ start-impala-cluster.py
[container]$ impala-shell.sh
[localhost:21000] > create database test;

This image can also be started in the background, in the same way as the complete image.

If you want to load the test data manually inside the minimal instance, see the necessary steps in complete/Dockerfile.

Other Images

The remaining images exist only as workarounds for the limitations of Docker Hub. Compiling code on Docker Hub is slow, and builds have a 2-hour timeout, so the build of the minimal image had to be split into several steps to avoid the timeout.

Prebuilt images are hosted on Docker Hub.

For more information see the Impala wiki or ask a question on the dev user group.

Owner
cloudera

Comments (8)
niteshthali08
11 days ago

@gongrui365 were you able to resolve the error?

niteshthali08
11 days ago

I'm facing the same error while building Impala that many people have reported below.

Error in /home/dev/Impala/testdata/bin/run-mini-dfs.sh at line 40: $IMPALA_HOME/testdata/cluster/admin start_cluster
Error in /home/dev/Impala/testdata/bin/run-all.sh at line 42: tee ${IMPALA_CLUSTER_LOGS_DIR}/run-mini-dfs.log
The command '/bin/sh -c docker-boot && . bin/impala-config.sh && mkdir -p $IMPALA_HOME/testdata/impala-data && pushd $IMPALA_HOME/testdata/impala-data && cat /tmp/tpch.tar.gz{0..6} > tpch.tar.gz && tar -xzf tpch.tar.gz && rm tpch.tar.gz && cat /tmp/tpcds.tar.gz{0..3} > tpcds.tar.gz && tar -xzf tpcds.tar.gz && rm tpcds.tar.gz && popd && ./buildall.sh -notests -noclean -format -testdata && sudo rm -rf $IMPALA_HOME/testdata/impala-data' returned a non-zero code: 1

lovetoken
11 days ago

Hello.
When I try to pull the cloudera/impala-dev:minimal image, I get a "not found" error.
The docker command and its output are below.

docker@boot2docker:~$ sudo docker pull cloudera/impala-dev:minimal
Pulling repository docker.io/cloudera/impala-dev
Tag minimal not found in repository docker.io/cloudera/impala-dev

I also tried the latest tag, but it was not found either.

gongrui365
8 months ago

Hi
When trying to build the complete image, some errors occur.

Starting kudu (Web UI - http://localhost:8051)
/home/dev/Impala/testdata/cluster/cdh5/node-3/etc/init.d/common: line 18: 6634 Aborted "$CMD" "$@" &> "$LOG_FILE"
Failed to start kudu-tserver. The end of the log (/home/dev/Impala/testdata/cluster/cdh5/node-3/var/log/kudu-tserver.out) is:
/home/dev/Impala/testdata/cluster/cdh5/node-2/etc/init.d/common: line 18: 6656 Aborted "$CMD" "$@" &> "$LOG_FILE"
Failed to start kudu-tserver. The end of the log (/home/dev/Impala/testdata/cluster/cdh5/node-2/var/log/kudu-tserver.out) is:
F0718 19:55:15.575865 6634 tablet_server_main.cc:55] Check failed: _s.ok() Bad status: Service unavailable: Cannot initialize clock: Error reading clock. Clock considered unsynchronized
Check failure stack trace:
@ 0x7cffed (/var/lib/docker/aufs/diff/de1124ef9929e3ec0722b9120f684d9b43c6611e19f7e1d8674de217c98b1d40/home/dev/Impala/toolchain/kudu-0.8.0-RC1/release/bin/kudu-tserver+0x7cffec)
@ 0x7d1eed (/var/lib/docker/aufs/diff/de1124ef9929e3ec0722b9120f684d9b43c6611e19f7e1d8674de217c98b1d40/home/dev/Impala/toolchain/kudu-0.8.0-RC1/release/bin/kudu-tserver+0x7d1eec)
@ 0x7cfb29 (/var/lib/docker/aufs/diff/de1124ef9929e3ec0722b9120f684d9b43c6611e19f7e1d8674de217c98b1d40/home/dev/Impala/toolchain/kudu-0.8.0-RC1/release/bin/kudu-tserver+0x7cfb28)
@ 0x7d298f (/var/lib/docker/aufs/diff/de1124ef9929e3ec0722b9120f684d9b43c6611e19f7e1d8674de217c98b1d40/home/dev/Impala/toolchain/kudu-0.8.0-RC1/release/bin/kudu-tserver+0x7d298e)
@ 0x77b37b (/var/lib/docker/aufs/diff/de1124ef9929e3ec0722b9120f684d9b43c6611e19f7e1d8674de217c98b1d40/home/dev/Impala/toolchain/kudu-0.8.0-RC1/release/bin/kudu-tserver+0x77b37a)
@ 0x7ff3ee0ccec5 (/var/lib/docker/aufs/diff/426a0cc6d7b020d5e908102c81d9b76a4ca6ae41842ba0a80afe6e2fe40652a4/lib/x86_64-linux-gnu/libc-2.19.so+0x21ec4)
@ 0x77ae36 (/var/lib/docker/aufs/diff/de1124ef9929e3ec0722b9120f684d9b43c6611e19f7e1d8674de217c98b1d40/home/dev/Impala/toolchain/kudu-0.8.0-RC1/release/bin/kudu-tserver+0x77ae35)
@ (nil) (unknown)
F0718 19:55:15.575858 6656 tablet_server_main.cc:55] Check failed: _s.ok() Bad status: Service unavailable: Cannot initialize clock: Error reading clock. Clock considered unsynchronized
Check failure stack trace:
@ 0x7cffed (/var/lib/docker/aufs/diff/de1124ef9929e3ec0722b9120f684d9b43c6611e19f7e1d8674de217c98b1d40/home/dev/Impala/toolchain/kudu-0.8.0-RC1/release/bin/kudu-tserver+0x7cffec)
@ 0x7d1eed (/var/lib/docker/aufs/diff/de1124ef9929e3ec0722b9120f684d9b43c6611e19f7e1d8674de217c98b1d40/home/dev/Impala/toolchain/kudu-0.8.0-RC1/release/bin/kudu-tserver+0x7d1eec)
@ 0x7cfb29 (/var/lib/docker/aufs/diff/de1124ef9929e3ec0722b9120f684d9b43c6611e19f7e1d8674de217c98b1d40/home/dev/Impala/toolchain/kudu-0.8.0-RC1/release/bin/kudu-tserver+0x7cfb28)
@ 0x7d298f (/var/lib/docker/aufs/diff/de1124ef9929e3ec0722b9120f684d9b43c6611e19f7e1d8674de217c98b1d40/home/dev/Impala/toolchain/kudu-0.8.0-RC1/release/bin/kudu-tserver+0x7d298e)
@ 0x77b37b (/var/lib/docker/aufs/diff/de1124ef9929e3ec0722b9120f684d9b43c6611e19f7e1d8674de217c98b1d40/home/dev/Impala/toolchain/kudu-0.8.0-RC1/release/bin/kudu-tserver+0x77b37a)
@ 0x7fece23ddec5 (/var/lib/docker/aufs/diff/426a0cc6d7b020d5e908102c81d9b76a4ca6ae41842ba0a80afe6e2fe40652a4/lib/x86_64-linux-gnu/libc-2.19.so+0x21ec4)
@ 0x77ae36 (/var/lib/docker/aufs/diff/de1124ef9929e3ec0722b9120f684d9b43c6611e19f7e1d8674de217c98b1d40/home/dev/Impala/toolchain/kudu-0.8.0-RC1/release/bin/kudu-tserver+0x77ae35)
@ (nil) (unknown)
/home/dev/Impala/testdata/cluster/cdh5/node-1/etc/init.d/common: line 18: 6675 Aborted "$CMD" "$@" &> "$LOG_FILE"
Failed to start kudu-master. The end of the log (/home/dev/Impala/testdata/cluster/cdh5/node-1/var/log/kudu-master.out) is:
F0718 19:55:15.523524 6675 master_main.cc:59] Check failed: _s.ok() Bad status: Service unavailable: Cannot initialize clock: Error reading clock. Clock considered unsynchronized
Check failure stack trace:
@ 0x7d63bd (/var/lib/docker/aufs/diff/de1124ef9929e3ec0722b9120f684d9b43c6611e19f7e1d8674de217c98b1d40/home/dev/Impala/toolchain/kudu-0.8.0-RC1/release/bin/kudu-master+0x7d63bc)
@ 0x7d82bd (/var/lib/docker/aufs/diff/de1124ef9929e3ec0722b9120f684d9b43c6611e19f7e1d8674de217c98b1d40/home/dev/Impala/toolchain/kudu-0.8.0-RC1/release/bin/kudu-master+0x7d82bc)
@ 0x7d5ef9 (/var/lib/docker/aufs/diff/de1124ef9929e3ec0722b9120f684d9b43c6611e19f7e1d8674de217c98b1d40/home/dev/Impala/toolchain/kudu-0.8.0-RC1/release/bin/kudu-master+0x7d5ef8)
@ 0x7d8d5f (/var/lib/docker/aufs/diff/de1124ef9929e3ec0722b9120f684d9b43c6611e19f7e1d8674de217c98b1d40/home/dev/Impala/toolchain/kudu-0.8.0-RC1/release/bin/kudu-master+0x7d8d5e)
@ 0x788d1b (/var/lib/docker/aufs/diff/de1124ef9929e3ec0722b9120f684d9b43c6611e19f7e1d8674de217c98b1d40/home/dev/Impala/toolchain/kudu-0.8.0-RC1/release/bin/kudu-master+0x788d1a)
@ 0x7f465241eec5 (/var/lib/docker/aufs/diff/426a0cc6d7b020d5e908102c81d9b76a4ca6ae41842ba0a80afe6e2fe40652a4/lib/x86_64-linux-gnu/libc-2.19.so+0x21ec4)
@ 0x7887d6 (/var/lib/docker/aufs/diff/de1124ef9929e3ec0722b9120f684d9b43c6611e19f7e1d8674de217c98b1d40/home/dev/Impala/toolchain/kudu-0.8.0-RC1/release/bin/kudu-master+0x7887d5)
@ (nil) (unknown)
/home/dev/Impala/testdata/cluster/cdh5/node-1/etc/init.d/common: line 18: 6684 Aborted "$CMD" "$@" &> "$LOG_FILE"
Failed to start kudu-tserver. The end of the log (/home/dev/Impala/testdata/cluster/cdh5/node-1/var/log/kudu-tserver.out) is:
F0718 19:55:15.576165 6684 tablet_server_main.cc:55] Check failed: _s.ok() Bad status: Service unavailable: Cannot initialize clock: Error reading clock. Clock considered unsynchronized
Check failure stack trace:
@ 0x7cffed (/var/lib/docker/aufs/diff/de1124ef9929e3ec0722b9120f684d9b43c6611e19f7e1d8674de217c98b1d40/home/dev/Impala/toolchain/kudu-0.8.0-RC1/release/bin/kudu-tserver+0x7cffec)
@ 0x7d1eed (/var/lib/docker/aufs/diff/de1124ef9929e3ec0722b9120f684d9b43c6611e19f7e1d8674de217c98b1d40/home/dev/Impala/toolchain/kudu-0.8.0-RC1/release/bin/kudu-tserver+0x7d1eec)
@ 0x7cfb29 (/var/lib/docker/aufs/diff/de1124ef9929e3ec0722b9120f684d9b43c6611e19f7e1d8674de217c98b1d40/home/dev/Impala/toolchain/kudu-0.8.0-RC1/release/bin/kudu-tserver+0x7cfb28)
@ 0x7d298f (/var/lib/docker/aufs/diff/de1124ef9929e3ec0722b9120f684d9b43c6611e19f7e1d8674de217c98b1d40/home/dev/Impala/toolchain/kudu-0.8.0-RC1/release/bin/kudu-tserver+0x7d298e)
@ 0x77b37b (/var/lib/docker/aufs/diff/de1124ef9929e3ec0722b9120f684d9b43c6611e19f7e1d8674de217c98b1d40/home/dev/Impala/toolchain/kudu-0.8.0-RC1/release/bin/kudu-tserver+0x77b37a)
@ 0x7fb2c65a5ec5 (/var/lib/docker/aufs/diff/426a0cc6d7b020d5e908102c81d9b76a4ca6ae41842ba0a80afe6e2fe40652a4/lib/x86_64-linux-gnu/libc-2.19.so+0x21ec4)
@ 0x77ae36 (/var/lib/docker/aufs/diff/de1124ef9929e3ec0722b9120f684d9b43c6611e19f7e1d8674de217c98b1d40/home/dev/Impala/toolchain/kudu-0.8.0-RC1/release/bin/kudu-tserver+0x77ae35)
@ (nil) (unknown)
Error in /home/dev/Impala/testdata/bin/run-mini-dfs.sh at line 24: $IMPALA_HOME/testdata/cluster/admin start_cluster
Error in /home/dev/Impala/testdata/bin/run-all.sh at line 38: tee ${IMPALA_CLUSTER_LOGS_DIR}/run-mini-dfs.log
Error in ./buildall.sh at line 341: $IMPALA_HOME/testdata/bin/run-all.sh -format
The command '/bin/sh -c docker-boot && . bin/impala-config.sh && mkdir -p $IMPALA_HOME/testdata/impala-data && pushd $IMPALA_HOME/testdata/impala-data && cat /tmp/tpch.tar.gz{0..6} > tpch.tar.gz && tar -xzf tpch.tar.gz && rm tpch.tar.gz && cat /tmp/tpcds.tar.gz{0..3} > tpcds.tar.gz && tar -xzf tpcds.tar.gz && rm tpcds.tar.gz && popd && ./buildall.sh -notests -noclean -format -testdata && sudo rm -rf $IMPALA_HOME/testdata/impala-data' returned a non-zero code: 1

What should I do?

nutthaphon
8 months ago

Error after running
./buildall.sh -format -skiptests

[ 80%] Building CXX object be/src/exprs/CMakeFiles/expr-test.dir/expr-test.cc.o
g++: internal compiler error: Killed (program cc1plus)
0x40c94c../../gcc-4.9.2/gcc/gcc.c:2854
0x40cd14../../gcc-4.9.2/gcc/gcc.c:4658
0x40f5d6../../gcc-4.9.2/gcc/gcc.c:5941
0x40f5d6../../gcc-4.9.2/gcc/gcc.c:5855
0x40db89../../gcc-4.9.2/gcc/gcc.c:5312
0x40f5d6../../gcc-4.9.2/gcc/gcc.c:5941
0x40f5d6../../gcc-4.9.2/gcc/gcc.c:5855
0x40db89../../gcc-4.9.2/gcc/gcc.c:5312
0x40d8f3../../gcc-4.9.2/gcc/gcc.c:5427
0x40f5d6../../gcc-4.9.2/gcc/gcc.c:5941
0x40f5d6../../gcc-4.9.2/gcc/gcc.c:5855
0x40db89../../gcc-4.9.2/gcc/gcc.c:5312
0x40f5d6../../gcc-4.9.2/gcc/gcc.c:5941
0x40f5d6../../gcc-4.9.2/gcc/gcc.c:5855
0x40db89../../gcc-4.9.2/gcc/gcc.c:5312
0x40f5d6../../gcc-4.9.2/gcc/gcc.c:5941
0x40f5d6../../gcc-4.9.2/gcc/gcc.c:5855
0x40db89../../gcc-4.9.2/gcc/gcc.c:5312
0x40f5d6../../gcc-4.9.2/gcc/gcc.c:5941
0x40f5d6../../gcc-4.9.2/gcc/gcc.c:5855
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See http://gcc.gnu.org/bugs.html for instructions.
make[2]: *** [be/src/exprs/CMakeFiles/expr-test.dir/expr-test.cc.o] Error 4
make[1]: *** [be/src/exprs/CMakeFiles/expr-test.dir/all] Error 2
make: *** [all] Error 2
Error in /home/dev/Impala/bin/make_impala.sh at line 146: ${MAKE_CMD} -j${IMPALA_BUILD_THREADS:-4}
Error in ./buildall.sh at line 314: $IMPALA_HOME/bin/make_impala.sh ${MAKE_IMPALA_ARGS}

nutthaphon
9 months ago

How much disk space is needed for this extraction step?

4bd97af9c325: Pull complete
770bb900ce8a: Extracting [==================================================>] 512.3 MB/512.3 MB
1d1f372c0bda: Download complete
failed to register layer: Untar re-exec error: exit status 1: output: write /home/dev/Impala/be/src/runtime/CMakeFiles/Runtime.dir/disk-io-mgr.cc.o: no space left on device

caseyching
a year ago

Sorry about the late response. Did you ever get this to work? I'm guessing the problem has something to do with dockerhub. Maybe retrying would help.

vladif
a year ago

Hello,

When trying to download the image with
docker pull cloudera/impala-dev:minimal
the download gets stuck:

b08943edcb2c: Downloading [===========================> ] 679.6 MB/1.232 GB
7a99762128f5: Download complete
0028aa745d8e: Downloading [==============================> ] 302.2 MB/498.4 MB
f5fb02b83fa7: Downloading [===========================> ] 196.7 MB/354.4 MB