This image runs Spark with Hive support built in. It is based on openjdk:8-alpine
and is approximately 650 MB in size. See build-spark.sh for details on how Spark is built.
OpenSSL and the boto3 Python library are also installed.

Start an interactive Spark shell:

    $ docker run -it --rm makerstudios/spark-hive:2.0.1-hadoop-2.7.3
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    Setting default log level to "WARN".
    To adjust logging level use sc.setLogLevel(newLevel).
    16/10/24 18:06:46 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    16/10/24 18:06:47 WARN SparkContext: Use an existing SparkContext, some configuration may not take effect.
    Spark context Web UI available at http://172.17.0.2:4040
    Spark context available as 'sc' (master = local[*], app id = local-1477332407651).
    Spark session available as 'spark'.
    Welcome to
          ____              __
         / __/__  ___ _____/ /__
        _\ \/ _ \/ _ `/ __/  '_/
       /___/ .__/\_,_/_/ /_/\_\   version 2.0.1
          /_/

    Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_92-internal)
    Type in expressions to have them evaluated.
    Type :help for more information.

    scala>

Run a Python script with spark-submit by mounting it into the container:

    $ docker run -it --rm -v $(pwd)/script.py:/script.py makerstudios/spark-hive:2.0.1-hadoop-2.7.3 spark-submit /script.py
    ...
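
For illustration, a script.py mounted this way might look like the sketch below. It is not part of the image: the sample rows, the `people` view name, and the helper names are all hypothetical. It assumes that spark-submit makes `pyspark` importable inside the container, and it uses only a temporary view, so no Hive tables are created.

```python
# script.py: a minimal PySpark sketch (hypothetical example, not shipped with the image)
try:
    from pyspark.sql import SparkSession
except ImportError:
    # pyspark is expected to be provided by spark-submit inside the container
    SparkSession = None


def sample_rows():
    """Return a few made-up (name, age) rows used as demo input."""
    return [("alice", 34), ("bob", 27), ("carol", 41)]


def main():
    # Build a session with Hive support enabled, matching how the image is built
    spark = (
        SparkSession.builder
        .appName("spark-hive-example")
        .enableHiveSupport()
        .getOrCreate()
    )
    try:
        df = spark.createDataFrame(sample_rows(), ["name", "age"])
        # Register a temporary view so the data can be queried with SQL
        df.createOrReplaceTempView("people")
        spark.sql("SELECT name, age FROM people WHERE age >= 30 ORDER BY age").show()
    finally:
        spark.stop()


if __name__ == "__main__":
    if SparkSession is None:
        print("pyspark not found; run this script with spark-submit inside the image")
    else:
        main()
```

Because the container path in the `-v` mount and the argument to spark-submit are both `/script.py`, the file can be edited on the host and rerun without rebuilding the image.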