Public Repository

Last pushed: 4 months ago
Short Description
CentOS 7, Vagrant user, Puppet, Oracle JDK 1.8, Spark 2.1.1
Full Description

A base image for playing with Spark.
Built on top of the bashtoni/centos7-vagrant image for easier Vagrant integration.

Contents:

  • CentOS 7 (inherited)
  • Puppet 4 (inherited)
  • vagrant user (inherited)
  • Oracle JDK 1.8.0_131
  • Spark 2.1.1 (prebuilt for Hadoop 2.7)

Dockerfile:

# A Docker container to use with Vagrant
# CentOS 7, Vagrant user, Puppet, Oracle JDK 1.8, Spark 2.1.1, Hadoop 2.7

FROM bashtoni/centos7-vagrant
MAINTAINER mkleymenov

WORKDIR /home/vagrant

# Install Oracle JDK 1.8
ENV JDK_URL http://download.oracle.com/otn-pub/java/jdk/8u131-b11/d54c1d3a095b4ff2b6607d096fa80163/jdk-8u131-linux-x64.rpm
ENV JDK_RPM jdk-8u131-linux-x64.rpm
# Download and install the JDK RPM, add EPEL, then clean the yum cache last
# so that no package metadata is left behind in the layer.
RUN yum install -y wget tar &&\
      wget -nv --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; oraclelicense=accept-securebackup-cookie" $JDK_URL  &&\
      rpm -Uvh $JDK_RPM &&\
      rm -f $JDK_RPM &&\
      yum -y install epel-release &&\
      yum clean all &&\
      java -version
ENV JAVA_HOME /usr/java/latest
ENV JRE_HOME $JAVA_HOME/jre
ENV PATH $JAVA_HOME/bin:$JRE_HOME/bin:$PATH

# Install Apache Spark 2.1.1 with Hadoop 2.7
ENV SPARK_URL https://d3kbcqa49mib13.cloudfront.net/spark-2.1.1-bin-hadoop2.7.tgz

RUN curl -s ${SPARK_URL} | tar -xz -C /usr/local
RUN ln -s /usr/local/spark-2.1.1-bin-hadoop2.7 /usr/local/spark-latest

ENV SPARK_HOME /usr/local/spark-latest
ENV PATH $SPARK_HOME/bin:$SPARK_HOME/sbin:$PATH

ENV SPARK_MASTER_OPTS="-Dspark.driver.port=7001 -Dspark.fileserver.port=7002 -Dspark.broadcast.port=7003 \
    -Dspark.replClassServer.port=7004 -Dspark.blockManager.port=7005 -Dspark.executor.port=7006 -Dspark.ui.port=4040 \
    -Dspark.broadcast.factory=org.apache.spark.broadcast.HttpBroadcastFactory"
ENV SPARK_WORKER_OPTS="-Dspark.driver.port=7001 -Dspark.fileserver.port=7002 -Dspark.broadcast.port=7003 \
    -Dspark.replClassServer.port=7004 -Dspark.blockManager.port=7005 -Dspark.executor.port=7006 -Dspark.ui.port=4040 \
    -Dspark.broadcast.factory=org.apache.spark.broadcast.HttpBroadcastFactory"

ENV SPARK_MASTER_PORT 7077
ENV SPARK_MASTER_WEBUI_PORT 8080
ENV SPARK_WORKER_PORT 8888
ENV SPARK_WORKER_WEBUI_PORT 8081

EXPOSE 8080 7077 8888 8081 4040 7001 7002 7003 7004 7005 7006
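
Usage sketch (not part of the Dockerfile): EXPOSE only documents the ports, so they still have to be published when a container is started. The script below is a minimal sketch that builds one -p flag per port from the EXPOSE line and prints the resulting docker run command instead of executing it. The image tag spark-vagrant-base and the publish_flags helper are hypothetical names chosen for illustration; substitute the tag you actually built or pulled.

```shell
#!/bin/sh
# Ports copied from the EXPOSE line of the Dockerfile.
PORTS="8080 7077 8888 8081 4040 7001 7002 7003 7004 7005 7006"

# Hypothetical helper: emit one "-p host:container" flag per port,
# mapping every container port to the same port number on the host.
publish_flags() {
    flags=""
    for port in $PORTS; do
        flags="$flags -p $port:$port"
    done
    # Drop the leading space left over from the first iteration.
    echo "${flags# }"
}

# Placeholder image tag (an assumption, not the repository's real name).
IMAGE="${IMAGE:-spark-vagrant-base}"

# Print the command for review rather than running it.
echo "docker run -d $(publish_flags) $IMAGE"
```

Mapping each port to itself keeps the host side consistent with the values in SPARK_MASTER_OPTS and SPARK_WORKER_OPTS; if a host port is already taken, only the left-hand side of that -p pair needs to change.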
Owner
mkleymenov
