segence/spark

By segence

Updated almost 4 years ago

Apache Spark Docker

Docker images of Apache Spark.

Images

Image | Description | Dockerfile
----- | ----------- | ----------
Base | Spark base image with default installation. Only Avro library is added on top of official installation. | Dockerfile
Cloud | Contains the Hadoop AWS library to access S3. | Dockerfile.cloud

Base

Base image containing the default Spark installation, with the Avro library added.

Cloud

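Reading data from AWS S3 (assumes an existing spark session, as in spark-shell):

// Read a JSON file from S3 via the s3a:// scheme.
val jsonDf = spark.read.json("s3a://mybucket/sth.json")
// Read an Avro file from S3.
val avroDf = spark.read.format("avro").load("s3a://mybucket/sth.avro")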
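
Reading from an s3a:// path requires AWS credentials to be available to the Hadoop S3A connector. The following is a minimal sketch of one way to supply them, not a documented requirement of this image. It assumes the credentials are exported in the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables; the application name and the variable name events are illustrative only.

```scala
import org.apache.spark.sql.SparkSession

// Reuses the running session in spark-shell, or creates one otherwise.
// The application name is an arbitrary placeholder.
val spark = SparkSession.builder()
  .appName("s3a-read-example")
  .getOrCreate()

// Pass the credentials to the S3A connector through the Hadoop configuration.
// Assumption: both variables are set; sys.env(...) throws
// NoSuchElementException if either one is missing.
val hadoopConf = spark.sparkContext.hadoopConfiguration
hadoopConf.set("fs.s3a.access.key", sys.env("AWS_ACCESS_KEY_ID"))
hadoopConf.set("fs.s3a.secret.key", sys.env("AWS_SECRET_ACCESS_KEY"))

// With credentials in place, s3a:// paths can be read as usual.
val events = spark.read.json("s3a://mybucket/sth.json")
events.printSchema()
```

Setting the keys on the Hadoop configuration keeps them out of the code itself. Other mechanisms, such as instance profiles, may suit a given deployment better.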

Docker Pull Command

docker pull segence/spark