Last pushed: a year ago
Short Description
Learn Spark by doing hands-on labs (Spark Fundamentals I course) on your laptop or in the cloud
Full Description

Apache Spark Image for the Spark Fundamentals I course

Use this Docker image to create environments for the hands-on labs of the Spark Fundamentals I course on www.bigdatauniversity.com. You can use it to create a Spark environment on your own laptop or desktop, or on one of the supported public clouds.

This Docker image contains pre-deployed IBM STC Spark with Hadoop.

Set up a Docker environment on your laptop

How to use this image

Kitematic (GUI)

  1. Start Kitematic from the Docker folder

  2. Type bigdatauniversity in the search box to filter the Docker Hub catalog down to the images provided by Big Data University

  3. Click the Create button on the spark image to create a Docker container from this image

Docker Quickstart Terminal (CLI)

For Mac
-- "Applications -> Docker -> Docker Quickstart Terminal"
For Windows
-- "Start -> Program -> Docker -> Docker Quickstart Terminal"

Then run the steps below in this terminal.

1) Pull (download) this Docker image
Run this command in your terminal window:

docker pull bigdatauniversity/spark
  • Note: the image is large, so pulling it over the internet may take a while

2) Start the Docker container (interactively or as a daemon)

  • Interactive
docker run -it --hostname bigdatauniversitySpark --name bdu_spark -P -p 8080:8080 -p 8081:8081 bigdatauniversity/spark:latest /etc/bootstrap.sh -bash
  • Daemon
docker run -d --hostname bigdatauniversitySpark --name bdu_spark -P -p 8080:8080 -p 8081:8081 bigdatauniversity/spark:latest /etc/bootstrap.sh -d
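
On Windows, the Docker Quickstart Terminal is typically a Git Bash (MSYS) shell, which rewrites arguments that look like POSIX paths into Windows paths before handing them to docker. That would explain the "C:/Program Files/Git/etc/bootstrap.sh: no such file or directory" error reported in the comments on this page. The following is a minimal sketch of a workaround, assuming that diagnosis is correct: in MSYS-style shells, pass the script path with a doubled leading slash. The bootstrap_path helper is a hypothetical name and is not part of the image.

```shell
# Print the bootstrap script path to pass to `docker run`, given the output
# of `uname -s`. Hypothetical helper; not shipped with the image.
bootstrap_path() {
  case "$1" in
    # Git Bash / MSYS: a doubled leading slash keeps the shell from
    # rewriting the argument into a Windows path.
    MINGW*|MSYS*) printf '%s\n' '//etc/bootstrap.sh' ;;
    # Any other shell: pass the path unchanged.
    *)            printf '%s\n' '/etc/bootstrap.sh' ;;
  esac
}

# Usage (not run here):
#   docker run -d --hostname bigdatauniversitySpark --name bdu_spark -P \
#     -p 8080:8080 -p 8081:8081 bigdatauniversity/spark:latest \
#     "$(bootstrap_path "$(uname -s)")" -d
```

Prefixing the command with MSYS_NO_PATHCONV=1 is another commonly cited way to disable the rewriting in Git for Windows; treat it as an alternative to try rather than a confirmed fix for this image.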

3) Start Spark

  • To start the Scala Spark shell:
spark-shell
  • To start the Python Spark shell:
pyspark

4) Notes

  • All hands-on lab files are located in:
/opt/ibm/labfiles
  • How to restart and attach to the container

If you exit the container, you can restart it and attach to it later by running:

docker start bdu_spark
docker attach bdu_spark
  • Start a new command in a running container
docker exec -it bdu_spark <command>
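
The restart steps above can be wrapped in a small helper that starts the container only when it is not already running. This is a sketch, assuming the container was created with the name bdu_spark as in step 2; ensure_running is a hypothetical name.

```shell
# Start the named container unless `docker inspect` already reports it as
# running. Defaults to the bdu_spark name used in step 2.
ensure_running() {
  name="${1:-bdu_spark}"
  # Prints "true" for a running container; errors (for example, an unknown
  # container) are silenced and leave the state empty.
  state="$(docker inspect -f '{{.State.Running}}' "$name" 2>/dev/null)" || state=""
  if [ "$state" != "true" ]; then
    docker start "$name"
  fi
}

# Usage (not run here): make sure the container is up, then list the lab files.
#   ensure_running bdu_spark
#   docker exec -it bdu_spark ls /opt/ibm/labfiles
```

If the container was never created, docker start reports an error; in that case, run the docker run command from step 2 first.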

Supported tags

  • latest
  • 1.4.0
  • 1.3.1

Each numbered tag corresponds to a version of Spark.
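
To work with a specific Spark version, append the tag to the image name when pulling. A minimal sketch follows; pull_spark is a hypothetical helper name, and the tags are the ones listed above.

```shell
# Pull the image for a given tag; falls back to "latest" when no tag is given.
pull_spark() {
  docker pull "bigdatauniversity/spark:${1:-latest}"
}

# Usage (not run here):
#   pull_spark 1.4.0
#   pull_spark          # same as: docker pull bigdatauniversity/spark:latest
```

The same tag can be used in the docker run commands of step 2 in place of latest.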

Supported Docker versions

  • This image is officially supported on Docker version 1.6.0.
  • Support for older versions (down to 1.0) is provided on a best-effort basis.
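
Before pulling the image, you can compare the installed Docker client version against 1.6.0. The following is a sketch, assuming that docker --version prints a line of the form "Docker version X.Y.Z, build ..."; both helper names are hypothetical.

```shell
# Print the client version number, e.g. "1.6.0", taken from `docker --version`.
docker_client_version() {
  docker --version | sed 's/^Docker version \([0-9][0-9.]*\).*/\1/'
}

# Succeed when version $1 is greater than or equal to version $2.
# Compares major, minor, and patch fields numerically; missing fields count as 0.
version_ge() {
  IFS=. read -r a1 a2 a3 <<EOF
$1
EOF
  IFS=. read -r b1 b2 b3 <<EOF
$2
EOF
  # Drop any suffix after the numeric patch level (for example "0-ce" -> "0").
  a3="${a3%%[!0-9]*}"
  b3="${b3%%[!0-9]*}"
  [ "${a1:-0}" -gt "${b1:-0}" ] && return 0
  [ "${a1:-0}" -lt "${b1:-0}" ] && return 1
  [ "${a2:-0}" -gt "${b2:-0}" ] && return 0
  [ "${a2:-0}" -lt "${b2:-0}" ] && return 1
  [ "${a3:-0}" -ge "${b3:-0}" ]
}

# Usage (not run here):
#   if version_ge "$(docker_client_version)" 1.6.0; then
#     echo "Docker version is 1.6.0 or newer"
#   else
#     echo "Docker is older than 1.6.0; support is best-effort"
#   fi
```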

Community Support

Like this image? Give us a star at the top of this page!

Owner
bigdatauniversity

Comments (11)
prabu87
a year ago

not able to download/pull docker pull bigdatauniversity/spark

rajeshsadhu
a year ago

Can anyone please share the solution if the below error is resolved:
Error response from daemon: Cannot start container 3f716996bec4bce53219a3db3558221ddfab2c08e042cb1140b0f9e397f40685: [8] System error: exec: "C:/Program Files/Git/etc/bootstrap.sh": stat C:/Program Files/Git/etc/bootstrap.sh: no such file or directory

qiaoying
a year ago

Hi,

I followed this installation guide. However, there is no return from pyspark command. Any suggestions? Thanks a lot.

sbedoll
a year ago

I have the same issue. Is this being looked into ? I am assuming bootstrap.sh is part of the docker image.

docker run -d --hostname bigdatauniversitySpark --name bdu_spark -P -p 8080:8080 -p 8081:8081 bigdatauniversity/spark:latest /etc/bootstrap.sh -d
abacb451b79002f668bf93c6654153bcbda8416b438d597d798e6c257dbc64c0
Error response from daemon: Cannot start container abacb451b79002f668bf93c6654153bcbda8416b438d597d798e6c257dbc64c0: [8] System error: exec: "C:/Program Files/Git/etc/bootstrap.sh": stat C:/Program Files/Git/etc/bootstrap.sh: no such file or directory

bittur
a year ago

Hi,
I am new to docker
i have an issue
$ docker run -d --hostname bigdatauniversitySpark --name bdu_spark -P -p 8080:8080 -p 8081:8081 bigdatauniversity/spark:latest /etc/bootstrap.sh -d
56956e88c688d1587f651daa171a424976f7e303aeceae202fec9d5148a6af37
Error response from daemon: Cannot start container 56956e88c688d1587f651daa171a424976f7e303aeceae202fec9d5148a6af37: [8] System error: exec: "C:/Program Files/Git/etc/bootstrap.sh": stat C:/Program Files/Git/etc/bootstrap.sh: no such file or directory

ujjawalsinha
a year ago

not able to download/pull docker pull bigdatauniversity/spark

manasd
a year ago

bootstrap.sh missing !!! Has anyone been able to resolve the issue?

$ docker run -d --hostname bigdatauniversitySpark --name bdu_spark -P -p 8080:8080 -p 8081:8081 bigdatauniversity/spark:latest /etc/bootstrap.sh -d
cedb00c11479dbcde335e2aa4e96e2d6dbe20dc42d33318241910759019fff1b
Error response from daemon: Cannot start container cedb00c11479dbcde335e2aa4e96e2d6dbe20dc42d33318241910759019fff1b: [8] System error: exec: "C:/Program Files/Git/etc/bootstrap.sh": stat C:/Program Files/Git/etc/bootstrap.sh: no such file or directory

paubry
a year ago

Hello,

When I execute "docker run ....", I get this error:
C:/Program Files/Git/etc/bootstrap.sh: no such file or directory.

Any thoughts ?

Thank you in advance

mikeatlas
a year ago

@rvernica it's a large image intended for educational purposes, it definitely isn't exactly something you'd want to use in real dev/prod scenarios