Public Repository

Last pushed: 2 years ago
Full Description

Example of running a job on a Hadoop cluster using the aurelbruno06/hadoop_worker image.
The job is a Java WordCount that parses one file from the Gutenberg books. The book is an extract from the notebooks of Leonardo da Vinci.
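
To get a feel for what a word count produces, the same idea can be sketched locally with standard shell tools. This is only an approximation and not the Hadoop job itself: the sample sentence is made up, and splitting on whitespace is an assumption, since the tokenization used by WordCount.jar is not described here.

```shell
# Local, single-machine approximation of a word count (not the Hadoop job).
# Assumption: words are separated by whitespace; WordCount.jar may tokenize differently.
sample='the quick fox jumps over the lazy dog the end'

counts=$(printf '%s\n' "$sample" \
  | tr -s '[:space:]' '\n' \
  | sort \
  | uniq -c \
  | awk '{ print $2 "\t" $1 }')   # reorder to "word<TAB>count"

printf '%s\n' "$counts"
```

Each output line holds a word and the number of times it occurs, so `the` is reported with a count of 3 for this sample. The Hadoop job computes the same kind of table, but distributes the counting across the cluster.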
To use this image, run it interactively, then execute the following commands:

  • start the image after your cluster is running
    docker run -it --rm --net cluster-5_hadoop-ring --link master --entrypoint '/bin/bash' aurelbruno06/hadoop_job
  • inside the image
    sed s/HOSTNAME/master/ /usr/local/hadoop/etc/hadoop/core-site.xml.tmp > /usr/local/hadoop/etc/hadoop/core-site.xml
    hadoop fs -mkdir /data
    hadoop fs -put /root/5000-8.txt /data
    hadoop fs -ls /data
    hadoop jar /root/WordCount.jar WordCountDriver /data /results
    [ computation ]
    hadoop fs -cat '/results/part-*'
The Dockerfile and more details can be found in the hadoop directory.
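
The `sed` step in the commands above fills in the NameNode host by replacing the `HOSTNAME` placeholder in a template file. The sketch below reproduces that substitution on a throwaway file so it can be tried outside the container. The template contents here (the `fs.defaultFS` property and port 9000) are assumptions for illustration; the real `core-site.xml.tmp` in the image may look different.

```shell
# Hypothetical stand-in for core-site.xml.tmp; the real template may differ.
workdir=$(mktemp -d)
cat > "$workdir/core-site.xml.tmp" <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://HOSTNAME:9000</value>
  </property>
</configuration>
EOF

# Same substitution as in the steps above: replace HOSTNAME with "master".
sed 's/HOSTNAME/master/' "$workdir/core-site.xml.tmp" > "$workdir/core-site.xml"

cat "$workdir/core-site.xml"
```

Writing to a separate output file, as the original command does, leaves the template untouched, so the substitution can be rerun with a different host name if the master container is named differently.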