intelanalytics/hyper-zoo
The Analytics Zoo hyperzoo image is built to make it easy to run applications on a Kubernetes cluster. This page describes the pre-installed packages and how to use the image.
LEGAL NOTICE: By accessing, downloading or using this software and any required dependent software (the “Software Package”), you agree to the terms and conditions of the software license agreements for the Software Package, which may also include notices, disclaimers, or license terms for third party software included with the Software Package. Please refer to the “third-party-programs.txt” or other similarly-named text file for additional details.
Launch pre-built hyperzoo image
Pull the Analytics Zoo hyperzoo image from Docker Hub:
sudo docker pull intelanalytics/hyper-zoo:latest
Speed up pulling image by adding mirrors
To speed up pulling the image from Docker Hub in China, add a registry mirror. On Linux (CentOS, Ubuntu, etc.), if the Docker version is higher than 1.12, configure the Docker daemon: edit /etc/docker/daemon.json and add the registry-mirrors key and value:
{
"registry-mirrors": ["https://<my-docker-mirror-host>"]
}
For example, add the ustc mirror in China.
{
"registry-mirrors": ["https://docker.mirrors.ustc.edu.cn"]
}
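A syntax error in /etc/docker/daemon.json can prevent the Docker daemon from starting, so it may be worth checking the edited file before restarting. The sketch below is only an illustration: check_daemon_json is a hypothetical helper name, and it assumes python3 is installed on the host (it uses Python's standard json.tool module purely as a JSON parser).

```shell
# Hypothetical helper: returns 0 if the file parses as JSON, non-zero otherwise.
# Assumes python3 is available on the host.
check_daemon_json() {
    python3 -m json.tool "$1" > /dev/null 2>&1
}

# Path taken from the text above; set DAEMON_JSON to check a copy instead.
DAEMON_JSON="${DAEMON_JSON:-/etc/docker/daemon.json}"

if check_daemon_json "$DAEMON_JSON"; then
    echo "OK: $DAEMON_JSON is valid JSON"
else
    echo "ERROR: $DAEMON_JSON is missing or is not valid JSON" >&2
fi
```

If the check reports an error, fix the file before restarting Docker.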
Reload systemd and restart Docker so that the change takes effect:
sudo systemctl daemon-reload
sudo systemctl restart docker
To speed up pulling this image on macOS or Windows, open the Docker settings, set the mirror host in the registry-mirrors section, and restart Docker.
Then pull the image; the download should be faster.
sudo docker pull intelanalytics/hyper-zoo:latest
Launch a k8s client container
Note that there are two kinds of containers. The client container is where you submit zoo jobs from; it contains all the required environment and libraries except the Hadoop/K8s configs. Executor containers do not need to be created manually; K8s schedules them at runtime.
sudo docker run -itd --net=host \
-v /etc/kubernetes:/etc/kubernetes \
-v /root/.kube:/root/.kube \
intelanalytics/hyper-zoo:latest bash
To specify more arguments, use:
sudo docker run -itd --net=host \
-v /etc/kubernetes:/etc/kubernetes \
-v /root/.kube:/root/.kube \
-e NotebookPort=12345 \
-e NotebookToken="your-token" \
-e http_proxy=http://your-proxy-host:your-proxy-port \
-e https_proxy=https://your-proxy-host:your-proxy-port \
-e RUNTIME_SPARK_MASTER=k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port> \
-e RUNTIME_K8S_SERVICE_ACCOUNT=account \
-e RUNTIME_K8S_SPARK_IMAGE=intelanalytics/hyper-zoo:latest \
-e RUNTIME_PERSISTENT_VOLUME_CLAIM=myvolumeclaim \
-e RUNTIME_DRIVER_HOST=x.x.x.x \
-e RUNTIME_DRIVER_PORT=54321 \
-e RUNTIME_EXECUTOR_INSTANCES=1 \
-e RUNTIME_EXECUTOR_CORES=4 \
-e RUNTIME_EXECUTOR_MEMORY=20g \
-e RUNTIME_TOTAL_EXECUTOR_CORES=4 \
-e RUNTIME_DRIVER_CORES=4 \
-e RUNTIME_DRIVER_MEMORY=10g \
intelanalytics/hyper-zoo:latest bash
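As an alternative to a long list of -e flags, the same settings could be kept in a file and passed with Docker's --env-file option. The sketch below is one possible arrangement, not part of the image's documentation: the file name hyperzoo.env and the wrapper function run_hyperzoo_client are hypothetical, while the variable names and placeholder values are copied from the command above. Values in an env file are taken literally, so the token is written without quotes.

```shell
# Hypothetical file name; any path works.
ENV_FILE="${ENV_FILE:-hyperzoo.env}"

# Same variables as in the command above; every value is a placeholder to replace.
cat > "$ENV_FILE" <<'EOF'
NotebookPort=12345
NotebookToken=your-token
http_proxy=http://your-proxy-host:your-proxy-port
https_proxy=https://your-proxy-host:your-proxy-port
RUNTIME_SPARK_MASTER=k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port>
RUNTIME_K8S_SERVICE_ACCOUNT=account
RUNTIME_K8S_SPARK_IMAGE=intelanalytics/hyper-zoo:latest
RUNTIME_PERSISTENT_VOLUME_CLAIM=myvolumeclaim
RUNTIME_DRIVER_HOST=x.x.x.x
RUNTIME_DRIVER_PORT=54321
RUNTIME_EXECUTOR_INSTANCES=1
RUNTIME_EXECUTOR_CORES=4
RUNTIME_EXECUTOR_MEMORY=20g
RUNTIME_TOTAL_EXECUTOR_CORES=4
RUNTIME_DRIVER_CORES=4
RUNTIME_DRIVER_MEMORY=10g
EOF

# Hypothetical wrapper: starts the client container with the settings from the file.
# Defining the function does not start anything; call it explicitly.
run_hyperzoo_client() {
    sudo docker run -itd --net=host \
        -v /etc/kubernetes:/etc/kubernetes \
        -v /root/.kube:/root/.kube \
        --env-file "$ENV_FILE" \
        intelanalytics/hyper-zoo:latest bash
}
```

Calling run_hyperzoo_client then starts the container in the same way as the command above, with the settings kept in one editable file.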
Once the container is created, enter it with:
sudo docker exec -it <containerID> bash
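The <containerID> placeholder in the command above can be looked up with docker ps. A minimal sketch, assuming the container was started from intelanalytics/hyper-zoo:latest as shown earlier; the function name hyperzoo_container_ids and the DOCKER override variable are hypothetical conveniences, not something the image provides.

```shell
# DOCKER is a hypothetical override hook; it defaults to the "sudo docker" used on this page.
DOCKER="${DOCKER:-sudo docker}"

# Print the IDs of running containers created from the given image
# (defaults to the hyperzoo image used on this page).
hyperzoo_container_ids() {
    $DOCKER ps --quiet --filter "ancestor=${1:-intelanalytics/hyper-zoo:latest}"
}
```

Running hyperzoo_container_ids prints one ID per matching running container; pass the one you want to docker exec.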
Inside the container, the shell prompt looks like this:
root@[hostname]:/opt/spark/work-dir#
/opt/spark/work-dir is the Spark working directory.
Note: The /opt directory contains:
README URL: https://github.com/intel-analytics/analytics-zoo/blob/master/docker/hyperzoo/README.md