This Python code implements k-means clustering. It uses the standard k-means method when the txt file is small; otherwise it uses the k-mean+* method.
To run this code in a container, use
docker run -itv yourdirectory:/mnt lanzhong/k-meancluster /opt/spark/bin/spark-submit /src/k_mean_cluster.py filename k
yourdirectory : local directory to view the clustering results
lanzhong/k-meancluster : the image name
filename : name of the txt file containing the dataset
k : the number of clusters
For example:
docker run -itv /Users/lanzhong/Desktop:/mnt lanzhong/k-meancluster /opt/spark/bin/spark-submit /src/k_mean_cluster.py 'mickey1.txt' 3
The visualization of the clustering is stored in /Users/lanzhong/Desktop.
You can set it to another local directory.
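
For readers unfamiliar with the standard k-means method mentioned at the top, here is a minimal pure-Python sketch of it (Lloyd's algorithm). It is an illustration only and is not the code in /src/k_mean_cluster.py, which runs on Spark. The function name kmeans and its parameters (points, k, iterations, seed) are assumptions made for this example.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Cluster equal-length numeric tuples with Lloyd's algorithm.

    Hypothetical helper for illustration; returns (centroids, labels).
    """
    rng = random.Random(seed)
    # Pick k distinct input points as the starting centroids.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: give each point the index of its nearest
        # centroid (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        # Stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centers, assignment = kmeans(data, 2)
    print(centers)
    print(assignment)
```

The result depends on the random starting centroids, so different seeds can give different cluster numbering, and on harder datasets a different clustering.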