Public Repository

Last pushed: 2 years ago
Full Description

This Python code implements k-means clustering. It uses the standard k-means method when the input txt file is small and the k-mean+* method otherwise.
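As a rough illustration of the standard k-means method mentioned above, here is a minimal pure-Python sketch of the usual assign-and-update loop. It is not the code shipped in the image (which runs under Spark); the function names `kmeans` and `squared_distance`, the `max_iter` and `seed` parameters, and the random choice of initial centroids are all assumptions made for this example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100, seed=0):
    """Illustrative standard k-means on a list of equal-length tuples.

    Returns (centroids, assignments). Hypothetical helper, not the
    implementation inside the Docker image.
    """
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct input points chosen at random.
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Stop once no point changes cluster.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assignments


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (10.0, 10.0), (10.1, 9.9), (9.9, 10.2)]
    centers, labels = kmeans(data, 2)
    print(centers, labels)
```

The sketch keeps everything in memory, which is only reasonable for small inputs; a larger dataset is where a distributed approach such as the Spark job in this image would be expected to help.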

To run this code in a container, use:
docker run -itv yourdirectory:/mnt lanzhong/k-meancluster /opt/spark/bin/spark-submit /src/ filename k

yourdirectory : local directory, mounted at /mnt in the container, where the clustering results can be viewed
lanzhong/k-meancluster : the image name
filename : name of the txt file containing the dataset
k : the number of clusters
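For readers curious how a script might consume the two trailing arguments, the following is a small hypothetical sketch. The helper name `parse_args` and its error messages are made up for illustration; the actual script in the image may read its arguments differently.

```python
import sys


def parse_args(argv):
    """Return (filename, k) from a list shaped like [script, filename, k].

    Hypothetical helper for illustration only.
    """
    if len(argv) != 3:
        raise SystemExit("usage: <script> filename k")
    filename = argv[1]
    k = int(argv[2])  # k arrives as a string on the command line
    if k < 1:
        raise SystemExit("k must be a positive integer")
    return filename, k


if __name__ == "__main__":
    name, clusters = parse_args(sys.argv)
    print(f"dataset={name} k={clusters}")
```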

For example:
docker run -itv /Users/lanzhong/Desktop:/mnt lanzhong/k-meancluster /opt/spark/bin/spark-submit /src/ 'mickey1.txt' 3

In this example, the clustering visualization is stored in /Users/lanzhong/Desktop.
You can use any other local directory instead.
