Advanced Database Technologies (Classification Assignment)
Author: Evert Duipmans
The goal of this project is to write a digit recognizer using SimpleCV and scikit-learn. The project runs in a Docker container in which the following are installed: Python 2.7, IPython/Jupyter, SimpleCV, NumPy, scikit-learn and Matplotlib.
Follow the steps below to get the container up and running.
Before running the container
Before installing and running the container it is important to perform the following steps:
- Install Docker on your platform.
- IMPORTANT: if you are running on Windows, add the drive on which this folder is located as a shared drive in the Docker settings.
Running the container
- Start the Docker container:
sudo docker run -d -p 54717:8888 -v $(pwd):/host -it saxion/adt-classification-ex3
- Open your web browser, go to http://localhost:54717 and check that the notebook works (password: adt)
- Test one of the demo scripts (in notebooks/examples)
This folder contains the following subfolders:
- Folder dataset-images: this folder contains all the images that should be processed in your script
- Folder dataset-numpy: after extracting features from the images, store the created datasets here (hint: use the np.save() function to store NumPy arrays as files)
- Folder classifier: after the training/testing phase, you can export the (best) trained model to a file here (hint: use the joblib.dump() function)
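As a minimal sketch of the np.save() hint above, the following stores a feature array and a label array and reads them back. The array contents, the file names features_v1.npy and labels_v1.npy, and the temporary directory (which stands in for the dataset-numpy folder) are placeholders chosen for illustration, not part of the assignment.

```python
import os
import tempfile

import numpy as np

# Placeholder data: 10 samples with 4 made-up feature values each.
features = np.arange(40, dtype=np.float64).reshape(10, 4)
labels = np.arange(10)  # hypothetical digit labels 0..9

# A temporary directory stands in for the dataset-numpy folder here.
dataset_dir = tempfile.mkdtemp()
features_path = os.path.join(dataset_dir, "features_v1.npy")  # hypothetical name
labels_path = os.path.join(dataset_dir, "labels_v1.npy")      # hypothetical name

# np.save writes a single array to a binary .npy file.
np.save(features_path, features)
np.save(labels_path, labels)

# np.load restores the array with the same shape and dtype.
loaded_features = np.load(features_path)
loaded_labels = np.load(labels_path)
print(loaded_features.shape, loaded_labels.shape)
```

Keeping features and labels in separate files with a shared suffix is one way to tell several datasets apart; any naming scheme that keeps the pairs together would work as well.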
Assignment in steps
- Extract features from the images and create 2 or more datasets (and export these datasets to the dataset-numpy folder).
- Analyse your data using numpy operations and matplotlib.
- Pre-process your features and train/test classifiers. Export the best classifier with joblib to the classifier folder.
- Build a recognizer that is able to process an image and ask the classifier for the outcome.
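The training, export and recognition steps above could be sketched roughly as follows. This is an illustration under several assumptions: the digits dataset bundled with scikit-learn stands in for the features you extract yourself with SimpleCV, StandardScaler and SVC are example choices rather than the required ones, the file name digit_classifier.joblib and the function recognize() are hypothetical, and a temporary directory stands in for the classifier folder. The import from sklearn.model_selection assumes scikit-learn 0.18 or newer; older releases provided train_test_split in sklearn.cross_validation.

```python
import os
import tempfile

import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

try:
    import joblib  # standalone package
except ImportError:
    from sklearn.externals import joblib  # bundled with older scikit-learn

# Stand-in dataset: in the assignment this would be loaded from dataset-numpy.
digits = load_digits()
X, y = digits.data, digits.target

# Quick analysis step: how many samples are there per digit?
print("samples per digit: {}".format(np.bincount(y)))

# Hold out part of the data so the classifier is tested on unseen samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# Pre-processing and classifier in one pipeline, so the same scaling
# is applied automatically at prediction time.
model = make_pipeline(StandardScaler(), SVC())
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print("test accuracy: {:.3f}".format(accuracy))

# Export the trained model (temporary directory instead of the classifier folder).
model_path = os.path.join(tempfile.mkdtemp(), "digit_classifier.joblib")
joblib.dump(model, model_path)


def recognize(feature_vector, classifier_path):
    """Load the exported classifier and predict the digit for one sample."""
    classifier = joblib.load(classifier_path)
    sample = np.asarray(feature_vector).reshape(1, -1)  # one row, n features
    return int(classifier.predict(sample)[0])


predicted_digit = recognize(X_test[0], model_path)
print("predicted: {}, actual: {}".format(predicted_digit, y_test[0]))
```

In the real recognizer, the feature vector passed to recognize() would have to be extracted from the image in exactly the same way as the features used for training.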