
Short Description
Image feature extraction or prediction with pre-trained VGG16 deep neural network.
Full Description

VGG 16 inside Docker

Easily extract image features from various layers of VGG16 with this Docker image.
Or just use it in prediction mode to get labels for input images.

The Docker image contains a pre-trained VGG16 model along with scripts to load images from a directory and to extract features from them.

Run docker run -it --rm -v $DATA_DIR:/data -v $OUTPUT_DIR:/output dominicbreuker/vgg_docker:latest python /vgg_16/ to extract features from all images in $DATA_DIR.
Results will be written to $OUTPUT_DIR.

You can pass various arguments to the script:

  • --mode (-m) defines the kind of feature you want to generate. Four modes are available:
    • label: returns the prediction in plain English (e.g., 'ipod'). You can see a list of labels in /vgg_16/synset_words.txt.
    • softmax: returns a 1000-dim vector with class probabilities for each of the 1000 possible ImageNet labels.
    • dense: returns a 4096-dim vector extracted from the layer immediately before softmax.
    • convolutional: returns the output of the last convolutional layer.
  • --height (-hs) and --width (-ws) define the image size. The default is 256x256, and it can only be changed in convolutional mode. In all other modes, images must be resized to 256x256, as was done in the ImageNet competition.
  • --extension (-e) defines the file extension to look for in $DATA_DIR. Defaults to jpg. The script will process all files with the given extension anywhere in the file tree below $DATA_DIR.

Spelled out explicitly, the default arguments are as follows:
docker run -it --rm -v $DATA_DIR:/data -v $OUTPUT_DIR:/output dominicbreuker/vgg_docker:latest python /vgg_16/ -m label -hs 256 -ws 256 -e jpg
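The flags above map onto a small argparse interface. The parser below is a hypothetical reconstruction of the documented options (names and defaults taken from this description, not from the actual script inside the image):

```python
import argparse

# Hypothetical sketch of the documented CLI; the real script's parser
# may differ in details.
parser = argparse.ArgumentParser(description="VGG16 feature extraction")
parser.add_argument("-m", "--mode", default="label",
                    choices=["label", "softmax", "dense", "convolutional"],
                    help="kind of feature to generate")
parser.add_argument("-hs", "--height", type=int, default=256,
                    help="image height (only changeable in convolutional mode)")
parser.add_argument("-ws", "--width", type=int, default=256,
                    help="image width (only changeable in convolutional mode)")
parser.add_argument("-e", "--extension", default="jpg",
                    help="file extension to search for below $DATA_DIR")

# Example: request dense features, keep all other defaults
args = parser.parse_args(["-m", "dense"])
print(args.mode, args.height, args.width, args.extension)
```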

After running this script, you will find the following two files in $OUTPUT_DIR:

  • image_files_vgg16_label_256x256_<timestamp>.npz with a list of image file names (your IDs)
  • extractions_vgg16_label_256x256_<timestamp>.npz with a list of features
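Reading such a pair of .npz files back with numpy can be sketched as follows; the file names and data below are invented for illustration, and the point is only that row i of the extractions file belongs to row i of the image-files file:

```python
import numpy as np

# Made-up stand-ins for the two output files the script produces
ids = np.array(["img_001.jpg", "img_002.jpg"])
feats = np.random.rand(2, 4096)  # dense mode yields 4096-dim vectors

np.savez("image_files_example.npz", ids)
np.savez("extractions_example.npz", feats)

# Arrays saved positionally end up under the key "arr_0"
names = np.load("image_files_example.npz")["arr_0"]
features = np.load("extractions_example.npz")["arr_0"]

# Rows align: features[i] is the feature vector for names[i]
for name, vec in zip(names, features):
    print(name, vec.shape)
```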

VGG Background

VGG is a convolutional neural network pre-trained on the ImageNet dataset.
Read this paper for details regarding the model.
Or check out the ILSVRC 2014 results to see that it made 1st place in the competition.
Must be a good one ;)
You can use it to build powerful image processing tools by transferring the knowledge within the model to your application, as described here.

How it is built
The Docker image contains pre-trained weights taken from this Gist.
These weights are a direct transformation of the original authors' Caffe model.
They are stored in /weights/vgg16_weights_tensorflow.h5.

Image pre-processing

The script will pre-process images in the same way as was done when VGG16 was created.
The steps are subtraction of the per-channel mean pixel values of ImageNet (separately for each color channel) and cropping out the borders of the image (e.g., from a 256x256 image, only the center 224x224 pixels are retained).
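These two steps can be sketched in numpy. The per-channel mean values below are the commonly cited Caffe-style VGG means; treat them as an assumption here rather than values confirmed for this image:

```python
import numpy as np

# Assumed Caffe-style ImageNet means in R, G, B order
VGG_MEAN = np.array([123.68, 116.779, 103.939])

def preprocess(img):
    """Center-crop an RGB image to 224x224 and subtract per-channel means."""
    h, w, _ = img.shape
    top, left = (h - 224) // 2, (w - 224) // 2
    crop = img[top:top + 224, left:left + 224, :].astype(np.float64)
    return crop - VGG_MEAN  # broadcasts over the channel axis

img = np.zeros((256, 256, 3))  # stand-in for a resized 256x256 image
out = preprocess(img)
print(out.shape)  # (224, 224, 3)
```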

To see if you are using the weights correctly, check out /vgg_16/.
It predicts the top-5 class labels for each image matching /vgg_16/test_images/*.jpg.
This script is run during the Docker image build to verify predictions are reasonable.
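Turning a softmax vector into top-5 labels can be sketched as follows; the label list is invented here (the real one lives in /vgg_16/synset_words.txt), and the probabilities are random stand-ins:

```python
import numpy as np

# Invented stand-in for the 1000 ImageNet labels
labels = [f"class_{i}" for i in range(1000)]

# Random stand-in for a softmax output, normalized to sum to 1
probs = np.random.rand(1000)
probs /= probs.sum()

# Indices of the five highest probabilities, best first
top5 = np.argsort(probs)[::-1][:5]
for i in top5:
    print(labels[i], round(float(probs[i]), 4))
```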

Sources of test images:
