Short Description

Intel's Deep Learning Deployment Toolkit

Full Description

With the Deep Learning Deployment Toolkit you can:

- Optimize trained deep learning networks through model compression and weight quantization tailored to the characteristics of end-point devices
- Deliver a unified API to integrate inference with application logic

The Deep Learning Deployment Toolkit comprises two main components: Model Optimizer and Inference Engine.

Model Optimizer is a cross-platform command line tool that performs static model analysis and adjusts deep learning models for optimal execution on end-point target devices:

- Takes as input a trained network, which contains the network topology, parameters, and adjusted weights and biases. The input network is produced with the Caffe* framework
- Performs horizontal and vertical fusion of the network layers
- Prunes unused branches in the network
- Applies weights compression methods
- Produces as output an Intermediate Representation (IR) of the network - a pair of files that describe the whole model:
  - Topology file - an XML file that describes the network topology
  - Trained data file - a .bin file that contains the weights and biases binary data

The produced IR is used as the input for the Inference Engine.
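Since the topology file is plain XML, it can be inspected with standard tooling. The sketch below parses a hypothetical minimal topology file with Python's standard library; the layer names and attributes shown are illustrative assumptions, as the exact IR schema varies by toolkit version.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal topology file; the real IR schema
# (attributes, layer types, versions) varies by release.
ir_xml = """
<net name="example" version="2" batch="1">
  <layers>
    <layer id="0" name="data" type="Input" precision="FP32"/>
    <layer id="1" name="conv1" type="Convolution" precision="FP32"/>
    <layer id="2" name="relu1" type="ReLU" precision="FP32"/>
  </layers>
  <edges>
    <edge from-layer="0" from-port="0" to-layer="1" to-port="0"/>
    <edge from-layer="1" from-port="1" to-layer="2" to-port="0"/>
  </edges>
</net>
"""

def list_layers(xml_text):
    """Return (name, type) for every layer in a topology file."""
    root = ET.fromstring(xml_text)
    return [(layer.get("name"), layer.get("type"))
            for layer in root.iter("layer")]

print(list_layers(ir_xml))
# → [('data', 'Input'), ('conv1', 'Convolution'), ('relu1', 'ReLU')]
```

The weights referenced by the topology live in the companion .bin file as raw binary data, so the XML alone is enough to inspect the network structure.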

Model Optimizer developer guide: https://software.intel.com/en-us/model-optimizer-devguide-introducing-deep-learning-model-optimizer

Inference Engine is a runtime that delivers a unified API to integrate inference with application logic:

- Takes as input an IR produced by Model Optimizer
- Optimizes inference execution for target hardware
- Delivers an inference solution with a reduced footprint on embedded inference platforms
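The unified API means the same application code can target different devices by changing a device name string. A minimal C++ sketch, assuming the Inference Engine Core API from later releases of the toolkit (the exact headers and entry points may differ in the version shipped in this image):

```cpp
#include <inference_engine.hpp>  // assumption: Inference Engine headers are on the include path

int main() {
    // Read the IR pair (topology + weights) produced by Model Optimizer.
    InferenceEngine::Core core;
    auto network = core.ReadNetwork("model.xml", "model.bin");

    // Compile the network for a target device; the same call works for
    // "CPU", "GPU", and other plugins -- this is the unified-API idea.
    auto executable = core.LoadNetwork(network, "CPU");

    // Create an inference request and run a synchronous inference.
    auto request = executable.CreateInferRequest();
    request.Infer();
    return 0;
}
```

Input and output blobs would be set on the request before `Infer()`; they are omitted here to keep the device-portability pattern in focus.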

Inference Engine developer guide: https://software.intel.com/en-us/inference-engine-devguide-using-deep-learning-inference-engine

Owner

intelcorp