data-profiler is used to automate the analysis and modeling of an operator for a variety of datasets


data-profiler is a Go project that transforms a set of datasets according to a set of characteristics (distribution similarity, correlation, etc.) in order to model, using Machine Learning techniques, the behavior of an operator applied on top of them.



You have two ways of installing data-profiler:

  1. Through Go:
# GOPATH must be set
~> go get
  2. Using Docker:
~> docker pull ggian/data-profiler


data-profiler can be used both through a CLI and a Web interface.

  1. CLI

You can access the CLI client through the data-profiler-utils binary.

~> $GOPATH/bin/data-profiler-utils

The command above prints an overview of the available actions.

Note: use this client only if you know how data-profiler works.

  2. Web UI

First run the Docker container, providing a directory with the dataset files.

~> docker run -v /src/datasets:/datasets -p 8080:8080 -d ggian/data-profiler

This command mounts the host's /src/datasets directory at /datasets inside the container and forwards the host's port 8080 to port 8080 of the container. Once the container has started successfully, go to http://dockerhost:8080 (where dockerhost is the address of the machine running Docker) and insert the first set of datasets for analysis.


Apache License v2.0 (see the LICENSE file for details)


Giannis Giannakopoulos
