DocBleach - autoscaling system

DocBleach aaS - Sanitize your documents in the cloud™

DocBleach allows you to sanitize your Word, Excel, PowerPoint, PDF, ... documents. This repository contains the
DocBleach Web API, packaged as a Docker service. Two clicks and you'll feel safer.

Do you have files to sanitize but can't install Java, whether because you run on an embedded system or because
your PHP app is deployed on shared hosting where Java is not available? This package is the solution to your issues!


Three modules are present in this repository:

  • api contains the Web API, a lightweight Python Flask app that
    receives the files and responds to status requests.
  • worker contains DocBleach and launches it using Python Celery.
  • autoscale provides auto-scaling for Mesos Marathon; its documentation still has to be written.

Thanks to Docker, you are able to easily deploy this app.

Celery requires a message broker and a backend to store its results.
Something as easy to set up as Redis is fine, as it is able to serve
both roles, but you may use another broker or another result backend.
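
To illustrate the two roles, here is a minimal standard-library sketch, not Celery itself: a `queue.Queue` stands in for the message broker and a dictionary stands in for the result backend. All names (`submit`, `worker_loop`, `status`) are hypothetical, and the worker only marks a task as done instead of running DocBleach.

```python
import queue
import threading
import uuid

broker = queue.Queue()  # stand-in for the message broker (carries tasks)
results = {}            # stand-in for the result backend (stores task states)


def submit(filename):
    """API side: enqueue a task and return its id without waiting."""
    task_id = str(uuid.uuid4())
    results[task_id] = {"status": "PENDING"}
    broker.put((task_id, filename))
    return task_id


def worker_loop():
    """Worker side: take tasks off the broker and record their outcome."""
    while True:
        item = broker.get()
        if item is None:  # sentinel: stop the worker
            break
        task_id, filename = item
        # A real worker would sanitize the file here; this sketch only
        # records that the task was handled.
        results[task_id] = {"status": "SUCCESS", "file": filename}


def status(task_id):
    """API side: answer a status request from the result store."""
    return results.get(task_id, {"status": "UNKNOWN"})["status"]


def demo():
    """Run one task through the queue and return its final status."""
    worker = threading.Thread(target=worker_loop)
    worker.start()
    task_id = submit("report.docx")
    broker.put(None)  # ask the worker to stop after the queued task
    worker.join()
    return status(task_id)
```

Redis can play both parts because it offers list-like queues for the broker role and key/value storage for the result role.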

Configuration is passed to each component through environment variables.
For instance, to use a local Redis server protected by the
password PASS123, you would use this:
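
A minimal sketch of such a configuration, assuming the `redis://:password@host:port/db` URL form that Celery accepts for Redis. The variable names `CELERY_BROKER` and `CELERY_RESULT_BACKEND` and both helper functions are hypothetical; check each component for the names it actually reads.

```python
from urllib.parse import quote


def redis_url(password, host="localhost", port=6379, db=0):
    """Build a Redis URL; the user part is empty, hence the leading colon."""
    # Percent-encode the password so characters like '@' or '/' stay valid.
    return f"redis://:{quote(password, safe='')}@{host}:{port}/{db}"


def celery_env(password):
    """Return environment variables pointing broker and backend at Redis.

    The variable names are assumptions made for this example.
    """
    url = redis_url(password)
    return {"CELERY_BROKER": url, "CELERY_RESULT_BACKEND": url}


def as_env_file(env):
    """Render the mapping as KEY=value lines, one per variable."""
    return "\n".join(f"{key}={value}" for key, value in env.items())
```

Calling `as_env_file(celery_env("PASS123"))` yields two lines that both carry the URL `redis://:PASS123@localhost:6379/0`.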

To pass the files from the API to the worker, a storage backend is required.
Plik is used because it is easy to set up (unlike OpenStack/AWS), open source, and supports file expiration out of the box.

By default, when using Docker Compose, an internal Plik instance is started and the sanitized files are stored on

You may change this using the INTERNAL_PLIK_SERVER and FINAL_PLIK_SERVER environment variables.
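
As a hedged sketch, the two variables could be set for a Docker Compose run like this. It assumes the project's compose file forwards both variables to the containers; the helper names `compose_env` and `start` are hypothetical.

```python
import os
import subprocess


def compose_env(internal=None, final=None, base=None):
    """Return a copy of the environment with the Plik overrides applied."""
    env = dict(os.environ if base is None else base)
    if internal is not None:
        env["INTERNAL_PLIK_SERVER"] = internal
    if final is not None:
        env["FINAL_PLIK_SERVER"] = final
    return env


def start(env):
    """Run `docker-compose up -d` in the current directory with `env`."""
    return subprocess.run(["docker-compose", "up", "-d"], env=env, check=True)
```

For example, `start(compose_env(final="https://plik.example.org"))` would keep the internal instance and change only where sanitized files end up; the URL is a placeholder.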


To run the containers using Docker Compose, just call docker-compose up -d in this project's directory.
Docker will take care of the boring stuff for you! :)

$ docker-compose up -d
  Starting docbleachrest_worker_1
  Starting docbleachrest_plik_1
  Starting docbleachrest_web_1
  Starting docbleachrest_redis_1
$ docker ps
CONTAINER ID        IMAGE                     COMMAND                  CREATED              STATUS              PORTS                    NAMES
5d258cfac1d6        docbleach/api:latest      "/usr/src/app/entrypo"   About a minute ago   Up 41 seconds       0.0.0.0:5000->5000/tcp   docbleachrest_web_1
5c3d91acba01        docbleach/worker:latest   "/usr/src/app/entrypo"   About a minute ago   Up 41 seconds                                docbleachrest_worker_1
ebcef5f636d5        rootgg/plik               "/bin/sh -c ./plikd"     9 days ago           Up 41 seconds       8080/tcp                 docbleachrest_plik_1
eb69034e73f5        redis                     ""                       10 days ago          Up 41 seconds       6379/tcp                 docbleachrest_redis_1

As you can see, DocBleach-api is running on your port 5000.
Just open it in your browser to try it out ;)
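
If you prefer to check from a script that the API port is reachable, here is a minimal sketch. It assumes the containers run on the same machine (`localhost` is an assumption; port 5000 comes from the output above), and the function name `is_port_open` is hypothetical.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False
```

`is_port_open("localhost", 5000)` should return `True` once the web container accepts connections. It only tests the TCP port and says nothing about whether the API itself answers correctly.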

Get the sources

    git clone
    cd DocBleach-Web 

A Dockerfile makes it easy to hack on a part of this project.

Have you developed a cool new feature? Fixed an annoying bug? We'd be happy
to hear from you!

Documentation uses Swagger, and thus is rendered client-side using the specifications in

Related links



Project status

This project works, but would be greatly improved with a few small tweaks.

For instance, it would be really great to:

  • Write more documentation, because there's never enough of it. :-(
  • Have an improved design.
  • Improve the API once the code base is revamped, to give an extended output, allow configuration...
  • Remove dependencies: having both Java and Python code in the same Dockerfile is bad practice. For now, it works.