DocBleach aaS - Sanitize your documents in the cloud™
DocBleach allows you to sanitize your Word, Excel, PowerPoint, PDF and other documents. This repository contains the
DocBleach Web API, packaged as a Docker service. Two clicks and you'll feel safer.
You have files to sanitize but you can't install Java, whether because you run on an embedded system or because you deploy
your PHP app on shared hosting where Java is not available? This package is the solution to your issues!
Three modules are present in this repository:
- The Web API, a lightweight Python Flask app that receives the files and responds to status requests.
- worker contains DocBleach and launches it using Python Celery.
- autoscale provides auto-scaling for Mesos Marathon; its documentation still has to be written.
Thanks to Docker, you are able to easily deploy this app.
Celery requires a message broker and a backend to store its results.
Something as easy to set up as Redis is fine, as it can serve
both roles, but you may use another broker or another result backend.
Configuration is passed to each component through environment variables
such as CELERY_RESULT_BACKEND. They accept URIs as values, for instance
one pointing at a local Redis server protected by the password PASS123.
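As a sketch of such a value, the snippet below builds a Redis URI in the redis://:password@host:port/db form that Celery accepts and exports it. The host 127.0.0.1, the port 6379 (Redis' default) and the database number 0 are assumptions, not values stated in this README; adjust them to your setup.

```shell
# Assumed values: host 127.0.0.1, Redis' default port 6379, database 0.
REDIS_PASSWORD="PASS123"
REDIS_HOST="127.0.0.1"

# The user part before the colon is left empty; only the password is set.
export CELERY_RESULT_BACKEND="redis://:${REDIS_PASSWORD}@${REDIS_HOST}:6379/0"

# Print the resulting URI so it can be checked before starting the services.
echo "$CELERY_RESULT_BACKEND"
```

A password containing characters such as @ or / would need to be percent-encoded before being placed in the URI.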
To pass the files from the API to the worker, a storage backend is required.
Plik is used because it is easy to set up (unlike OpenStack/AWS), open source, and supports
file expiration out of the box.
By default, when using Docker Compose, an internal Plik instance is started and the sanitized files are stored on plik.root.gg.
You may change this using environment variables such as FINAL_PLIK_SERVER.
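A minimal sketch of such an override follows. It assumes the variable takes the base URL of a Plik server (the expected format is not stated in this README) and uses plik.example.org as a purely hypothetical host.

```shell
# Hypothetical host; the base-URL value format is an assumption.
export FINAL_PLIK_SERVER="https://plik.example.org"

# Show which server the sanitized files would be sent to.
echo "Sanitized files would be stored on: $FINAL_PLIK_SERVER"
```

How the value reaches the containers depends on the environment section of the Compose file, so check it before relying on an exported shell variable.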
To run the containers using Docker Compose, just call
docker-compose up -d in this project's directory.
Docker will take care of the boring stuff for you! :)
docbleach-rest$ docker-compose up -d
Starting docbleachrest_worker_1
Starting docbleachrest_plik_1
Starting docbleachrest_web_1
Starting docbleachrest_redis_1
$ docker ps
CONTAINER ID   IMAGE                     COMMAND                  CREATED              STATUS          PORTS                    NAMES
5d258cfac1d6   docbleach/api:latest      "/usr/src/app/entrypo"   About a minute ago   Up 41 seconds   0.0.0.0:9000->5000/tcp   docbleachrest_web_1
5c3d91acba01   docbleach/worker:latest   "/usr/src/app/entrypo"   About a minute ago   Up 41 seconds                            docbleachrest_worker_1
ebcef5f636d5   rootgg/plik               "/bin/sh -c ./plikd"     9 days ago           Up 41 seconds   8080/tcp                 docbleachrest_plik_1
eb69034e73f5   redis                     "docker-entrypoint.sh"   10 days ago          Up 41 seconds   6379/tcp                 docbleachrest_redis_1
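As you can see, DocBleach-api is running and published on a port of your machine. Open
127.0.0.1:5000 in your browser to try it out :wink:. If the PORTS column of docker ps shows a different host port (9000 in the listing above), use that one instead.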
Get the sources
git clone https://github.com/docbleach/DocBleach-Web.git
cd DocBleach-Web
Dockerfile makes it easy to hack on a part of this project.
You've developed a cool new feature? Fixed an annoying bug? We'd be happy
to hear from you!
The documentation uses Swagger and is thus rendered client-side from the project's API specifications.
- Contribute: https://github.com/docbleach/DocBleach-Web
- Report bugs: https://github.com/docbleach/DocBleach-Web/issues
- Get latest version: https://hub.docker.com/u/docbleach/
This project works, but would be greatly improved by a few small tweaks.
For instance, it would be really great to:
- Write more documentation, because there's never enough of it. :-(
- Have an improved design.
- Improve the API once the code base is revamped, to give extended output, allow configuration...
- Remove dependencies: having Java and Python code in the same Docker image is bad practice. For now, it works.