Public | Automated Build

Last pushed: 2 years ago
Short Description
Full Description

PyPDFOCR on Docker

get rid of your paperwork...

This Docker image is based on the official Ubuntu base image.

How to use this image
docker run --rm andre74/pypdfocr_nl [-h] [-d] [-v] [-m] [-l LANG] [--preprocess]
[--skip-preprocess] [-w WATCH_DIR] [-f] [-c CONFIGFILE] [-e] [-n] [pdf_filename]
Case 1: Single Document
docker run -v ~/:/media --rm andre74/pypdfocr_nl /media/filename.pdf
--> reads filename.pdf from your Home directory, filename_ocr.pdf will be generated

Case 2 : Watch folder
docker run -v ~/Documents/Paper:/media --rm andre74/pypdfocr_nl -w /media -f -c /media/config.yaml
For sample config see config.yaml or pypdfocr authors repository here.

docker run --rm andre74/pypdfocr_nl [-h] [-d] [-v] [-m] [-l LANG] [--preprocess]
[--skip-preprocess] [-w WATCH_DIR] [-f] [-c CONFIGFILE] [-e]
Interactive Shell

docker run --entrypoint=/bin/bash -t -i andre74/pypdfocr_nl
How i use this image
I use Scanner Pro on iOS (scanbot on Android) to scan and upload documents to a WebDAV folder without OCR
The WebDAV folder is hosted on my Synology DiskStation NAS via HTTPS and shared between devices with CloudStation
I run this PyPDFOCR on Docker manually on Mac OS X or hosted on a local server
This way my personal documents don't have to leave my hardware or network aka personal cloud.

Docker Pull Command
Source Repository