Public Repository

Last pushed: a year ago
Short Description
PDF parsing demo based on the Cloudera QuickStart image.
Full Description

This image pulls public PDF documents from S3 and loads them into HBase for text mining.
The approach this image builds on is described on the Cloudera blog: http://blog.cloudera.com/blog/2015/10/how-to-index-scanned-pdfs-at-scale-using-fewer-than-50-lines-of-code/
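
The fetch-and-load flow described above can be sketched roughly as follows. This is an illustration only, not the code shipped in the image: the column names (`doc:raw`, `doc:source`), the hashed row key, and the `fetch`/`put` callables are all hypothetical stand-ins for whatever S3 client and HBase client are actually used.

```python
import hashlib
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical column names; the image's actual HBase schema is not documented here.
COL_RAW = b"doc:raw"
COL_SOURCE = b"doc:source"


def make_row(s3_key: str, pdf_bytes: bytes) -> Tuple[bytes, Dict[bytes, bytes]]:
    """Build an HBase-style (row key, columns) pair for one PDF."""
    # Hash the object key so row keys spread evenly instead of clustering by prefix.
    row_key = hashlib.sha256(s3_key.encode("utf-8")).hexdigest().encode("ascii")
    return row_key, {COL_RAW: pdf_bytes, COL_SOURCE: s3_key.encode("utf-8")}


def load_documents(
    keys: Iterable[str],
    fetch: Callable[[str], bytes],
    put: Callable[[bytes, Dict[bytes, bytes]], None],
) -> int:
    """Fetch each object and store the PDFs; return the number of rows written."""
    written = 0
    for key in keys:
        data = fetch(key)
        # Skip anything that is not a PDF (PDF files begin with b"%PDF-").
        if not data.startswith(b"%PDF-"):
            continue
        row_key, columns = make_row(key, data)
        put(row_key, columns)
        written += 1
    return written
```

In a real deployment, `fetch` might wrap an S3 download and `put` might wrap an HBase table write; passing them in as callables keeps the sketch runnable without either service.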

Owner
escvector
