Last pushed: 2 years ago
Full Description

This API wraps the Java boilerpipe library in an HTTP service that extracts the raw article text from HTML pages.

Usage

There are two ways to use the API. You can pass either a URL or raw HTML in the JSON body of a POST request to /extract:

curl -X POST http://localhost:3000/extract -H "Content-Type: application/json" -d '
  {
    "url": "http://techcrunch.com/2014/07/07/matterport-16m-dcm/"
  }
'

curl -X POST http://localhost:3000/extract -H "Content-Type: application/json" -d '
  {
    "html": "YOUR HTML CODE HERE"
  }
'
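
The same two calls can be made from a script. The following is a minimal Python sketch, not part of the project itself: the helper names build_request and extract are hypothetical, and the host and port are assumed to match the curl examples above. The response format is not documented here, so the sketch returns the response body as plain text without parsing it.

```python
import json
import urllib.request

# Assumption: the API is reachable at the address used in the curl examples.
API_URL = "http://localhost:3000/extract"


def build_request(url=None, html=None):
    """Build a POST request whose JSON body holds exactly one of "url" or "html"."""
    if (url is None) == (html is None):
        raise ValueError("pass exactly one of url or html")
    payload = {"url": url} if url is not None else {"html": html}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract(url=None, html=None, timeout=30):
    """Send the request and return the response body as text, unparsed."""
    request = build_request(url=url, html=html)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the service running, extract(url="http://techcrunch.com/2014/07/07/matterport-16m-dcm/") would mirror the first curl call, and extract(html="YOUR HTML CODE HERE") the second. Using json.dumps for the body takes care of escaping quotes and newlines in raw HTML, which is easy to get wrong when pasting HTML into a shell string.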

Running

The easiest way to run the API is with Docker. A published image is available on Docker Hub as blikk/boilerpipe-api.

Owner
tasubo

Comments (1)
stuffo
a year ago

I tried to deploy this in our Mesos cluster and it didn't work. I also tried to deploy it locally on my machine, with the same result. It looks like the process is not starting: running the container in interactive mode printed nothing to the console.