Scraper Monitoring API

Service designed to collect success and error information from scrapers and collectors.

Usage

The API has the following endpoints:

  • / GET: Retrieves the running list of scraper statuses (see the example after the POST request below).
  • / POST: Stores a scraper status record. It needs the following arguments:
    • id: Scraper id. Each scraper should have a unique id.
    • status: Either error or ok.
    • message: A string with the message. Required when the status is error.
    • time: An ISO 8601 timestamp (to second precision).
    • datasets: An array of dataset ids (not the hashes).

Example request:

$ curl -X POST localhost:4000/ \
  -d "id=scraper-test&status=error&message=Failed to connect to API." \
  -d "time=2015-06-01T14:34:01&datasets=ebola-data,hospitals-dataset"
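
To retrieve the running list of statuses, issue a plain GET request. The response shape sketched in the comment below is an assumption based on the POST arguments above, not the service's documented output:

$ curl localhost:4000/
  # e.g. [{"id": "scraper-test", "status": "error", "message": "Failed to connect to API.",
  #        "time": "2015-06-01T14:34:01", "datasets": ["ebola-data", "hospitals-dataset"]}]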

Docker Setup

Review the Dockerfile and run the image linked to a MongoDB instance. make setup will try to set up its own collection (called scraper_status) in that instance. The image doesn't need a volume mounted, but it requires the following environment variables in order to work appropriately:

  • MONOGDB_SCRAPER_STATUS_USER_NAME: Dedicated user name for manipulating collections.
  • MONGODB_SCRAPER_STATUS_USER_PASSWORD: Password for the user above.

Those should be passed when running the image.

$ docker run -d --name scraper_status \
  --link mongo:mongo \
  -e MONOGDB_SCRAPER_STATUS_USER_NAME=foo \
  -e MONGODB_SCRAPER_STATUS_USER_PASSWORD=bar \
  luiscape/hdx-monitor-scraper-status:latest
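
The --link mongo:mongo flag assumes a MongoDB container named mongo is already running. A minimal sketch of starting one (image tag and options are assumptions, not taken from this repository):

$ docker run -d --name mongo mongo

With MongoDB up, make setup will create the scraper_status collection described above.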

Docker Pull Command

$ docker pull luiscape/hdx-monitor-scraper-status