Last pushed: 23 days ago
Short Description
Download EDGAR XBRL filings and expose the numerical data through a REST service
Full Description

Overview

Download EDGAR XBRL filings from https://www.sec.gov/edgar.shtml into the local file system or into a PostgreSQL database. The image is based on Alpine (https://hub.docker.com/_/alpine/) and takes up only 283 MB.

Parameters

Generic

  • driver: Class that executes and processes the download (e.g. ch.pschatzmann.edgar.dataload.DownloadProcessorXbrlFile or ch.pschatzmann.edgar.dataload.DownloadProcessorJDBC)
  • updateIndex: Whether the local index that drives the download should be updated from EDGAR (true or false)
  • formsRegex: Regular expression that selects the form types to process (e.g. 10-K.*)
  • minPeriod: Earliest period to process, in the format yyyy-mm (e.g. 2017-05)
  • indexFileName: Absolute path to the index file
  • timer: Number of minutes to wait before the processing of new data is re-triggered (e.g. 60)
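
As a rough illustration of how the generic parameters combine, the fragment below passes them as container environment variables, in the same way as the compose files further down. It is a sketch only: the values are the examples from the list above (the index path is the one used in the compose files), and pairing timer=60 with updateIndex=true to get an hourly refresh is an assumption about typical use, not a documented recommendation.

```yaml
# Sketch: generic parameters passed as environment variables.
# All values are examples from this page; adjust them to your setup.
environment:
  - driver=ch.pschatzmann.edgar.dataload.DownloadProcessorXbrlFile
  - updateIndex=true        # refresh the local index from EDGAR
  - formsRegex=10-K.*       # form types to select
  - minPeriod=2017-05       # earliest period, yyyy-mm
  - indexFileName=/usr/local/bin/SmartEdgar/data/index.csv
  - timer=60                # minutes until processing is re-triggered
```

The driver-specific parameters described in the next two sections go into the same environment list.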

Download as Files

  • destinationFolder: Local folder used as the download destination

Loading into PostgreSQL

  • jdbcDriver: JDBC driver class name (org.postgresql.Driver)
  • jdbcURL: JDBC connection string (e.g. jdbc:postgresql://db-edgar:5432/edgar)
  • jdbcUser: PostgreSQL user name
  • jdbcPassword: PostgreSQL password

Download as Files

The download into the local file system can be triggered with the following docker-compose.yml:

    version: '3.0'
    services:
      smart-edgar:
        image: pschatzmann/smart-edgar
        container_name: edgar-file
        environment:
          - updateIndex=true
          - formsRegex=10-Q.*|10-K.*
          - minPeriod=2017-05
          - driver=ch.pschatzmann.edgar.dataload.DownloadProcessorXbrlFile
          - indexFileName=/usr/local/bin/SmartEdgar/data/index.csv
          - destinationFolder=/usr/local/bin/SmartEdgar/data/
        volumes:
          - /srv/SmartEdgar:/usr/local/bin/SmartEdgar/data/

Download into Database

Only the numeric parameters are imported into the edgar database; all text and HTML parameters are ignored.
The values are stored in the values table, and the related company information is available in a separate company table. The following docker-compose.yml starts both the database and the loader:

    version: '3.0'
    services:
      db-edgar:
        image: postgres:9.4
        container_name: db-edgar
        restart: always
        environment:
          - POSTGRES_USER=edgar
          - POSTGRES_PASSWORD=edgar
        labels:
          - "traefik.enable=false"
        volumes:
          - /var/lib/postgresql/data
          - /backups:/backups
        ports:
          - "5433:5432"

      smart-edgar:
        image: pschatzmann/smart-edgar
        container_name: edgar-db
        restart: always
        environment:
          - updateIndex=true
          - formsRegex=10-K.*
          - minPeriod=2017-05
          - driver=ch.pschatzmann.edgar.dataload.DownloadProcessorJDBC
          - indexFileName=/usr/local/bin/SmartEdgar/data/index.csv
          - jdbcDriver=org.postgresql.Driver
          - jdbcURL=jdbc:postgresql://db-edgar:5432/edgar
          - jdbcUser=edgar
          - jdbcPassword=edgar
        links:
          - db-edgar
        volumes:
          - /srv/SmartEdgar:/usr/local/bin/SmartEdgar/data/
        ports:
          - "9997:9997"
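
The links entry above is a legacy Compose feature. A possible alternative, sketched here under the assumption that both services run on the default network that Compose creates (where containers reach each other by service name), is depends_on. Note that depends_on only controls start order: it does not wait until PostgreSQL accepts connections, and whether the loader retries a failed connection is not documented on this page. Also note that the jdbcURL uses the container port 5432, while the published host port 5433 is only relevant for clients connecting from outside Docker.

```yaml
# Sketch only: fragment of the smart-edgar service from the file above,
# with `links` replaced by `depends_on`. All other keys stay unchanged.
services:
  smart-edgar:
    image: pschatzmann/smart-edgar
    depends_on:
      - db-edgar   # start the database container first
    environment:
      # host = service name, port = container port (not the host port 5433)
      - jdbcURL=jdbc:postgresql://db-edgar:5432/edgar
```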

REST Service

After the data has been loaded into the database, the REST services are available on port 9997.

Owner
pschatzmann
