regulations-core

An engine that supplies the API that allows users to read regulations and their various layers.
An API library that provides an interface for storing and retrieving regulations,
layers, etc.

This repository is part of a larger project. To read about it, please see
http://eregs.github.io/.

Features

  • Search integration with Elastic Search or Django Haystack
  • Support for storage via Elastic Search or Django Models
  • Separation of API into a read and a write portion
  • Decomposition of regulations and layers into their components, allowing
    paragraph-level access
  • Schema checking for regulations

Requirements

This library requires

  • Python 2.7, 3.4, 3.5, or 3.6
  • Django 1.8, 1.9, 1.10, or 1.11

API Docs

regulations-core on Read The Docs

Local development

Tox

We use tox to test across multiple versions of Python
and Django. To run our tests, linters, and build our docs, you'll need to
install tox globally (Tox handles virtualenvs for us).

pip install tox
# If using pyenv, consider also installing tox-pyenv

Then, run tests and linting across available Python versions:

tox

To build docs, run:

tox -e docs

The output will be in docs/_build/dirhtml.

Running as an application

While this library is generally intended to be used within a larger project,
it can also be run as its own application via
Docker or a local Python install. In both cases,
we'll run in DEBUG mode using SQLite for data storage. We don't have a
turnkey solution for integrating this with search (though it can be
accomplished via a custom settings file).

To run via Docker,

docker build . -t eregs/core  # only needed after code changes
docker run -p 8080:8080 eregs/core

To run via local Python, run the following inside a
virtualenv:

pip install .
python manage.py migrate
python manage.py runserver 0.0.0.0:8080

In both cases, you can find the site locally at
http://localhost:8080/.

Apps included

This repository contains four Django apps: regcore, regcore_read,
regcore_write, and regcore_pgsql. The first contains shared models and
libraries. The "read" app provides read-only end-points while the "write" app
provides write-only end-points (see the next section for security
implications). We recommend using regcore.urls as your url router, in which
case turning read/write capabilities on or off is as simple as including the
appropriate applications in your Django settings file. The final app,
regcore_pgsql, contains all of the modules related to running with a
Postgres-based search index. Note that you will always need regcore
installed.
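As a sketch, a public-facing, read-only deployment's settings might look like the following (the surrounding app list is abbreviated and illustrative; adjust to your project):

```python
# Sketch of a read-only deployment's Django settings.
# regcore is always required; omit regcore_write on public-facing sites.
INSTALLED_APPS = [
    'django.contrib.contenttypes',
    'regcore',
    'regcore_read',
    # 'regcore_write',  # enable only on internal, write-capable deployments
]
ROOT_URLCONF = 'regcore.urls'  # routes only the end-points of installed apps
```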

Security

Note that regcore_write is designed to only be active inside an
organization; the assumption is that data will be pushed to public facing,
read-only (i.e. without regcore_write) sites separately.

When using the Elastic Search backend, data is passed as JSON, preventing
SQL-like injections. When using Haystack, data is stored via Django's model
framework, which escapes SQL before it reaches the database.

All data types require JSON input (which is checked). The regulation type
has an additional schema check, which is currently not present for other
data types. Again, this liability is limited by the segmentation of read and
write end points.

As all data is assumed to be publicly visible, data is not encrypted before
it is sent to the storage engine. Data may be compressed, however.

Be sure to override the default SECRET_KEY and to turn DEBUG off in your
local_settings.py.

Storage Backends

This project allows multiple backends for storing, retrieving, and searching
data. The default settings file uses Django models for data storage and
Haystack for search, but Elastic Search (1.7) or Postgres can be used instead.

Django Models For Data, Haystack For Search

This is the default configuration. You will need to have Haystack installed,
along with one of its backends.
In your settings file, use:

BACKENDS = {
    'regulations': 'regcore.db.django_models.DMRegulations',
    'layers': 'regcore.db.django_models.DMLayers',
    'notices': 'regcore.db.django_models.DMNotices',
    'diffs': 'regcore.db.django_models.DMDiffs'
}
SEARCH_HANDLER = 'regcore_read.views.haystack_search.search'

You will need to migrate the database (manage.py migrate) to get started and
rebuild the search index (manage.py rebuild_index) after adding documents.

Django Models For Data, Postgres For Search

If running Django 1.10 or greater, you may skip Haystack and rely
exclusively on Postgres for search. The current search index only indexes at
the CFR section level. Install psycopg (e.g. via pip install regcore[backend-pgsql]) and use the following settings:

BACKENDS = {
    'regulations': 'regcore.db.django_models.DMRegulations',
    'layers': 'regcore.db.django_models.DMLayers',
    'notices': 'regcore.db.django_models.DMNotices',
    'diffs': 'regcore.db.django_models.DMDiffs'
}
SEARCH_HANDLER = 'regcore_pgsql.views.search'
APPS.append('regcore_pgsql')

You may wish to extend the regcore.settings.pgsql module for simplicity.

You will need to migrate the database (manage.py migrate) to get started and
rebuild the search index (manage.py rebuild_pgsql_index) after adding
documents.

Elastic Search For Data and Search

If pyelasticsearch is installed (e.g. through pip install regcore[backend-elastic]), you can use Elastic Search (1.7) for both data
storage and search. Add the following to your settings file:

BACKENDS = {
    'regulations': 'regcore.db.es.ESRegulations',
    'layers': 'regcore.db.es.ESLayers',
    'notices': 'regcore.db.es.ESNotices',
    'diffs': 'regcore.db.es.ESDiffs'
}
SEARCH_HANDLER = 'regcore_read.views.es_search.search'

You may wish to extend the regcore.settings.elastic module for simplicity.

Settings

While we provide sane default settings in regcore/settings/base.py, we
recommend these defaults be overridden as needed in a local_settings.py file.

If using Elastic Search, you will need to let the application know how to
connect to the search servers.

  • ELASTIC_SEARCH_URLS - a list of strings which define how to connect
    to your search server(s). This is passed along to pyelasticsearch.
  • ELASTIC_SEARCH_INDEX - the index to be used by Elastic Search. This
    defaults to 'eregs'.
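For example, in your settings file (the hostname here is a placeholder):

```python
# Example Elastic Search connection settings (placeholder host).
ELASTIC_SEARCH_URLS = ['http://localhost:9200/']
ELASTIC_SEARCH_INDEX = 'eregs'
```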

The BACKENDS setting (as described above) must be a dictionary mapping the
appropriate model names ('regulations', 'layers', etc.) to the associated
backend classes. Backends can be mixed and matched, though we're not aware
of a compelling use case for doing so.
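Conceptually, each dotted-path string in BACKENDS resolves to a backend class at startup. A minimal sketch of that kind of lookup (illustrative only, not regcore's actual loader):

```python
from importlib import import_module

def load_backend(dotted_path):
    """Resolve a dotted 'package.module.ClassName' string to the class it names."""
    module_path, _, class_name = dotted_path.rpartition('.')
    return getattr(import_module(module_path), class_name)

# A stdlib class stands in here for a regcore backend class:
backend_cls = load_backend('collections.OrderedDict')
```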

All standard Django and haystack settings are also available; you will likely
want to override DATABASES, HAYSTACK_CONNECTIONS, DEBUG and certainly
SECRET_KEY.
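A minimal local_settings.py might look like the following sketch (all values are placeholders to replace):

```python
# Sketch of a local_settings.py override (placeholder values).
DEBUG = False
SECRET_KEY = 'replace-with-a-long-random-string'  # never ship the default
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'eregs',
    }
}
```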

Importing Data

Via the eregs parser

The eregs script (see
regulations-parser) includes
subcommands which will write processed data to a running API. Notably, if
write_to (the last step of the pipeline) is directed at a target beginning
with http:// or https://, it will write the relevant data to that host.
Note that HTTP authentication can be encoded within these URLs (e.g.
http://user:password@localhost:8000/). For example, if the API is running
on the localhost, port 8000, you could run:

$ eregs write_to http://localhost:8000/

See the command line
docs for
more detail.

Via the import_docs Django command

If you've already exported data from the parser, you may import it from the
command line with the import_docs Django management command. It should be
given the root directory of the data as its only parameter. Note that this
does not require a running API.

$ ls /path/to/data-root
diff  layer  notice  regulation
$ python manage.py import_docs /path/to/data-root

Via curl

You may also simulate sending data to a running API via curl, if you've
exported data from the parser. For example, if the API is running on the
localhost, port 8000, you could run:

$ cd /path/to/data-root
$ ls
diff  layer  notice  regulation
$ for TAIL in $(find */* -type f | sort -r); do
    curl -X PUT "http://localhost:8000/$TAIL" -d "@$TAIL"
done