DataKit -- Orchestrate applications using a Git-like dataflow
DataKit is a tool to orchestrate applications using a Git-like dataflow. It
revisits the UNIX pipeline concept, with a modern twist: streams of
tree-structured data instead of raw text. DataKit allows you to define
complex build pipelines over version-controlled data.
There are several components in this repository:
- src contains the main DataKit service. This is a Git-like database to which other services can connect.
- ci contains DataKitCI, a continuous integration system that uses DataKit to monitor repositories and store build results.
- ci/self-ci is the CI configuration for DataKitCI that tests DataKit itself.
- bridge/github is a service that monitors repositories on GitHub and syncs their metadata with a DataKit database.
  For example, when a pull request is opened or updated, it commits that information to DataKit. If you commit a status message to DataKit, the bridge pushes it to GitHub.
- bridge/local is a drop-in replacement for bridge/github that just monitors a local Git repository. This is useful for local testing.
The easiest way to use DataKit is to start both the server and the client in containers.
To expose a Git repository as a 9p endpoint on port 5640 on a private network, run:
$ docker network create datakit-net # create a private network
$ docker run -it --net datakit-net --name datakit -v <path/to/git/repo>:/data datakit/db
The --name datakit option is mandatory. It allows the client
to connect to a known name on the private network.
You can then start a DataKit client, which will mount the 9p endpoint and
expose the database as a filesystem API:
# In another terminal
$ docker run -it --privileged --net datakit-net datakit/client
$ ls /db
branch remotes snapshots trees
The --privileged option is needed because the container has
to mount the 9p endpoint into its local filesystem.
Now you can explore, edit and script
/db. See the
for more details.
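As a starting point for scripting the mounted database, the sketch below lists the top-level entries of the mount together with their type. The function name list_db_entries is hypothetical and not part of DataKit; the only thing it assumes is that the database is mounted at /db, as in the client container above, and any other root can be passed as an argument.

```shell
#!/bin/sh
# Hypothetical helper, not part of DataKit: print each top-level entry of a
# mounted DataKit database with its type (dir or file), one per line.
# The root defaults to /db, the path listed in the client container.
list_db_entries() {
    root="${1:-/db}"
    for entry in "$root"/*; do
        # Skip the unexpanded glob pattern when the directory is empty.
        [ -e "$entry" ] || continue
        if [ -d "$entry" ]; then
            kind=dir
        else
            kind=file
        fi
        # Print the type and the entry name without its leading path.
        printf '%s\t%s\n' "$kind" "${entry##*/}"
    done
}
```

Inside the client container, calling list_db_entries with no argument inspects /db; the same loop can be pointed at a subdirectory to walk further down the tree.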
docker build -t datakit/db -f Dockerfile .
docker run -p 5640:5640 -it --rm datakit/db --listen-9p=tcp://0.0.0.0:5640
These commands will expose the database's 9p endpoint on port 5640.
$ make depends
$ make && make test
For information about command-line options:
$ datakit --help
Prometheus metric reporting
Pass --listen-prometheus 9090 to expose metrics on port 9090.
Note: there is no encryption and no access control. You are expected to run the
database in a container and to not export this port to the outside world. You
can either collect the metrics by running a Prometheus service in a container
on the same Docker network, or front the service with nginx or similar if you
want to collect metrics remotely.
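For the first option, a Prometheus container on the same Docker network needs a scrape configuration that points at the DataKit container. The sketch below generates a minimal one. It rests on several assumptions: the function name write_scrape_config is hypothetical, the DataKit container is named datakit and was started with --listen-prometheus 9090, and the metrics are served at the path Prometheus scrapes by default (set metrics_path in the generated file if that is not the case).

```shell
#!/bin/sh
# Hypothetical helper, not part of DataKit: write a minimal Prometheus
# scrape configuration for a single static target.
#   $1: host:port of the metrics listener, e.g. datakit:9090 (assumed name)
#   $2: path of the configuration file to write
write_scrape_config() {
    target="$1"
    out="$2"
    # The heredoc delimiter is unquoted so that $target is expanded.
    cat > "$out" <<EOF
scrape_configs:
  - job_name: datakit
    static_configs:
      - targets: ['$target']
EOF
}
```

A possible use is write_scrape_config datakit:9090 prometheus.yml, followed by starting a Prometheus container attached to datakit-net with that file mounted as its configuration. Because both containers share the private network, the metrics port never has to be published on the host.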
- Go bindings are in the
- OCaml bindings are in the
  See examples/ocaml-client for an example.
DataKit is licensed under the Apache License, Version 2.0. See
LICENSE for the full