Docker Image for automated neo4j backups to S3.
NB. As of 2016-05-20, this is a work in progress.
- Backing up and restoring - instructions for running a backup, and running a restore.
- Install Go, fleetctl and IntelliJ.
- Clone this repository.
- Open the project up in IntelliJ.
- Set up an SSH tunnel with a dynamic forwarding rule on port 1080.
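The dynamic forwarding rule above can be opened with OpenSSH's `-D` flag; the user and jump-host names here are placeholders, not real infrastructure:

```shell
# Open a SOCKS proxy on localhost:1080 via a jump host (hostname is hypothetical).
# -D 1080: dynamic application-level port forwarding (SOCKS)
# -N: do not run a remote command, just forward ports
ssh -D 1080 -N user@jumpbox.example.com
```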
Build and run:
go build
./coco-neo4j-backup --socksProxy localhost:1080
Testing that everything builds ok:
docker build -t $(basename $PWD) .
Releasing a new version:
Tag the release according to semantic versioning principles:
git tag 0.x.0
git push --tags
Check that Docker Hub built it ok: https://hub.docker.com/r/coco/coco-neo4j-backup/builds/
- Update the version in `services.yaml` via a branch/PR.
- Wait for the deployer to deploy the service.
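Putting the release steps together, a typical release might look like this (the version number is illustrative):

```shell
# Tag the release following semantic versioning principles,
# then push the tag so Docker Hub picks it up and builds the image.
git tag 0.3.0
git push --tags
```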
The items below may be worth implementing at some point, perhaps when we start "backup 2.0", if/when we start using hot backups with Neo4j Enterprise.
- Shamelessly plagiarise `neo4j-backup.timer`, for scheduled backups.
- Upload backups into a folder inside the bucket in a format something like
- Write a health check.
- Lock down the version in services.yaml to a specific tag. DONE
- Write more tests. Always more tests.
- Print a link to the backup archive in S3.
- Check CPU usage, then see if using an LZ4 compressor reduces CPU usage (potentially at the cost of a larger backup file).
- Switch to using a library like `env-decode` for much simpler parsing of environment variables, without needing CLI params, which are unnecessary for most apps.
- Make it possible to back up red or blue (rather than just red). (Requested by Scott 2016-07-15)
TODO items that will probably no longer be necessary once we have hot neo4j backups
- Put `ionice` in front of the `nice rsync` statement, to further reduce resource usage
(suggested by martingartonft on 2016-07-11). NB. this will no longer be necessary once we are doing
hot backups, although we might want to run the entire service under a low process priority.
- Stop and start the deployer programmatically, to avoid neo4j being accidentally started up during a backup.
It would be wise to restart `deployer.service` programmatically immediately after neo4j is started back up, rather than doing it manually
after the backup process is complete, which will keep the deployer "outage" much shorter
(added by [duffj](https://github.com/duffj) on 2016-07-12).
- Shut down neo4j's dependencies.
- Start up neo4j's dependencies.
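The dependency shutdown/startup steps above could be driven through `fleetctl`, something like the following; the unit names are assumptions, not the cluster's real units:

```shell
# Stop units that depend on neo4j before the backup starts
# (unit names are hypothetical examples).
fleetctl stop deployer.service
fleetctl stop neo4j-dependent-app.service

# ... run the backup here ...

# Bring everything back up once the backup is complete.
fleetctl start neo4j-dependent-app.service
fleetctl start deployer.service
```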
Potential Bugs and Known Issues
- (MINOR) Fix the `startTime` log bug, where the start time is not displayed properly.
- (MINOR) When deployer is still active, you get two very similar-looking error messages in the log.
- The final log message seems to be `Started Job to backup neo4j DB data files to S3.`, which comes after the message
`Backup process complete.` and is generated by systemd rather than the application itself. It's confusing; if we can fix it, that would be a good thing.
- [2016-07-15] When taking a backup from prod-us today, the process seems to have been restarted halfway through by systemd. Not sure what happened but it might happen again.
Ideas for automated tests
- A test that instantiates neo4j, writes some simple data, backs it up, restores it, and tests that it still works as desired.
Notes and Questions
This thing has to run on the same box as neo4j, right? Is that possible/easy to do in a container-based world?
- A: Actually, it'll run in its own container and mount the neo4j volume to access the files.
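For illustration, running the backup container with the host's neo4j volume mounted might look like this; the host path is an assumption based on the `/vol` note in this README, and the exact flags may differ:

```shell
# Mount the host's neo4j data volume into the backup container read-only
# (paths are illustrative; the real mount point may differ).
docker run --rm -v /vol:/vol:ro coco/coco-neo4j-backup
```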
Why does my IntelliJ build fail when I try to access a function in another file in the same directory? See below:
$ go build -o "/private/var/folders/rt/8c3952t54cd5q7x08z4m6j5m0000gn/T/Build main.go and rungo" /Users/dafydd/dev/go/src/github.com/Financial-Times/coco-neo4j-backup/main.go
src/github.com/Financial-Times/coco-neo4j-backup/main.go:24: undefined: backup
- A: To fix this problem, change the working directory for the run configuration to be the home directory of the project.
This service needs access to the neo4j file system. It therefore relies on the `/vol` partition being present on the host machine,
so that it can be mounted into the container for the `rsync` process. The original plan was