What is Apache Bookkeeper?
Apache Bookkeeper is a software project of the Apache Software Foundation, providing a replicated log service which can be used to build replicated state machines. A log contains a sequence of events which can be applied to a state machine. BookKeeper guarantees that each replica state machine will see all the same entries, in the same order.
How to use this image
Bookkeeper needs Zookeeper in order to preserve its state and publish its bookies (Bookkeeper servers). The client only need to connect to a Zookeeper server in the ensamble in order to obtain the list of Bookkeeper servers.
standalone BookKeeper cluster
Just like running a BookKeeper cluster in one machine(http://bookkeeper.apache.org/docs/latest/getting-started/run-locally/), you can run a standalone BookKeeper in one docker container, the command is:
docker run -it \ --env JAVA_HOME=/usr/lib/jvm/jre-1.8.0 \ --entrypoint "/bin/bash" \ apache/bookkeeper \ -c "/opt/bookkeeper/bin/bookkeeper localbookie 3"
Note: you can first start the container, and then execute "bin/bookkeeper localbookie 3" in the container.
After that, you can execute BookKeeper shell command(http://bookkeeper.apache.org/docs/latest/reference/cli/) to test the cluster, you need first log into the container, use command below:
docker exec -it <container id or name> bash
then run test command, such as:
./bin/bookkeeper shell listbookies -rw ./bin/bookkeeper shell simpletest
TL;DR -- BookKeeper cluster
If you want to setup cluster, you can play with Makefile or docker-compose hosted in this project and check its targets for a fairly complex set up example:
git clone https://github.com/apache/bookkeeper cd bookkeeper/docker docker-compose up -d
and it spawns and ensemble of 3 bookies with 1 dice:
bookie1_1 | 2017-12-08 23:18:11,315 - INFO - [bookie-io-1:BookieRequestHandler@51] - Channel connected [id: 0x405d690e, L:/172.19.0.3:3181 - R:/172.19.0.6:34922] bookie2_1 | 2017-12-08 23:18:11,326 - INFO - [bookie-io-1:BookieRequestHandler@51] - Channel connected [id: 0x7fa8645d, L:/172.19.0.4:3181 - R:/172.19.0.6:38862] dice_1 | Value = 1, epoch = 5, leading dice_1 | Value = 2, epoch = 5, leading dice_1 | Value = 4, epoch = 5, leading dice_1 | Value = 3, epoch = 5, leading
If you want to see how it behaves with more dices you only need to use
docker-compose up -d --scale dice=3:
dice_3 | Value = 3, epoch = 5, following dice_2 | Value = 3, epoch = 5, following dice_1 | Value = 2, epoch = 5, leading dice_3 | Value = 2, epoch = 5, following dice_2 | Value = 2, epoch = 5, following dice_1 | Value = 1, epoch = 5, leading dice_3 | Value = 2, epoch = 5, following dice_2 | Value = 2, epoch = 5, following
You can scale the numbers of bookkeepers too selecting one of the bookies and
docker-compose up -d --scale bookie1=3
Remember to shutdown the docker-compose service with
docker-compose down to
remove the containers and avoid errors with leftovers in next executions of the
git clone https://github.com/apache/bookkeeper cd bookkeeper/docker make run-demo
While, if you don't have access to a X environment, e.g. on default MacOS, It has to run the last command manually in 6 terminals respectively.
make run-zk make run-bk BOOKIE=1 make run-bk BOOKIE=2 make run-bk BOOKIE=3 make run-dice make run-dice
This will do all the following steps and start up a working ensemble with two dice applications.
Note: If you want to restart from scratch the cluster, please remove all its data using
sudo rm -rf /tmp/test_bk, else the Zookeeper connection maybe fail.
Step by step
The simplest way to let Bookkeeper servers publish themselves with a name, which could be resolved consistently across container runs, is through creation of a docker network:
docker network create "bk_network"
Then we can start a Zookeeper (from Zookeeper official image) server in standalone mode on that network:
docker run -d \ --network "bk_network" \ --name "test_zookeeper" \ --hostname "test_zookeeper" \ zookeeper
And initialize the metadata store that bookies will use to store information (This step is necessary when we setup the BookKeeper cluster first time, but is not necessary when using our current BookKeeper Docker image, because we have done this work when we start the first bookie):
docker run -it --rm \ --network "bk_network" \ --env BK_zkServers=test_zookeeper:2181 \ apache/bookkeeper \ bookkeeper shell metaformat
Now we can start our Bookkeeper ensemble (e.g. with three bookies):
docker run -it\ --network "bk_network" \ --env BK_zkServers=test_zookeeper:2181 \ --name "bookie1" \ --hostname "bookie1" \ apache/bookkeeper
And so on for "bookie2" and "bookie3". We have now our fully functional ensemble, ready to accept clients.
This application check if it can be leader, if yes start to roll a dice and book this rolls on Bookkeeper, otherwise it will start to follow the leader rolls. If leader stops, follower will try to become leader and so on.
Start a dice application (you can run it several times to view the behavior in a concurrent environment):
docker run -it --rm \ --network "bk_network" \ --env ZOOKEEPER_SERVERS=test_zookeeper:2181 \ caiok/bookkeeper-tutorial
Bookkeeper configuration is located in
/opt/bookkeeper/conf in the docker container, it is a copy of these files in Bookkeeper repo.
There are 2 ways to set Bookkeeper configuration:
1, Apply setted (e.g. docker -e kk=vv) environment variables into configuration files. Environment variable names is in format "BK_originalName", in which "originalName" is the key in config files.
2, If you are able to handle your local volumes, use
docker --volume command to bind-mount your local configure volumes to
Example showing how to use your own configuration files:
$ docker run --name bookie1 -d \ -v $(local_configure_dir):/opt/bookkeeper/conf/ \ < == use 2nd approach, mount dir contains config_files -e BK_bookiePort=3181 \ < == use 1st approach, set bookiePort -e BK_zkServers=zk-server1:2181,zk-server2:2181 \ < == use 1st approach, set zookeeper servers -e BK_journalPreAllocSizeMB=32 \ < == use 1st approach, set journalPreAllocSizeMB in [bk_server.conf](https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/conf/bk_server.conf) apache/bookkeeper
Override rules for Bookkeeper configuration
If you have applied several ways to set the same config target, e.g. the environment variable names contained in these files and conf_file in /opt/bookkeeper/conf/.
Then the override rules is as this:
Environment variable names contained in these files, e.g.
Values in /opt/bookkeeper/conf/conf_files.
Take above example, if in docker instance you have bind-mount your config file as /opt/bookkeeper/conf/bk_server.conf, and in it contains key-value pair:
zkServers=zk-server3:2181, then the value that take effect finally is
-e BK_zkServers=zk-server1:2181,zk-server2:2181 will override key-value pair:
zkServers=zk-server3:2181, which contained in /opt/bookkeeper/conf/bk_server.conf.
Environment variable names that mostly used for your configuration.
This variable allows you to specify the port on which Bookkeeper should listen for incoming connections.
This will override
bookiePort in bk_server.conf.
Default value is "3181".
This variable allows you to specify a list of machines of the Zookeeper ensemble. Each entry has the form of
host:port. Entries are separated with a comma.
This will override
zkServers in bk_server.conf.
Default value is "127.0.0.1:2181"
This variable allows you to specify the root directory Bookkeeper will use on Zookeeper to store ledgers metadata.
This will override
zkLedgersRootPath in bk_server.conf.
Default value is "/bookkeeper/ledgers"
This variable allows you to specify the root directory Bookkeeper will use on Zookeeper.
Default value is empty - " ". so ledgers dir in zookeeper will be at "/ledgers" by default. You could set it as that you want, e.g. "/bookkeeper"
This variable allows you to specify where to store data in docker instance.
This could be override by env vars "BK_journalDirectory", "BK_ledgerDirectories", "BK_indexDirectories" and also
indexDirectories in bk_server.conf.
Default value is "/data/bookkeeper", which contains volumes
/data/bookkeeper/index to hold Bookkeeper data in docker.
Configure files under /opt/bookkeeper/conf
Usually we could config files bk_server.conf, bkenv.sh, log4j.properties, and log4j.shell.properties. Please read and understand them before you do the configuration.
Be careful where you put the transaction log (journal). A dedicated transaction log device is key to consistent good performance. Putting the log on a busy device will adversely effect performance.
Here is some useful and graceful command the could be used to replace the default command, once you want to delete the cookeis and do auto recovery:
/bookkeeper/bookkeeper-server/bin/bookkeeper shell bookieformat -nonInteractive -force -deleteCookie /bookkeeper/bookkeeper-server/bin/bookkeeper autorecovery
Use them, and replace the default [CMD] when you wanted to do things other than start a bookie.
View license information for the software contained in this image.