Tank is a high performance distributed log.

Tank is a very high performance distributed log, inspired in part by Kafka, and other similar services and technologies.

Some Benchmarks

Single Producer
$> bin/kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance  \
 --topic test --num-records 10000000 --record-size 100 \
 --throughput -1 --producer-props acks=1 bootstrap.servers=127.0.0.1:9092 
10000000 records sent, 913158.615652 records/sec (87.09 MB/sec), 2.44 ms avg latency, 167.00 ms max latency, 1 ms 50th, 12 ms 95th, 34 ms 99th, 39 ms 99.9th.
$> tank-cli  -b 127.0.0.1:11011 -t test  bm p2b  -s 100 -c 1000000 -B 8196 -R
Will publish 1,000,000 messages, in batches of 8,196 messages, each message content is 100b (compression disabled)
Go ACK after publishing 1,000,000 message(s) of size 100b (95.37mb), took 0.296s

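For 1 million messages with compression disabled, Kafka takes roughly 1.1 seconds (at the measured rate of about 913k records/sec, taken from a 10 million record run), versus under 0.3 seconds for Tank. With compression enabled, Tank completes this benchmark in up to 30% less time.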
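
The comparison above can be reproduced from the two reported figures. The following is a minimal sketch, assuming bash and awk are available; `seconds_for` and `speedup` are hypothetical helper names introduced here for illustration, not part of Tank or Kafka.

```shell
#!/bin/bash
# Seconds needed to send `count` messages at a steady `rate` (messages/second).
seconds_for() {
  awk -v count="$1" -v rate="$2" 'BEGIN { printf "%.3f\n", count / rate }'
}

# How many times longer `slow` takes than `fast` (both in seconds).
speedup() {
  awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

# Rate reported by Kafka's ProducerPerformance in the single-producer run.
kafka_secs=$(seconds_for 1000000 913158.615652)
# Elapsed time reported by tank-cli for 1,000,000 messages.
tank_secs=0.296

echo "Kafka: ${kafka_secs}s per 1M messages"
echo "Tank:  ${tank_secs}s per 1M messages"
echo "Ratio: $(speedup "$kafka_secs" "$tank_secs")x"
```

Note that the Kafka figure is extrapolated from a 10 million record run, so the per-million time is an estimate rather than a direct measurement.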

Multiple Producers (3x)
1000000 records sent, 177967.609895 records/sec (16.97 MB/sec), 1029.78 ms avg latency, 2008.00 ms max latency, 989 ms 50th, 1902 ms 95th, 1930 ms 99th, 2007 ms 99.9th.
1000000 records sent, 176118.351532 records/sec (16.80 MB/sec), 1026.09 ms avg latency, 2003.00 ms max latency, 888 ms 50th, 1912 ms 95th, 1990 ms 99th, 2002 ms 99.9th.
1000000 records sent, 173550.850399 records/sec (16.55 MB/sec), 1096.54 ms avg latency, 1953.00 ms max latency, 1023 ms 50th, 1883 ms 95th, 1935 ms 99th, 1952 ms 99.9th.
Go ACK after publishing 1,000,000 message(s) of size 100b (95.37mb), took 0.413s
Go ACK after publishing 1,000,000 message(s) of size 100b (95.37mb), took 0.474s
Go ACK after publishing 1,000,000 message(s) of size 100b (95.37mb), took 0.519s

Produced with

#!/bin/bash
bin/kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic test --num-records 1000000 --record-size 100 --throughput -1 --producer-props acks=1 bootstrap.servers=127.0.0.1:9092 &
bin/kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic test --num-records 1000000 --record-size 100 --throughput -1 --producer-props acks=1 bootstrap.servers=127.0.0.1:9092 &
bin/kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic test --num-records 1000000 --record-size 100 --throughput -1 --producer-props acks=1 bootstrap.servers=127.0.0.1:9092 &

and

#!/bin/bash
tank-cli  -b 127.0.0.1:11011 -t test  bm p2b  -s 100 -c 1000000 -B 8196 -R &
tank-cli  -b 127.0.0.1:11011 -t test  bm p2b  -s 100 -c 1000000 -B 8196 -R &
tank-cli  -b 127.0.0.1:11011 -t test  bm p2b  -s 100 -c 1000000 -B 8196 -R &
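
Both scripts above start three producers in the background and then return immediately. A minimal sketch of a variant that launches N copies of any command and waits for all of them to finish is shown below. It assumes bash (it uses arrays and a C-style loop), and `run_parallel` is a hypothetical helper name, not something shipped with Tank or Kafka.

```shell
#!/bin/bash
# Run N concurrent copies of a command and wait for all of them.
# Returns the number of copies that exited with a non-zero status.
run_parallel() {
  local n="$1"
  shift
  local pids=()
  local i pid failed=0

  for ((i = 0; i < n; i++)); do
    "$@" &            # start one copy in the background
    pids+=("$!")      # remember its PID so we can wait on it
  done

  for pid in "${pids[@]}"; do
    wait "$pid" || failed=$((failed + 1))
  done
  return "$failed"
}

# Example (same arguments as the script above):
# run_parallel 3 tank-cli -b 127.0.0.1:11011 -t test bm p2b -s 100 -c 1000000 -B 8196 -R
```

Waiting on each PID keeps the script alive until every producer has reported its result, which makes it easier to time the whole run or to chain further steps after it.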

For Kafka, we get roughly 174k to 178k messages/second per producer. For Tank, we get about 2 million messages/second per producer (each producer published 1,000,000 messages in 0.41 to 0.52 seconds).
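
The Tank per-producer figure follows from the elapsed times that tank-cli reported in the three-producer run. A small sketch of that conversion, assuming bash and awk; `rate_millions` is a hypothetical helper name used only for this illustration:

```shell
#!/bin/bash
# Messages per second, in millions, for `count` messages sent in `secs` seconds.
rate_millions() {
  awk -v count="$1" -v secs="$2" 'BEGIN { printf "%.2f\n", count / secs / 1000000 }'
}

# Elapsed times reported by the three concurrent tank-cli producers.
for secs in 0.413 0.474 0.519; do
  echo "${secs}s -> $(rate_millions 1000000 "$secs")M messages/second"
done
```

Kafka's ProducerPerformance already prints records/sec directly, so no conversion is needed on that side.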

Benchmark environment details:
Ubuntu 16.04 LTS
Dell PowerEdge R630
1x Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz (6 cores/12 threads)
24 GB RAM
Consumer-grade SSD (Samsung 850 PRO 512GB)
PERC H730 1GB flash cache controller
OpenJDK-8

Consumer benchmarks coming soon.

Introduction

You should begin by reading about the core concepts and the client API (a new Java Tank client is now available).

Tank depends on our Switch library, so a lean, stripped-down Switch is included in the repo.
Please see the build instructions. You may also want to run Tank using its Docker image.

This is our first major open source release as a company, and we plan to accelerate our OSS release efforts in the future.

It will eventually support, among other features:

  • clustering via a leader/follower arrangement using etcd, similar in semantics to Kafka (but with no single controller, and simpler configuration and operation)
  • higher level clients, based on Kafka's current client design (depending on the needs of our developers, but PRs will be welcome)
  • hooks into other Phaistos infrastructure
  • a Kafka/DataFlow-like stream topology abstraction/framework
  • encryption (wire transfers and bundle serialization)
  • improved client and extended API
  • HTTP/1 and HTTP/2 REST APIs

Features include:

If support for cluster-aware setups is crucial to you, you should probably use Kafka (the Confluent folk are particularly great), Google Pub/Sub, or any other open source broker/queue instead of Tank. They are all perfectly fine, some more than others. Cluster support for Tank is in the works.

Tank's goals are the highest possible performance and simplicity. If you need very high performance, operational simplicity, and no reliance on other services (when running Tank in stand-alone mode), consider Tank.

Please see the wiki for more information.

We chose the name Tank because a tank is a storage chamber suitable for liquids and gas, which we think is analogous to a storage container for data that flows from and to other containers and other systems via 'pipes' (connections).

Owner
phaistos