Last pushed: 3 years ago
Short Description
Collect logs from Amazon Kinesis and store them to S3 with a worker.
Full Description

Logs collector for Amazon Kinesis and Amazon S3

Use this worker to collect logs from Amazon Kinesis and store them to your S3 Bucket automatically in a JSON format.
You can use this middleware to record your logs to Amazon Kinesis.

How it works

  1. A request is sent to Kinesis every (timeToSleep) seconds.
  2. If the number of records exceeds (limitRecords), there are two cases:
    1. you have no records yet today, so a file named <currentDate>.json is created locally and the records are written to it.
    2. you already have records today, so the file named <currentDate>.json is downloaded and the records are appended to it.
  3. The file is uploaded to S3 into your (bucketName), in a JSON format.
  4. Each time records are stored into S3, the sequence number of the last record sent is stored into S3 too, so that the next records are read after this number.
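
The steps above can be sketched in a few lines of Python. This is only an illustration of the logic, not the worker's actual code: the bucket is a plain dict standing in for S3, the daily file is assumed to hold a JSON array, and the function name flush_if_needed and the checkpoint key "sequence_number" are hypothetical.

```python
import json
from datetime import date


def flush_if_needed(records, bucket, limit_records=10, today=None):
    """Store a batch in today's JSON file once it exceeds limit_records.

    records: list of (sequence_number, data) tuples read from the stream.
    bucket:  dict standing in for the S3 bucket (key -> stored content).
    Returns True if the batch was stored, False if it was left to grow.
    """
    # Step 2: nothing is stored until the batch exceeds the limit.
    if len(records) <= limit_records:
        return False

    key = f"{today or date.today().isoformat()}.json"

    # Steps 2.1 and 2.2: start a new file, or extend the one stored earlier today.
    # (Assumption: the file content is a JSON array of records.)
    existing = json.loads(bucket[key]) if key in bucket else []
    existing.extend(data for _, data in records)

    # Step 3: upload the updated file.
    bucket[key] = json.dumps(existing)

    # Step 4: remember the last sequence number so the next read starts after it.
    # ("sequence_number" is a hypothetical key name.)
    bucket["sequence_number"] = records[-1][0]
    return True
```

With the default limit of 10, a batch of 10 records is left alone and a batch of 11 is written to <currentDate>.json together with the sequence number of its last record.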

How to use it?

The worker accepts the following flags:

  • streamName: Name of the stream in Kinesis
  • shardId: ID of the shard of the stream in Kinesis (default is "shardId-000000000000")
  • regionKinesis: Region for Kinesis stream (default is "eu-west-1")
  • bucketName: Name of the bucket in S3
  • regionS3: Region for S3 bucket (default is "eu-west-1")
  • awsAccessKey: Access key for AWS
  • awsSecretKey: Secret key for AWS
  • timeToSleep: Time to sleep (in seconds) before collecting new records from Kinesis (default is "10")
  • limitRecords: Number of records above which logs are sent to S3 (default is "10", must be between "1" and "999")
  • startingSequenceNumber: Sequence number after which the first batch of records is read from Amazon Kinesis. This is the most important flag. If this is the first time you use the program, leave it unset and a starting sequence number will be found automatically. After the first use, always use the value "s3" to read the startingSequenceNumber from your Amazon S3 bucket, where it is stored automatically each time records are sent to S3.
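
As a rough summary of the defaults and of the three ways startingSequenceNumber can be set, here is a small Python sketch. It is not taken from the worker: the class WorkerConfig, the function resolve_start and the mode names "auto" and "after" are hypothetical, and raising an error when "s3" is requested but nothing is stored yet is an assumption about reasonable behaviour, not documented behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkerConfig:
    # Required flags.
    stream_name: str
    bucket_name: str
    # Defaults as listed in the flag descriptions.
    shard_id: str = "shardId-000000000000"
    region_kinesis: str = "eu-west-1"
    region_s3: str = "eu-west-1"
    time_to_sleep: int = 10
    limit_records: int = 10
    starting_sequence_number: Optional[str] = None

    def __post_init__(self):
        # limitRecords must be between 1 and 999.
        if not 1 <= self.limit_records <= 999:
            raise ValueError("limitRecords must be between 1 and 999")


def resolve_start(config, stored_sequence_number=None):
    """Decide where reading starts; returns a (mode, value) tuple."""
    if config.starting_sequence_number is None:
        # First use: the worker finds a starting sequence number itself.
        return ("auto", None)
    if config.starting_sequence_number == "s3":
        # Later uses: resume after the number saved in the bucket.
        if stored_sequence_number is None:
            # Assumption: treat a missing stored number as an error.
            raise LookupError("no sequence number stored in the bucket yet")
        return ("after", stored_sequence_number)
    # An explicit sequence number was given on the command line.
    return ("after", config.starting_sequence_number)
```

For example, resolve_start(WorkerConfig("my-stream", "my-bucket")) corresponds to a first run, while passing starting_sequence_number="s3" together with the stored number corresponds to every later run.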

You can also launch a container using the -h flag to display the help:

docker run --rm aurelienmassiot/aws_kinesis_s3_logs_worker aws_kinesis_s3_logs_worker -h


Example 1

Use these flags to launch a container. For example:

docker run --name kinesis_s3_logs_1 aurelienmassiot/aws_kinesis_s3_logs_worker:latest aws_kinesis_s3_logs_worker -streamName=test-aurelien-kinesis -bucketName=aurelien-logs-kinesis -awsAccessKey=ANFLZDPZDPZFPZF -awsSecretKey=ASLzfdlzdp9ezf4zefzefzef915ezfezfezf

This will launch a worker with the provided credentials, stream name and bucket name, and with the default values for shardId, regions, timeToSleep and limitRecords.

In other words, this worker will automatically fetch the records from Kinesis every 10 seconds and store them to S3 in a JSON format if there are more than 10 records.

A starting sequence number will automatically be found, as it is not set on the command line.

Example 2

You can also launch a container as a daemon (as you probably know), with the -d flag:

docker run -d --name kinesis_s3_logs_2 aurelienmassiot/aws_kinesis_s3_logs_worker:latest aws_kinesis_s3_logs_worker -streamName=test-aurelien-kinesis -bucketName=aurelien-logs-kinesis -awsAccessKey=ANFLZDPZDPZFPZF -awsSecretKey=ASLzfdlzdp9ezf4zefzefzef915ezfezfezf

You can check the logs of the running container using:
docker logs kinesis_s3_logs_2

Example 3

Finally, after the first use, you can set the startingSequenceNumber flag to "s3" to automatically read the starting sequence number from your Amazon S3 bucket. This ensures that records already stored are not read again.

docker run -d --name kinesis_s3_logs_3 aurelienmassiot/aws_kinesis_s3_logs_worker:latest aws_kinesis_s3_logs_worker -streamName=test-aurelien-kinesis -bucketName=aurelien-logs-kinesis -awsAccessKey=ANFLZDPZDPZFPZF -awsSecretKey=ASLzfdlzdp9ezf4zefzefzef915ezfezfezf -timeToSleep=3 -startingSequenceNumber=s3

For more information about Docker commands, check this wonderful cheatsheet
