Tidalwave

JSON log file parsing with SQL

Tidalwave is an awesomely fast command line, server, and client application for recording and parsing JSON logs. It's meant to be an alternative to application suites like ELK, which can be rather resource hungry, whereas Tidalwave only consumes resources when a search is in progress. It has been measured at 8 times faster than grep, with more in-depth parsing than simple regex matching.

With a built-in API with sockets for live tail, as well as the command line, everything is queryable with SQL. Tidalwave works best with logging modules such as logrus, bunyan, slf4j, python-json-logger, json_logger, or anything else that outputs JSON logs.

Tidalwave is in its early stages, where it's littered with TODOs, possible bugs, and all the other nifty things that come with early development.

Features / Roadmap

How it Works

Products like ELK work by having multiple layers of processes to manage and query logs. Elasticsearch can get quite resource hungry, and third-party services that do something similar are just too expensive for small applications. Tidalwave works by having a folder and file structure that acts as an index, then matching those files to the given query. It only takes up resources during a search, taking advantage of multi-core systems to quickly parse large log files. Tidalwave is meant to be CPU intensive on queries, but stays on very low resources when idle.

The SQL parser supports basic comparison operators (==, !=, <=, >, etc.) that work with strings, numbers, and dates. Parsing multiple applications is as simple as (SELECT * FROM serverapp, clientapp). It can also truncate logs to reduce response size by selecting only specific fields (SELECT time, line.cmd FROM serverapp).
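
To make the truncation concrete, here is a minimal Python sketch of how a column list with dotted names such as line.cmd could be applied to one parsed log entry. The helpers get_path and project are hypothetical and are not part of Tidalwave, and the output shape (flat keys named after the dotted path) is an assumption.

```python
import json

def get_path(entry, dotted):
    """Follow a dotted path such as 'line.cmd' through nested dicts.

    Returns None when any segment is missing or not a dict.
    """
    current = entry
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current

def project(entry, columns):
    """Keep only the requested columns, keyed by their dotted path."""
    return {col: get_path(entry, col) for col in columns}

# A shortened log entry in the same shape as the examples in this README.
raw = '{"name":"server","line":{"cmd":"chat","suffix":"Pizza."},"host":"a2197bfa39c7"}'
entry = json.loads(raw)
print(project(entry, ["host", "line.cmd"]))
```

Only the two requested fields survive, which is what keeps the response small.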

date is a special keyword, as you'll find in most time series applications, used for selecting logs by time. You can either pass a date (SELECT * FROM serverapp WHERE date = '2016-01-01'), or pass a full timestamp (SELECT * FROM serverapp WHERE date = '2016-01-01T01:30:00').
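
The two accepted forms can be told apart by trying the longer timestamp format first. The sketch below is illustrative Python, not Tidalwave's parser: parse_date_literal is a hypothetical name, and treating a bare date as midnight at the start of that day is an assumption.

```python
from datetime import datetime

def parse_date_literal(value):
    """Parse '2016-01-01' or '2016-01-01T01:30:00' into a datetime.

    Assumption: a bare date means midnight at the start of that day.
    """
    for fmt in ("%Y-%m-%dT%H:%M:%S", "%Y-%m-%d"):
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError("unsupported date literal: " + repr(value))

print(parse_date_literal("2016-01-01"))
print(parse_date_literal("2016-01-01T01:30:00"))
```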

Example

The folder structure is organized by application name, then a folder per date, then log files named by datetime and split by hour.

Folder Structure

.
+-- serverapp
|   +-- 2016-10-01
|   |   +-- 2016-10-01T01_00_00.log
|   |   +-- 2016-10-01T02_00_00.log
|   +-- 2016-10-02
|   |   +-- 2016-10-02T01_00_00.log
|   |   +-- 2016-10-02T02_00_00.log
|   |   +-- 2016-10-02T03_00_00.log
|   |   +-- 2016-10-02T04_00_00.log
|   |   +-- 2016-10-02T05_00_00.log
|   +-- 2016-10-03
+-- clientapp
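
A layout like this can act as an index because a time range maps directly to a small set of file paths. The following is a rough Python sketch of that mapping, assuming one file per hour named after the start of the hour as in the tree above. candidate_files is a hypothetical helper, not Tidalwave's code.

```python
from datetime import datetime, timedelta
import os

def candidate_files(root, app, start, end):
    """List hourly log paths for app whose hour overlaps [start, end].

    Assumes <root>/<app>/<YYYY-MM-DD>/<YYYY-MM-DD>T<HH>_00_00.log.
    """
    paths = []
    # Round the start down to the beginning of its hour.
    hour = start.replace(minute=0, second=0, microsecond=0)
    while hour <= end:
        day = hour.strftime("%Y-%m-%d")
        name = hour.strftime("%Y-%m-%dT%H_00_00.log")
        paths.append(os.path.join(root, app, day, name))
        hour += timedelta(hours=1)
    return paths

for path in candidate_files(".", "serverapp",
                            datetime(2016, 10, 2, 1, 30),
                            datetime(2016, 10, 2, 3, 0)):
    print(path)
```

A real implementation would also skip paths that do not exist on disk. Only the files that survive this step need to be opened and parsed.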

2016-10-02T01_00_00.log was created by the Docker client logger, where the application was using Bunyan to output its logs.

...
{"v":3,"id":"49aa6ad41125","image":"docker-image","name":"server","line":{"name":"server","hostname":"49aa6ad41125","pid":14,"level":30,"cmd":"lol","suffix":"status","msg":"cmd","time":"2016-10-02T00:04:25.172Z","v":0},"host":"a2197bfa39c7"}
{"v":0,"id":"49aa6ad41125","image":"docker-image","name":"server","line":{"name":"server","hostname":"49aa6ad41125","pid":14,"level":30,"cmd":"chat","suffix":"What time is it?","msg":"cmd","time":"2016-10-02T00:04:25.629Z","v":0},"host":"a2197bfa39c7"}
{"v":0,"id":"49aa6ad41125","image":"docker-image","name":"server","line":{"name":"server","hostname":"49aa6ad41125","pid":14,"level":30,"cmd":"chat","suffix":"Pizza.","msg":"cmd","time":"2016-10-02T00:04:33.164Z","v":0},"host":"a2197bfa39c7"}
{"v":0,"id":"49aa6ad41125","image":"docker-image","name":"server","line":{"name":"server","hostname":"49aa6ad41125","pid":14,"level":30,"cmd":"meme","suffix":"fry1 \"meme\"","msg":"cmd","time":"2016-10-02T00:04:35.811Z","v":0},"host":"a2197bfa39c7"}
{"v":0,"id":"49aa6ad41125","image":"docker-image","name":"server","line":{"name":"server","hostname":"49aa6ad41125","pid":14,"level":30,"cmd":"lol","suffix":"status","msg":"cmd","time":"2016-10-02T00:04:36.066Z","v":0},"host":"a2197bfa39c7"}
...

Querying all the lines where cmd equals chat within a set timeframe is as simple as querying a SQL database!

Query:

SELECT * FROM serverapp WHERE line.cmd = 'chat' and date <= '2016-10-02' and date > '2016-10-02T02:00:00'

Result:

{"v":0,"id":"49aa6ad41125","image":"docker-image","name":"server","line":{"name":"server","hostname":"49aa6ad41125","pid":14,"level":30,"cmd":"chat","suffix":"What time is it?","msg":"cmd","time":"2016-10-02T00:04:25.629Z","v":0},"host":"a2197bfa39c7"}
{"v":0,"id":"49aa6ad41125","image":"docker-image","name":"server","line":{"name":"server","hostname":"49aa6ad41125","pid":14,"level":30,"cmd":"chat","suffix":"Pizza.","msg":"cmd","time":"2016-10-02T00:04:33.164Z","v":0},"host":"a2197bfa39c7"}
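
For comparison, the line.cmd = 'chat' part of that query amounts to the following when written by hand in Python. This only illustrates the matching logic on shortened copies of the sample lines above. The date conditions are left out, and matches_cmd is a hypothetical helper.

```python
import json

# Shortened copies of the sample lines above (only the fields used here).
lines = [
    '{"name":"server","line":{"cmd":"lol","suffix":"status"}}',
    '{"name":"server","line":{"cmd":"chat","suffix":"What time is it?"}}',
    '{"name":"server","line":{"cmd":"chat","suffix":"Pizza."}}',
    '{"name":"server","line":{"cmd":"meme","suffix":"fry1 \\"meme\\""}}',
]

def matches_cmd(raw, wanted):
    """Return True when the entry's nested line.cmd equals wanted."""
    entry = json.loads(raw)
    return entry.get("line", {}).get("cmd") == wanted

for raw in lines:
    if matches_cmd(raw, "chat"):
        print(raw)
```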

Install

Grab the latest release from the releases page, or build from source and install directly from master. Tidalwave is currently built and tested against Go 1.7. A Docker image is also available.

Quick install for Linux:

curl -Ls "https://github.com/dustinblackman/tidalwave/releases/download/0.0.2/tidalwave-linux-amd64-0.0.2.tar.gz" | tar xz -C /usr/local/bin/

Build From Source:

A Makefile handles everything needed to build and install from source.

git clone https://github.com/dustinblackman/tidalwave
cd tidalwave
make install

Usage/Configuration

Configuration can be done through command line parameters, environment variables, or a JSON file. See all available flags with tidalwave --help.

To set a configuration option, take the flag name and either export it in your environment or save it in a config file in one of the three locations listed below.

Examples

Flag:

tidalwave --client --max-parallelism 2

Environment:

export TIDALWAVE_CLIENT=true
export TIDALWAVE_MAX_PARALLELISM=2

JSON File:

Configuration files can be stored in one of three locations:

./tidalwave.json
/etc/tidalwave.json
$HOME/.tidalwave/tidalwave.json

{
  "client": true,
  "max-parallelism": 2
}
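
Judging from the flag and environment examples above, an environment variable name is the flag name upper-cased, with dashes turned into underscores and a TIDALWAVE_ prefix. Here is a small Python sketch of that mapping. It is inferred from those two examples only and not taken from Tidalwave's source, and flag_to_env is a hypothetical helper.

```python
def flag_to_env(flag, prefix="TIDALWAVE"):
    """Map a flag like '--max-parallelism' to 'TIDALWAVE_MAX_PARALLELISM'."""
    name = flag.lstrip("-").replace("-", "_").upper()
    return prefix + "_" + name

print(flag_to_env("--client"))           # TIDALWAVE_CLIENT
print(flag_to_env("--max-parallelism"))  # TIDALWAVE_MAX_PARALLELISM
```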