Official build of GoAccess

GoAccess

What is it?

GoAccess is an open source real-time web log analyzer and interactive viewer
that runs in a terminal on *nix systems or through your browser. It
provides fast and valuable HTTP statistics for system administrators who
require a visual server report on the fly.
More info at: http://goaccess.io.


Features

GoAccess parses the specified web log file and outputs the data to the X
terminal. Features include:

  • Completely Real Time
    All panels and metrics are timed to be updated every 200 ms on the terminal
    output and every second on the HTML output.

  • No configuration needed
    You can just run it against your access log file, pick the log format and
    let GoAccess parse the access log and show you the stats.

  • Track Application Response Time
    Track the time taken to serve the request. Extremely useful if you want to
    track pages that are slowing down your site.

  • Nearly All Web Log Formats
    GoAccess allows any custom log format string. Predefined options include
    Apache, Nginx, Amazon S3, Elastic Load Balancing, CloudFront, etc.

  • Incremental Log Processing
    Need data persistence? GoAccess has the ability to process logs incrementally
    through the on-disk B+Tree database.

  • Only one dependency
    GoAccess is written in C. To run it, you only need ncurses as a dependency.
    That's it. It even features its own Web Socket server - http://gwsocket.io/.

  • Visitors
    Determine the number of hits, visitors, bandwidth, and metrics for the
    slowest running requests by the hour or date.

  • Metrics per Virtual Host
    Have multiple Virtual Hosts (Server Blocks)? This panel displays which
    virtual host is consuming most of the web server resources.

  • Color Scheme Customizable
    Tailor GoAccess to suit your own color taste/schemes. Either through the
    terminal, or by simply applying the stylesheet on the HTML output.

  • Support for large datasets
    GoAccess features an on-disk B+Tree storage for large datasets where it is not
    possible to fit everything in memory.

  • Docker support
    GoAccess' Docker image can be built from upstream; it listens for Web
    Socket connections on port 7890. You can still fully configure it by using
    volume mapping and editing goaccess.conf. See the Docker section below.

Nearly all web log formats...

GoAccess allows any custom log format string. Predefined options include, but
are not limited to:

  • Amazon CloudFront (Download Distribution)
  • Amazon Simple Storage Service (S3)
  • AWS Elastic Load Balancing
  • Combined Log Format (XLF/ELF) Apache | Nginx
  • Common Log Format (CLF) Apache
  • Google Cloud Storage
  • Apache virtual hosts
  • Squid Native Format
  • W3C format (IIS)

Why GoAccess?

GoAccess was designed to be a fast, terminal-based log analyzer. Its core idea
is to quickly analyze and view web server statistics in real time without
needing to use your browser (great if you want to do a quick analysis of your
access log via SSH, or if you simply love working in the terminal).

While the terminal output is the default output, it has the capability to
generate a complete real-time HTML report, as well as JSON and CSV reports.

You can see it more as a monitoring command-line tool than anything else.

Installation

GoAccess can be compiled and used on *nix systems.

Download, extract and compile GoAccess with:

$ wget http://tar.goaccess.io/goaccess-1.2.tar.gz
$ tar -xzvf goaccess-1.2.tar.gz
$ cd goaccess-1.2/
$ ./configure --enable-utf8 --enable-geoip=legacy
$ make
# make install

Build from GitHub (Development)

$ git clone https://github.com/allinurl/goaccess.git
$ cd goaccess
$ autoreconf -fiv
$ ./configure --enable-utf8 --enable-geoip=legacy
$ make
# make install

Distributions

It is easiest to install GoAccess on Linux using the preferred package manager
of your Linux distribution. Please note that not all distributions will have
the latest version of GoAccess available.

Debian/Ubuntu

# apt-get install goaccess

NOTE: It is likely this will install an outdated version of GoAccess. To
make sure that you're running the latest stable version of GoAccess, see the
alternative option below.

Official GoAccess Debian & Ubuntu repository

$ echo "deb http://deb.goaccess.io/ $(lsb_release -cs) main" | sudo tee -a /etc/apt/sources.list.d/goaccess.list
$ wget -O - http://deb.goaccess.io/gnugpg.key | sudo apt-key add -
$ sudo apt-get update
$ sudo apt-get install goaccess

Note:

  • For on-disk support (Trusty+ or Wheezy+), run: sudo apt-get install goaccess-tcb
  • .deb packages in the official repo are available through https as well. You may need to install apt-transport-https.

Fedora

# yum install goaccess

Arch Linux

# pacman -S goaccess

Gentoo

# emerge net-analyzer/goaccess

OS X / Homebrew

$ brew install goaccess

FreeBSD

# cd /usr/ports/sysutils/goaccess/ && make install clean
# pkg install sysutils/goaccess

OpenBSD

# cd /usr/ports/www/goaccess && make install clean
# pkg_add goaccess

OpenIndiana

# pkg install goaccess

pkgsrc (NetBSD, Solaris, SmartOS, ...)

# pkgin install goaccess

Windows

GoAccess can be used in Windows through Cygwin.
See Cygwin's packages: https://goaccess.io/faq#installation

Docker

NOTE: The following example assumes you will store your GoAccess data below
/srv/goaccess, but you can use a different prefix if you like or if you run
as a non-root user.

mkdir -p /srv/goaccess/{data,html}

Before running your own GoAccess Docker container, first create a config file
in /srv/goaccess/data. You can start one from scratch or use the one from
config/goaccess.conf as a starting point and change it as needed.

A minimal config file for a real-time HTML report requires at least the
following options: log-format, log-file, output and real-time-html. For
example, for Apache's combined log format:

log-format COMBINED
log-file /srv/logs/access.log
output /srv/report/index.html
real-time-html true
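
The two preparation steps above (create the directories, then write a minimal
config) can be sketched as one small shell helper. This is only an
illustration: the function name write_minimal_conf and its prefix argument are
made up for this example, and the config lines are the four shown above.

```shell
# Hypothetical helper: create the data/html directories under a prefix and
# write the minimal real-time HTML config shown above into data/goaccess.conf.
write_minimal_conf() {
    prefix=${1:-/srv/goaccess}      # default prefix used in this section
    mkdir -p "$prefix/data" "$prefix/html" || return 1
    cat > "$prefix/data/goaccess.conf" <<'EOF'
log-format COMBINED
log-file /srv/logs/access.log
output /srv/report/index.html
real-time-html true
EOF
}

# Usage (pick another prefix if you run as a non-root user):
#   write_minimal_conf /srv/goaccess
```

The paths inside the config file are container paths, so they stay the same
whatever prefix is used on the host.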

Once you have your configuration file all set, clone the repo:

$ git clone https://github.com/allinurl/goaccess.git goaccess && cd $_

and then build and run the image as follows:

docker build . -t allinurl/goaccess
docker run --restart=always -d -p 7890:7890 \
  -v "/srv/goaccess/data:/srv/data"         \
  -v "/srv/goaccess/html:/srv/report"       \
  -v "/var/log/apache2:/srv/logs"           \
  --name=goaccess allinurl/goaccess

If you made changes to the config file after building the image, you don't
have to rebuild from scratch. Simply restart the container:

docker restart goaccess

If you want to expose goaccess on a different port on the host machine, you
have to set the ws-url option in the config file, e.g.:

ws-url ws://example.com:8080

or for secured connections:

ws-url wss://example.com:8080

And start the container as follows:

docker run --restart=always -d -p 8080:7890 \
  -v "/srv/goaccess/data:/srv/data"         \
  -v "/srv/goaccess/html:/srv/report"       \
  -v "/var/log/apache:/srv/logs"            \
  --name=goaccess allinurl/goaccess

If you had already run the container, you may have to stop and remove it first:

docker stop goaccess
docker rm goaccess

Note, it is possible to specify a different command and command line options to
run in the container directly on the docker command line, e.g.:

docker run --restart=always -d -p 8080:7890 \
  -v "/srv/goaccess/data:/srv/data"         \
  -v "/srv/goaccess/html:/srv/report"       \
  -v "/var/log/apache:/srv/logs"            \
  --name=goaccess allinurl/goaccess         \
  goaccess --no-global-config --config-file=/srv/data/goaccess.conf  \
           --ws-url=example.org:8080 --output=/srv/report/index.html \
           --log-file=/srv/logs/access.log

The container and image can be completely removed as follows:

docker stop goaccess
docker rm goaccess
docker rmi allinurl/goaccess

There is also a prebuilt docker image that can be run without cloning the git
repository:

docker run --restart=always -d -p 8080:7890 \
  -v "/srv/goaccess/data:/srv/data"         \
  -v "/srv/goaccess/logs:/srv/logs"         \
  -v "/srv/goaccess/html:/srv/report"       \
  --name=goaccess allinurl/goaccess

Storage

There are three storage options that can be used with GoAccess. Choosing one
will depend on your environment and needs.

Default Hash Tables

In-memory storage provides better performance at the cost of limiting the
dataset size to the amount of available physical memory. By default GoAccess
uses in-memory hash tables. If your dataset can fit in memory, then this will
perform fine. It has very good memory usage and pretty good performance.

Tokyo Cabinet On-Disk B+ Tree

Use this storage method for large datasets where it is not possible to fit
everything in memory. The B+ tree database is slower than any of the hash
databases since data has to be committed to disk. However, using an SSD greatly
increases the performance. You may also use this storage method if you need
data persistence to quickly load statistics at a later date.

Tokyo Cabinet On-Memory Hash Database

An alternative to the default hash tables. It uses generic typing and thus its
performance in terms of memory and speed is average.

Command Line / Config Options

See options that can be supplied to the command or
specified in the configuration file. If specified in the configuration file, long
options need to be used without prepending --.
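
As an illustration of that rule, the sketch below converts command-line long
options into config-file lines. The helper name opts_to_conf is hypothetical,
and writing a bare flag as "name true" is an assumption generalized from the
"real-time-html true" line in the Docker section.

```shell
# Hypothetical helper: turn command-line long options into config-file lines.
#   --name=value  ->  name value
#   --name        ->  name true   (assumption, modelled on "real-time-html true")
opts_to_conf() {
    for opt in "$@"; do
        case "$opt" in
            --*=*) name=${opt%%=*}
                   printf '%s %s\n' "${name#--}" "${opt#*=}" ;;
            --*)   printf '%s true\n' "${opt#--}" ;;
            *)     printf 'skipping non-long option: %s\n' "$opt" >&2 ;;
        esac
    done
}

opts_to_conf --log-format=COMBINED --log-file=/srv/logs/access.log \
             --output=/srv/report/index.html --real-time-html
```

Short options such as -a are skipped with a message on stderr, since only long
options can be used in the configuration file.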

Examples

Please note that piping data into GoAccess won't prompt a log/date/time
configuration dialog. You will need to define it beforehand in your
configuration file or on the command line.

DIFFERENT OUTPUTS

To output to a terminal and generate an interactive report:

# goaccess access.log

To generate an HTML report:

# goaccess access.log -a > report.html

To generate a JSON report:

# goaccess access.log -a -d -o json > report.json

To generate a CSV file:

# goaccess access.log --no-csv-summary -o csv > report.csv

GoAccess also allows great flexibility for real-time filtering and parsing. For
instance, to quickly diagnose issues by monitoring logs since goaccess was
started:

# tail -f access.log | goaccess -

And even better, to filter while keeping the pipe open to preserve
real-time analysis, we can make use of tail -f and a pattern matching tool
such as grep, awk, sed, etc.:

# tail -f access.log | grep -i --line-buffered 'firefox' | goaccess --log-format=COMBINED -

or to parse from the beginning of the file while keeping the pipe open
and applying a filter:

# tail -f -n +0 access.log | grep -i --line-buffered 'firefox' | goaccess -o report.html --real-time-html -

MULTIPLE LOG FILES

There are several ways to parse multiple logs with GoAccess. The simplest is to
pass multiple log files to the command line:

# goaccess access.log access.log.1

It's even possible to parse files from a pipe while reading regular files:

# cat access.log.2 | goaccess access.log access.log.1 -

Note that the single dash is appended to the command line to let GoAccess
know that it should read from the pipe.

Now if we want to add more flexibility to GoAccess, we can do a series of
pipes. For instance, if we would like to process all compressed log files
access.log.*.gz in addition to the current log file, we can do:

# zcat access.log.*.gz | goaccess access.log -

Note: On Mac OS X, use gunzip -c instead of zcat.
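
When plain and compressed logs are mixed, the piping above can be wrapped in a
small helper. The following is a minimal sketch, not part of GoAccess: the
name cat_logs is made up, and it uses gzip -dc, which is equivalent to
gunzip -c, so the same command should work on both Linux and Mac OS X.

```shell
# Hypothetical helper: print each log file to stdout in the order given,
# decompressing files whose name ends in .gz.
cat_logs() {
    for f in "$@"; do
        case "$f" in
            *.gz) gzip -dc -- "$f" ;;
            *)    cat -- "$f" ;;
        esac
    done
}

# Usage (assumes goaccess is installed and the files exist), oldest first:
#   cat_logs access.log.2.gz access.log.1 access.log | goaccess --log-format=COMBINED -
```

Since all data arrives through the pipe, the log format has to be given on the
command line or in the configuration file.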

REAL TIME HTML OUTPUT

GoAccess has the ability to output real-time data in the HTML report. You can
even email the HTML file since it is composed of a single file with no external
file dependencies, how neat is that!

The process of generating a real-time HTML report is very similar to the
process of creating a static report. Only --real-time-html is needed to make
it real-time.

# goaccess access.log -o /usr/share/nginx/html/your_site/report.html --real-time-html

By default, GoAccess will use the host name of the generated report.
Optionally, you can specify the URL to which the client's browser will connect.
See http://goaccess.io/faq for a more detailed example.

# goaccess access.log -o report.html --real-time-html --ws-url=goaccess.io

By default, GoAccess listens on port 7890. To use a different port, you can
specify it as follows (make sure the port is open):

# goaccess access.log -o report.html --real-time-html --port=9870

And to bind the WebSocket server to a different address other than 0.0.0.0, you
can specify it as:

# goaccess access.log -o report.html --real-time-html --addr=127.0.0.1

Note: To output real time data over a TLS/SSL connection, you need to use
--ssl-cert=<cert.crt> and --ssl-key=<priv.key>.

WORKING WITH DATES

Another useful pipe is filtering dates out of the web log.

The following will get all HTTP requests starting on 05/Dec/2010 until the
end of the file:

# sed -n '/05\/Dec\/2010/,$ p' access.log | goaccess -a -

or using relative dates such as yesterday's date or one week ago:

# sed -n '/'$(date '+%d\/%b\/%Y' -d '1 week ago')'/,$ p' access.log | goaccess -a -

If we want to parse only a certain time-frame from DATE a to DATE b, we can do:

# sed -n '/5\/Nov\/2010/,/5\/Dec\/2010/ p' access.log | goaccess -a -
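
Note that a sed range ends at the first line matching its second pattern, so
the range above stops at the first request of the end date. If the whole end
day should be included, one option is to compare dates numerically with awk.
The following is a minimal sketch under stated assumptions: the helper name
filter_dates and its YYYYMMDD arguments are made up for this example, and the
date is assumed to be in the 4th field, as in the common and combined log
formats without a virtual host.

```shell
# Hypothetical helper: keep requests whose date lies in [from, to], inclusive.
# Dates are given as YYYYMMDD. The log date is read from the 4th field,
# e.g. [05/Dec/2010:10:00:00 (use $5 if the log starts with a virtual host).
filter_dates() {
    awk -v from="$1" -v to="$2" '
        BEGIN {
            n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", m, " ")
            for (i = 1; i <= n; i++) mon[m[i]] = sprintf("%02d", i)
        }
        {
            split(substr($4, 2), d, "[/:]")   # d[1]=day, d[2]=month, d[3]=year
            key = (d[3] mon[d[2]] d[1]) + 0   # e.g. 20101205
            if (key >= from + 0 && key <= to + 0) print
        }'
}

# Usage (assumes goaccess is installed and the log format is configured):
#   filter_dates 20101105 20101205 < access.log | goaccess -a -
```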

VIRTUAL HOSTS

Suppose your log contains the virtual host field, for instance:

vhost.io:80 8.8.4.4 - - [02/Mar/2016:08:14:04 -0600] "GET /shop HTTP/1.1" 200 615 "-" "Googlebot-Image/1.0"

To see which virtual host the top URLs belong to, append the virtual host to
the request:

# awk '$8=$1$8' access.log | goaccess -a -

To exclude a list of virtual hosts you can do the following:

# grep -v "`cat exclude_vhost_list_file`" vhost_access.log | goaccess -
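
To make the effect of these two filters concrete, here is a small sketch that
applies them to the sample line above. The function names prefix_vhost and
drop_vhosts are hypothetical. drop_vhosts is a variant that reads the
exclusion list with grep -F -f (one virtual host per line, matched as fixed
strings) instead of the backtick form.

```shell
# Sample line from the text above (the virtual host is the first field).
line='vhost.io:80 8.8.4.4 - - [02/Mar/2016:08:14:04 -0600] "GET /shop HTTP/1.1" 200 615 "-" "Googlebot-Image/1.0"'

# Hypothetical helper: prefix the request path ($8) with the virtual host ($1).
prefix_vhost() {
    awk '$8=$1$8'
}

# Hypothetical helper: drop lines containing any entry of the list file $1.
drop_vhosts() {
    grep -v -F -f "$1"
}

# Show the rewritten request field for the sample line.
printf '%s\n' "$line" | prefix_vhost | awk '{print $8}'
# prints: vhost.io:80/shop
```

With goaccess installed, the output of either helper can be piped into
"goaccess -a -" in the same way as the commands above.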

FILES & STATUS CODES

To parse specific pages, e.g., page views, html, htm, php, etc. within a
request:

# awk '$7~/\.html|\.htm|\.php/' access.log | goaccess -

Note that $7 is the request field for the common and combined log formats
(without Virtual Host). If your log includes the Virtual Host, then you probably
want to use $8 instead. It's best to check which field you are aiming for,
e.g.:

# tail -10 access.log | awk '{print $8}'

Or to parse a specific status code, e.g., 500 (Internal Server Error):

# awk '$9~/500/' access.log | goaccess -
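
The same idea extends to a whole class of status codes. The following is a
minimal sketch, assuming the status code is in field $9 as in the command
above. The helper names only_status and only_status_class are made up for
this example, and only_status compares the code exactly rather than as a
regular expression.

```shell
# Hypothetical helpers. $9 is the status field for the common and combined
# log formats without a virtual host (use $10 if the log starts with one).
only_status() {            # exact status code, e.g. only_status 500
    awk -v code="$1" '$9 == code'
}

only_status_class() {      # whole class, e.g. only_status_class 5 for 5xx
    awk -v class="$1" 'length($9) == 3 && substr($9, 1, 1) == class'
}

# Usage (assumes goaccess is installed and the log format is configured):
#   only_status_class 5 < access.log | goaccess -
```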

SERVER

Also, it is worth pointing out that if we want to run GoAccess at lower
priority, we can run it as:

# nice -n 19 goaccess -f access.log -a

and if you don't want to install it on your server, you can still run it from
your local machine:

# ssh root@server 'cat /var/log/apache2/access.log' | goaccess -a -

INCREMENTAL LOG PROCESSING

GoAccess has the ability to process logs incrementally through the on-disk
B+Tree database. It works in the following way:

  1. A data set must be persisted first with --keep-db-files, then the same
    data set can be loaded with --load-from-disk.
  2. If new data is passed (piped or through a log file), it will append it to
    the original data set.
  3. To preserve the data at all times, --keep-db-files must be used.
  4. If --load-from-disk is used without --keep-db-files, database files will
    be deleted upon closing the program.

Examples

Parse last month's access log and persist it:

# goaccess access.log.1 --keep-db-files

Then load it, append this month's access log, and preserve the new data:

# goaccess access.log --load-from-disk --keep-db-files

To read persisted data only (without parsing new data):

# goaccess --load-from-disk --keep-db-files

Contributing

Any help on GoAccess is welcome. The most helpful way is to try it out and give
feedback. Feel free to use the GitHub issue tracker and pull requests to
discuss and submit code changes.

Enjoy!

Owner: allinurl

Comments (3)
ghostjester
5 months ago

Hi, thank you so much for the docker image. It works very well and is very lightweight. I have one issue: when I try to enable geo-location it returns an error:
goaccess: unrecognized option: std-geoip

Was the docker image compiled with --enable-geoip?

Am I doing something wrong?

Thank you

Richard

davask
5 months ago

Hi,

running:

docker run --name=goaccess --restart=always -d \
-p 65533:8080 \
-v /home/user/conf/goaccess.conf:/etc/goaccess.conf \
-v /home/user/data:/srv/data \
allinurl/goaccess

This makes goaccess run properly; however, I get a 400 error on http://<localhostip>:65533/

What do you use to serve HTTP requests, and where is the DocumentRoot directory located?

thanks

davask
5 months ago

Hi,

running:


docker run --name=goaccess --restart=always -d \
    -p 65533:8080 \
    -v /home/user/data:/srv/data \
    allinurl/goaccess

with /home/user/data/goaccess.conf, the file goaccess.conf is not used even though supervisord runs this line:

/usr/bin/goaccess --addr=0.0.0.0 --port=8080 --no-global-config --config-file=/srv/data/goaccess.conf

What can I do?