arm64v8/storm

By arm64v8

Updated 17 days ago

Apache Storm is a free and open source distributed realtime computation system.

Note: this is the "per-architecture" repository for the arm64v8 builds of the storm official image -- for more information, see "Architectures other than amd64?" in the official images documentation and "An image's source changed in Git, now what?" in the official images FAQ.

Quick reference

Supported tags and respective Dockerfile links

Quick reference (cont.)

What is Apache Storm?

Apache Storm is a free and open source distributed realtime computation system. Apache Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing. Apache Storm is simple, can be used with any programming language, and is a lot of fun to use!

Apache Storm has many use cases: realtime analytics, online machine learning, continuous computation, distributed RPC, ETL, and more. Apache Storm is fast: a benchmark clocked it at over a million tuples processed per second per node. It is scalable, fault-tolerant, guarantees your data will be processed, and is easy to set up and operate.

Apache Storm integrates with the queueing and database technologies you already use. An Apache Storm topology consumes streams of data and processes those streams in arbitrarily complex ways, repartitioning the streams between each stage of the computation however needed.

How to use this image

Running topologies in local mode

Assuming you have topology.jar in the current directory:

$ docker run -it -v $(pwd)/topology.jar:/topology.jar arm64v8/storm storm jar /topology.jar org.apache.storm.starter.ExclamationTopology
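
One pitfall with the bind mount above: when the host path given to -v does not exist, Docker creates an empty directory with that name instead of failing, and Storm then cannot load the JAR. A minimal sketch of a wrapper that checks for the file first; the function name run_local_topology and its arguments are hypothetical and not part of the image.

```shell
# Hypothetical helper: run a topology JAR in local mode, refusing to start
# if the JAR is missing (with -v, Docker would otherwise create a directory
# at the missing host path).
run_local_topology() {
  jar="$1"        # path to the topology JAR on the host
  main_class="$2" # fully qualified topology class inside the JAR
  if [ ! -f "$jar" ]; then
    echo "error: $jar is not a regular file" >&2
    return 1
  fi
  # Bind mounts need an absolute host path.
  jar_abs="$(cd "$(dirname "$jar")" && pwd)/$(basename "$jar")"
  docker run -it --rm -v "$jar_abs:/topology.jar" arm64v8/storm \
    storm jar /topology.jar "$main_class"
}

# Usage:
# run_local_topology ./topology.jar org.apache.storm.starter.ExclamationTopology
```

The --rm flag is an addition to the original command; it only removes the finished container and can be dropped.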

Setting up a minimal Storm cluster

  1. Apache ZooKeeper is required to run a Storm cluster, so start it first. Because ZooKeeper "fails fast", it is best to have Docker always restart it.

    $ docker run -d --restart always --name some-zookeeper zookeeper
    
  2. The Nimbus daemon must be able to connect to ZooKeeper. It is also a "fail fast" system.

    $ docker run -d --restart always --name some-nimbus --link some-zookeeper:zookeeper arm64v8/storm storm nimbus
    
  3. Finally, start a single Supervisor node. It will talk to Nimbus and ZooKeeper.

    $ docker run -d --restart always --name supervisor --link some-zookeeper:zookeeper --link some-nimbus:nimbus arm64v8/storm storm supervisor
    
  4. Now you can submit a topology to the cluster.

    $ docker run --link some-nimbus:nimbus -it --rm -v $(pwd)/topology.jar:/topology.jar arm64v8/storm storm jar /topology.jar org.apache.storm.starter.WordCountTopology topology
    
  5. Optionally, you can start the Storm UI.

    $ docker run -d -p 8080:8080 --restart always --name ui --link some-nimbus:nimbus arm64v8/storm storm ui
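
Once the steps above have run, it can help to confirm that the containers are up and that Nimbus knows about the submitted topology. The following is a sketch under the assumption that the container names from the steps above were used unchanged; the function names are hypothetical. It relies on docker inspect and on the Storm CLI subcommands list and kill.

```shell
# Hypothetical helpers for the containers started above; the container
# names (some-zookeeper, some-nimbus, supervisor, ui) come from those commands.
storm_cluster_status() {
  for name in some-zookeeper some-nimbus supervisor ui; do
    # Prints "true" or "false"; a container that does not exist
    # (for example the optional ui) is reported as "missing".
    state="$(docker inspect -f '{{.State.Running}}' "$name" 2>/dev/null)" || state="missing"
    echo "$name: $state"
  done
}

# List the topologies known to Nimbus by running the Storm CLI in a
# throwaway container linked to it.
storm_list_topologies() {
  docker run --rm --link some-nimbus:nimbus arm64v8/storm storm list
}

# Kill a topology by name (the name given in step 4 is "topology").
storm_kill_topology() {
  docker run --rm --link some-nimbus:nimbus arm64v8/storm storm kill "$1"
}

# Usage:
# storm_cluster_status
# storm_list_topologies
# storm_kill_topology topology
```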
    

... via docker-compose or docker stack deploy

Example docker-compose.yml for storm:

version: '3.1'

services:
  zookeeper:
    image: zookeeper
    container_name: zookeeper
    restart: always

  nimbus:
    image: storm
    container_name: nimbus
    command: storm nimbus
    depends_on:
      - zookeeper
    links:
      - zookeeper
    restart: always
    ports:
      - 6627:6627

  supervisor:
    image: storm
    container_name: supervisor
    command: storm supervisor
    depends_on:
      - nimbus
      - zookeeper
    links:
      - nimbus
      - zookeeper
    restart: always

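Save the example above as stack.yml, run docker stack deploy -c stack.yml storm (or docker compose -f stack.yml up), and wait for it to initialize completely. Nimbus will then be reachable at swarm-ip:6627, localhost:6627, or host-ip:6627 (as appropriate). Port 6627 is the Nimbus Thrift port used by Storm clients, not an HTTP endpoint, so it cannot be opened in a browser.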
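
"Wait for it to initialize" can be approximated by polling the published port. Below is a minimal sketch, not an official readiness check: the helper name wait_for_port and its defaults are hypothetical, and it assumes bash is installed because it uses bash's /dev/tcp pseudo-device. An open port only shows that something is listening, not that Nimbus has finished starting.

```shell
# Hypothetical helper: poll until a TCP port accepts connections.
# Arguments: host, port, number of attempts (default 30), delay in seconds (default 2).
wait_for_port() {
  host="$1"; port="$2"; attempts="${3:-30}"; delay="${4:-2}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT.
    if bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for $host:$port" >&2
  return 1
}

# Usage, with the stack file from above:
# docker compose -f stack.yml up -d
# wait_for_port localhost 6627 && echo "Nimbus port is accepting connections"
```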

Configuration

This image uses Apache Storm's default configuration. There are two main ways to change it.

  1. Using command line arguments.

    $ docker run -d --restart always --name nimbus arm64v8/storm storm nimbus -c storm.zookeeper.servers='["zookeeper"]'
    
  2. Assuming you have storm.yaml in the current directory, you can mount it as a volume.

    $ docker run -it -v $(pwd)/storm.yaml:/conf/storm.yaml arm64v8/storm storm nimbus
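
For the second option, a storm.yaml might look like the file written by the sketch below. The helper name write_storm_yaml is hypothetical, the host names zookeeper and nimbus assume the --link aliases used earlier, and the slot ports are illustrative values. It is an assumption here that a mounted file should carry the ZooKeeper and Nimbus addresses itself.

```shell
# Hypothetical helper: write a small storm.yaml to the given path.
# "zookeeper" and "nimbus" assume the --link aliases used earlier in
# this document; the slot ports are example values.
write_storm_yaml() {
  cat > "$1" <<'EOF'
storm.zookeeper.servers:
  - "zookeeper"
nimbus.seeds:
  - "nimbus"
supervisor.slots.ports:
  - 6700
  - 6701
EOF
}

# Usage:
# write_storm_yaml "$(pwd)/storm.yaml"
# docker run -d --name some-nimbus --link some-zookeeper:zookeeper \
#   -v "$(pwd)/storm.yaml:/conf/storm.yaml" arm64v8/storm storm nimbus
```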
    

Logging

This image uses the default logging configuration, which writes all logs to the /logs directory.

Data persistence

No data is persisted by default. For convenience, the image provides /data and /logs directories owned by the storm user. Mount volumes at these paths to persist data and logs.

$ docker run -it -v /logs -v /data arm64v8/storm storm nimbus

Note that using paths other than these predefined ones is likely to cause permission-denied errors, because for security reasons Storm runs as the non-root storm user.
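
The command above creates anonymous volumes, which get generated names and are awkward to find again later. A sketch of the same idea with named volumes, keeping the predefined /data and /logs mount points; the helper name run_with_volumes and the volume naming scheme are hypothetical.

```shell
# Hypothetical helper: start a Storm daemon with named volumes for
# /data and /logs. Volume names are derived from the container name.
# Arguments: container name, storm subcommand, then any extra docker run options.
run_with_volumes() {
  name="$1"   # container name, e.g. some-nimbus
  daemon="$2" # storm subcommand, e.g. nimbus or supervisor
  shift 2
  docker volume create "${name}-data" >/dev/null
  docker volume create "${name}-logs" >/dev/null
  docker run -d --restart always --name "$name" "$@" \
    -v "${name}-data:/data" -v "${name}-logs:/logs" \
    arm64v8/storm storm "$daemon"
}

# Usage:
# run_with_volumes some-nimbus nimbus --link some-zookeeper:zookeeper
# docker volume ls   # shows some-nimbus-data and some-nimbus-logs
```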

License

Apache Storm, Storm, Apache, the Apache feather logo, and the Apache Storm project logo are trademarks of The Apache Software Foundation.

Licensed under the Apache License, Version 2.0.

See license information.

As with all Docker images, these likely also contain other software which may be under other licenses (such as Bash, etc from the base distribution, along with any direct or indirect dependencies of the primary software being contained).

Some additional license information which was able to be auto-detected might be found in the repo-info repository's storm/ directory.

As for any pre-built image usage, it is the image user's responsibility to ensure that any use of this image complies with any relevant licenses for all software contained within.

Docker Pull Command

docker pull arm64v8/storm