cassandraDB

A Distributed Database Template using Cassandra & Titan on Alpine inside Docker
by Collective Acuity

Features

  • Cassandra in a Container
  • Local Credential Controls
  • Lean Footprint
  • Configuration through cassandra.yaml
  • Auto-Configuration of IP Addresses from OS
  • Maps Data and Logs to System Host
  • EC2 Ready for Deployment
  • Python 2.7 Enabled
  • Python 3.5+ Enabled
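As a rough illustration of the IP auto-configuration feature, a container entrypoint can discover the host-facing IP address before substituting it into cassandra.yaml. This sketch is an assumption about the approach, not the repo's actual startup code:

```python
# Sketch of the IP auto-configuration step (assumed approach, not the
# repo's actual code): discover the primary outbound IP of the host.
import socket

def get_host_ip():
    """Return the primary outbound IP of this host, or a loopback fallback."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # No packets are sent; connect() on a UDP socket only selects a route.
        s.connect(('8.8.8.8', 80))
        return s.getsockname()[0]
    except OSError:
        # Isolated environments with no route fall back to loopback.
        return '127.0.0.1'
    finally:
        s.close()

print(get_host_ip())
```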

Setup DevEnv

  1. Install Docker on Local Device
  2. Install Git on Local Device
  3. Clone/Fork Repository
  4. Install pocketlab: pip install pocketlab
  5. Run lab init in root folder
  6. Update Placeholder Credentials in /cred folder

Launch Commands

Test with Docker locally:

lab start
python connect.py
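connect.py ships with the repo. As a minimal stand-in, the sketch below (an assumption, not the repo's script) simply checks that Cassandra's default CQL port 9042 is reachable; a real client would use the cassandra-driver package instead:

```python
# Minimal reachability check against Cassandra's default CQL port
# (an illustrative sketch, not the repo's connect.py).
import socket

def cql_port_open(host, port=9042, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

print(cql_port_open('127.0.0.1'))
```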

Deploy to EC2:

lab connect ec2
sudo yum install -y python36 
sudo yum install -y python36-virtualenv
sudo yum install -y python36-pip
python3 -m venv lab
source lab/bin/activate
pip install labpack 
pip install pocketlab
git clone https://bitbucket.org/collectiveacuity/cassandradb
cd cassandradb
lab init
docker pull collectiveacuity/cassandradb
lab start

Rebuild Image:

sh rebuild.sh

Authentication & Authorization

Authentication and authorization settings in cassandra.yaml have been adjusted to require authentication. To log in to a fresh Cassandra container, use the default credentials:

docker exec -it cass sh
cqlsh -u cassandra -p cassandra

Once logged in, choose a new admin username and password and disable the default cassandra user. If access for non-admin users is needed, create those users and revoke their permission to create/alter roles (and to perform certain other database functions). Information on how to create/alter users and grant/revoke permissions can be found in the Apache Cassandra documentation.

Change Authentication:

docker exec -it cass sh
cqlsh -u cassandra -p cassandra
CREATE ROLE dba WITH SUPERUSER = true AND LOGIN = true AND PASSWORD = 'mysecretpassword';
exit
cqlsh -u dba -p mysecretpassword
ALTER ROLE cassandra WITH SUPERUSER = false AND LOGIN = false;
exit
exit
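For the non-admin users mentioned above, the CQL involved looks roughly like this (role name, keyspace and password are placeholders; see the Apache Cassandra documentation for the full permission model):

```sql
-- create a login-only role with no superuser rights
CREATE ROLE app_user WITH LOGIN = true AND PASSWORD = 'app_password';
-- grant only the data access the application needs
GRANT SELECT ON KEYSPACE my_keyspace TO app_user;
GRANT MODIFY ON KEYSPACE my_keyspace TO app_user;
-- roles without SUPERUSER cannot create or alter other roles by default
```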

SSL Encryption

To connect to Cassandra securely from outside a VPC, set client_encryption_options.enabled = true in cassandra.yaml and create an SSL keystore and a root SSL certificate to manage the encrypted communication. Instructions for creating keys in the shell can be found in SSL.md, or you can use the following Python script to generate a fresh set of keys.

python generate.py 123.456.789.0
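generate.py ships with the repo; the sketch below only illustrates the kind of openssl/keytool sequence such a script typically wraps (an assumption based on standard Cassandra SSL setup, not the repo's actual code; paths, aliases and passwords are placeholders):

```python
# Build (but do not run) the typical command sequence for creating a root
# certificate and a node keystore signed by it. Illustrative only.
def ssl_commands(node_ip, days=365, storepass='cassandra'):
    """Return the typical openssl/keytool command lines for one node."""
    return [
        # 1. self-signed root certificate (the root.crt clients must trust)
        'openssl req -new -x509 -nodes -days {d} '
        '-keyout keys/root.key -out keys/root.crt'.format(d=days),
        # 2. node keypair inside a Java keystore
        'keytool -genkeypair -keyalg RSA -alias {ip} -validity {d} '
        '-keystore keys/node.keystore -storepass {p}'.format(
            ip=node_ip, d=days, p=storepass),
        # 3. signing request for the node certificate
        'keytool -certreq -alias {ip} -file keys/node.csr '
        '-keystore keys/node.keystore -storepass {p}'.format(
            ip=node_ip, p=storepass),
        # 4. sign the node certificate with the root key
        'openssl x509 -req -CA keys/root.crt -CAkey keys/root.key '
        '-in keys/node.csr -out keys/node.crt -days {d} '
        '-CAcreateserial'.format(d=days),
        # 5. import root and signed node cert back into the keystore
        'keytool -importcert -alias root -file keys/root.crt '
        '-keystore keys/node.keystore -storepass {p} -noprompt'.format(
            p=storepass),
        'keytool -importcert -alias {ip} -file keys/node.crt '
        '-keystore keys/node.keystore -storepass {p} -noprompt'.format(
            ip=node_ip, p=storepass),
    ]

for cmd in ssl_commands('123.456.789.0'):
    print(cmd)
```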

Once the keys are generated, any remote client connecting to Cassandra must reference root.crt in its connection options:

import ssl
from cassandra.cluster import Cluster
cassandra_cluster = Cluster(
    contact_points=['123.456.789.0'],
    ssl_options={
        'ca_certs': 'keys/root.crt',
        'cert_reqs': ssl.CERT_REQUIRED,
        'ssl_version': ssl.PROTOCOL_TLSv1
    }
)

SSL for Multiple Nodes:
To secure communication between multiple Cassandra nodes with SSL, each node needs its own certificate, and each certificate should be signed by the same root certificate. Each certificate must also be added to a truststore file which the nodes use to verify the identity of every other node.

python generate.py 123.456.789.1

PLEASE NOTE: cqlsh (and hence cassandra-driver) is not set up to verify certificates using a keystore or certificate chain. Setting require_client_auth = true will therefore break any client connecting with cqlsh (or cassandra-driver).

Components

  • Alpine Latest (OS)
  • Java Runtime Environment 1.8.0_66 (Environment)
  • Cassandra 3.0.15 (Database)
  • Titan (Graph Database) TODO
  • Gremlin (Titan Interpreter) TODO

Dev Env

  • Docker (Provisioning)
  • BitBucket (Version Control)
  • Cassandra (Record Database)
  • PyCharm (IDE)
  • Dropbox (Collaboration, Backup)

Languages

  • Python 2.7
  • Python 3.5+
  • Regex
  • Shell Script

Python Client Frameworks

Collaboration Notes

The Git and Docker repos contain all the configuration information required for collaboration except access tokens. To synchronize access tokens across multiple devices, platforms and users without losing local control, you can use LastPass, an encrypted email platform such as ProtonMail, or smoke signals. If you use any AWS services, use AWS IAM to assign user permissions and create keys for each collaborator individually. Collaborators who wish to test code on their localhost must install all service dependencies on their local device. A collaborator should always FORK the repo from the main master and fetch changes from the upstream repo, so that the source of truth is controlled by one admin responsible for approving all changes. New dependencies should be added to the Dockerfile, NOT to the repo files. Collaborators should test changes to the Dockerfile locally before making a pull request to merge any new dependencies:

docker build -t test-image .

.gitignore and .dockerignore files have already been added in key locations. To prevent unintended file proliferation through version control & provisioning, add/edit .gitignore and .dockerignore to include all new:

  1. local environments folders
  2. localhost dependencies
  3. configuration files with credentials and local variables
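For example, typical additions look like this (folder and file names are illustrative):

```
# local environment folders
lab/
venv/
# configuration files with credentials and local variables
*.env
```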

Cassandra Documentation
