Base image with heavy dependencies for pacman

Pacman

Pacman is a synchronous tool that uses multiprocessing to crunch bundles of data from sources and flow them into destinations.

Ex: Pacman gets AdWords reports (Bundles) from the API (Source) and puts each line into Kafka (Destination).
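
The source / bundle / destination flow can be pictured with a minimal sketch. It is not Pacman's actual code: `fetch_bundles`, `process_bundle`, `deliver` and `run` are hypothetical names, the report lines are made up, and a plain list stands in for Kafka. It only illustrates bundles being crunched by a pool of worker processes while the caller waits for the results.

```python
import multiprocessing


def fetch_bundles():
    """Hypothetical source: yields bundles, here small lists of report lines."""
    yield ["campaign_a,10", "campaign_b,20"]
    yield ["campaign_c,30"]


def process_bundle(bundle):
    """Crunch one bundle into records (this runs in a worker process)."""
    records = []
    for line in bundle:
        name, clicks = line.split(",")
        records.append({"campaign": name, "clicks": int(clicks)})
    return records


def deliver(records, destination):
    """Hypothetical destination: a plain list stands in for a Kafka topic."""
    destination.extend(records)


def run(workers=2):
    destination = []
    # The "fork" start method is an assumption (Linux, as in the apt-based setup).
    ctx = multiprocessing.get_context("fork")
    with ctx.Pool(processes=workers) as pool:
        # map() blocks until every bundle is processed: parallel work, synchronous call.
        for records in pool.map(process_bundle, list(fetch_bundles())):
            deliver(records, destination)
    return destination


if __name__ == "__main__":
    print(run())
```

Each bundle is an independent unit of work, which is what makes a process pool a natural fit in this sketch.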

Install prerequisites

Cryptography

sudo apt-get install build-essential libssl-dev libffi-dev python-dev

Snappy

sudo apt-get install libsnappy-dev

Confluent

wget -qO - http://packages.confluent.io/deb/3.0/archive.key | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] http://packages.confluent.io/deb/3.0 stable main"
sudo apt-get update && sudo apt-get install confluent-platform-2.11

Kafka

sudo git clone https://github.com/edenhill/librdkafka /tmp/librdkafka
cd /tmp/librdkafka
sudo ./configure --prefix=/usr
sudo make
sudo make install

Mysql connectors

sudo git clone https://github.com/mysql/mysql-connector-python.git /tmp/mysql
cd /tmp/mysql
sudo python setup.py build
sudo python setup.py install

HDFS3

sudo wget -P /tmp/libprotobuf8/ http://security.ubuntu.com/ubuntu/pool/main/p/protobuf/libprotobuf8_2.5.0-9ubuntu1_amd64.deb
sudo dpkg -i /tmp/libprotobuf8/libprotobuf8_2.5.0-9ubuntu1_amd64.deb
sudo apt-get install -qq cmake libxml2 libxml2-dev uuid-dev protobuf-compiler libprotobuf-dev libkrb5-dev
sudo apt-get install libboost-all-dev -y
sudo apt-get -f install -y
sudo apt-get install -y apt-utils apt-transport-https
sudo apt-get install libgsasl7-dev gsasl -y
sudo wget -P /tmp/libhfds https://dl.bintray.com/wangzw/deb/dists/trusty/contrib/binary-amd64/libhdfs3_2.2.31-1_amd64.deb
sudo dpkg -i /tmp/libhfds/libhdfs3_2.2.31-1_amd64.deb
sudo wget -P /tmp/libhfds https://dl.bintray.com/wangzw/deb/dists/trusty/contrib/binary-amd64/libhdfs3-dev_2.2.31-1_amd64.deb
sudo dpkg -i /tmp/libhfds/libhdfs3-dev_2.2.31-1_amd64.deb

Redis

wget http://download.redis.io/redis-stable.tar.gz
tar xvzf redis-stable.tar.gz
cd redis-stable
make
sudo make install

Install requirements

pip install -r requirements/base.txt
pip install -r requirements/pacman.txt
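
Once the prerequisites are installed, a quick check that the native libraries are visible to the dynamic linker can save some debugging. The sketch below uses only the standard library. The short library names in `NATIVE_LIBS` are an assumption derived from the package names above, and `missing_native_libs` is a hypothetical helper, not part of Pacman.

```python
from ctypes.util import find_library

# Short names assumed from the packages above (libssl, libffi, libsnappy, ...).
NATIVE_LIBS = ("ssl", "ffi", "snappy", "rdkafka", "hdfs3")


def missing_native_libs(names=NATIVE_LIBS, finder=find_library):
    """Return the library names that the finder cannot locate."""
    return [name for name in names if finder(name) is None]


if __name__ == "__main__":
    missing = missing_native_libs()
    print("missing:", ", ".join(missing) if missing else "none")
```

An empty result only means the shared objects were found. It does not guarantee that their versions match what the Python packages expect.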

Getting Started

Each pacman instance uses configuration files from the conf directory.

List of readable bundles:

List of available destinations:

List of connected sources:

Feel free to add your custom bundle/destination/source.

To launch a pacman instance, run the following command:

python bin/pacman-runner.py --config-file conf.json --job-configs jobs.d/
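
To make the two flags concrete, here is a hedged sketch of how a runner could parse them and load the files. It is not the code of bin/pacman-runner.py. The flag names come from the command above, while `parse_args`, `load_configs` and the idea that jobs.d/ holds one JSON file per job are assumptions made for illustration.

```python
import argparse
import json
from pathlib import Path


def parse_args(argv=None):
    """Parse the two flags used in the launch command."""
    parser = argparse.ArgumentParser(description="Sketch of a pacman-style runner CLI")
    parser.add_argument("--config-file", required=True,
                        help="main configuration file")
    parser.add_argument("--job-configs", required=True,
                        help="directory holding one configuration file per job")
    return parser.parse_args(argv)


def load_configs(config_file, job_dir):
    """Load the main config and every *.json job file (JSON jobs are an assumption)."""
    main_config = json.loads(Path(config_file).read_text())
    jobs = {
        path.stem: json.loads(path.read_text())
        for path in sorted(Path(job_dir).glob("*.json"))
    }
    return main_config, jobs
```

With this sketch, `parse_args(["--config-file", "conf.json", "--job-configs", "jobs.d/"])` yields `args.config_file` and `args.job_configs`, which can be passed straight to `load_configs`.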

Deployment

This project uses Ansible for deployment. Pacman is a scheduled task launched by an hourly cron job.

Playbooks are in the deployment folder. Client-specific configuration is in the conf folder inside the client folder.
To launch the right playbook, a command would look like this:

ansible-playbook -i deployment/inventory -e ansible_user=ubuntu -e env=prod -e client=orange --vault-password-file ~/.vault_pass.txt deployment/pacman.yml
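
Only the environment and the client change from one deployment to the next, so the command can be assembled by a small helper. This is a sketch, not part of the repository: `playbook_command` is a hypothetical name, and it only builds the argument list from the flags shown above without running Ansible.

```python
import os
import shlex


def playbook_command(env, client, user="ubuntu", vault_file="~/.vault_pass.txt"):
    """Assemble the ansible-playbook argument list for one client and environment."""
    return [
        "ansible-playbook",
        "-i", "deployment/inventory",
        "-e", f"ansible_user={user}",
        "-e", f"env={env}",
        "-e", f"client={client}",
        # Expand "~" here because no shell will do it when argv is passed directly.
        "--vault-password-file", os.path.expanduser(vault_file),
        "deployment/pacman.yml",
    ]


if __name__ == "__main__":
    # Prints the command for the prod / orange example; it does not run Ansible.
    print(shlex.join(playbook_command("prod", "orange")))
```

The returned list can be handed to `subprocess.run(..., check=True)` from the repository root if the deployment should be triggered from Python.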

The vault password required by Ansible is the canonical Artefact R&D password.

Tests

Steps to run the tests:

1- Add the "artefact.connectors" folder to your PYTHONPATH

2- Install the requirements from the artefact.connectors repository

pip install -r /artefact.connectors/requirements.txt

3- Run the tests from the pacman folder

./test.sh

This shell script runs the Python unit tests and the EOF and PEP8 checks, then deletes the .pyc files.
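
Steps 1 and 3 can be combined so that the PYTHONPATH change applies only to the test run. The sketch below is one possible way to do it. `env_with_pythonpath` and `run_tests` are hypothetical helpers, and the location of the artefact.connectors checkout is whatever it is on your machine.

```python
import os
import subprocess


def env_with_pythonpath(extra_dir, base_env=None):
    """Return a copy of the environment with extra_dir prepended to PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    current = env.get("PYTHONPATH")
    env["PYTHONPATH"] = extra_dir if not current else extra_dir + os.pathsep + current
    return env


def run_tests(connectors_dir, pacman_dir="."):
    """Run ./test.sh from the pacman folder with the connectors folder importable."""
    return subprocess.run(
        ["./test.sh"],
        cwd=pacman_dir,
        env=env_with_pythonpath(connectors_dir),
        check=True,  # raise CalledProcessError if the script exits non-zero
    )
```

Calling `run_tests` with the path to your artefact.connectors checkout leaves the PYTHONPATH of the calling shell untouched.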

Docker

  • To build : make docker_build
  • To run daemon : make docker_start
  • To stop daemon : make docker_stop
  • To run the tests in a container : make docker_test
Owner
artefact