aries-activity-ebth-bid-values-spark

EBTH activity to calculate incremental bid value per impression

Pipeline: postgres-source (redshift) -> ebth-bid-values-spark -> redshift-sink

Example Config

{ 
    "_id" : "58bf0657cc8e4g2b4ge28h9k", 
    "accountId" : "L5u7DXQsYp8W3vDHH", 
    "_airflow" : true, 
    "schedule" : "@hourly", 
    "name" : "ebth_spark_bid_values", 
    "activityList" : [
        {
            "name" : "astronomerio/postgres-source", 
            "connection" : ObjectId("58b07b5a14c91f418bbac99b")
        }, 
        {
            "name" : "astronomerio/ebth-bid-values-spark", 
            "config" : {
                "awsAccessKey" : "aws_access_key", 
                "awsSecretKey" : "aws_secret_key", 
                "executionTimeout" : NumberInt(15)
            }
        }, 
        {
            "name" : "aries-activity-redshift-sink", 
            "version" : "1.0.0", 
            "config" : {
                "schema" : "stg_stream", 
                "table" : "bid_values", 
                "drop" : true, 
                "json" : true
            }, 
            "connection" : ObjectId("58b07b5a14391f41ebbac99b")
        }
    ]
}

The connection referenced by the postgres-source activity above needs one thing beyond the usual credentials: its details must include a query property, like so:

{ 
    "_id" : ObjectId("58b07b5a14c91f418bbac99b"), 
    "updatedAt" : ISODate("2017-02-24T18:28:42.982+0000"), 
    "createdAt" : ISODate("2017-02-24T18:28:42.982+0000"), 
    "name" : "astronomer-redshift-ebth-bids", 
    "code" : "redshift", 
    "details" : {
        "database" : "database", 
        "password" : "password", 
        "user" : "user", 
        "port" : NumberInt(5439), 
        "host" : "redshift_url", 
        "query" : "SELECT * FROM stg_stream.bid_values bv INNER JOIN ( SELECT id AS id_ref, min(processed_at) AS min_processed_at, max(processed_at) AS max_processed_at FROM stg_stream.bid_values GROUP BY 1 ) sbv ON bv.id = sbv.id_ref AND bv.processed_at = sbv.max_processed_at AND DATEDIFF(day, sbv.min_processed_at, sbv.max_processed_at) <= 10 AND bv.rate != 0"
    }, 
    "__v" : NumberInt(0)
}
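
The selection logic of the query above can be easier to follow outside SQL. The Python sketch below mirrors it on in-memory rows: for each id it keeps the row(s) with the latest processed_at, provided the first and last processed_at for that id are at most 10 days apart and the rate is nonzero. The function name and sample rows are hypothetical, and the day difference assumes that Redshift's DATEDIFF(day, ...) counts calendar-day boundaries crossed. The SQL also returns the joined columns id_ref, min_processed_at, and max_processed_at, which this sketch omits.

```python
from datetime import datetime


def select_latest_bid_values(rows, max_span_days=10):
    """Mirror the connection query on a list of dict rows.

    Each row needs "id", "processed_at" (datetime), and "rate".
    """
    # Equivalent of the GROUP BY subquery: min/max processed_at per id.
    bounds = {}
    for row in rows:
        ts = row["processed_at"]
        first_seen, last_seen = bounds.get(row["id"], (ts, ts))
        bounds[row["id"]] = (min(first_seen, ts), max(last_seen, ts))

    selected = []
    for row in rows:
        first_seen, last_seen = bounds[row["id"]]
        # Assumed DATEDIFF(day, ...) semantics: calendar-day boundaries crossed.
        span_days = (last_seen.date() - first_seen.date()).days
        if (
            row["processed_at"] == last_seen      # latest row(s) for this id
            and span_days <= max_span_days        # history spans <= 10 days
            and row["rate"] is not None           # SQL `rate != 0` drops NULLs
            and row["rate"] != 0
        ):
            selected.append(row)
    return selected


# Hypothetical sample data, for illustration only.
sample_rows = [
    {"id": 1, "processed_at": datetime(2017, 2, 1, 8), "rate": 0.5},
    {"id": 1, "processed_at": datetime(2017, 2, 3, 9), "rate": 0.7},  # latest for id 1
    {"id": 2, "processed_at": datetime(2017, 1, 1, 0), "rate": 0.2},
    {"id": 2, "processed_at": datetime(2017, 2, 1, 0), "rate": 0.4},  # span > 10 days
    {"id": 3, "processed_at": datetime(2017, 2, 2, 0), "rate": 0.0},  # zero rate
]
result = select_latest_bid_values(sample_rows)
```

With these sample rows, only the second row for id 1 is kept: id 2 is dropped because its history spans more than 10 days, and id 3 because its rate is zero.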

Deployment Notes

The postgres-source activity expects the table named in the config (stg_stream.bid_values in the example above) to already exist. Before the first run, create a dummy table with that name and any values; the pipeline will replace it.
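
As a rough illustration of that setup step, the sketch below builds a CREATE TABLE statement for a placeholder table. The helper name is hypothetical; the column names are the ones the example query references (id, processed_at, rate), and the column types are assumptions, since the source does not document the real schema. The real table likely has more columns.

```python
def placeholder_table_ddl(schema="stg_stream", table="bid_values"):
    """Build a CREATE TABLE statement for a placeholder table.

    Column names come from the example query; the types are assumptions.
    """
    columns = [
        ("id", "BIGINT"),               # assumed type
        ("processed_at", "TIMESTAMP"),  # assumed type
        ("rate", "DOUBLE PRECISION"),   # assumed type
    ]
    column_sql = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return f"CREATE TABLE IF NOT EXISTS {schema}.{table} ({column_sql});"


ddl = placeholder_table_ddl()
print(ddl)
```

The printed statement can be run once against the Redshift database with any SQL client before the pipeline's first scheduled run.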

Owner: astronomerio