
Last pushed: 4 months ago
Short Description
Origins batch capability
Full Description

Origins Batch

The batch container is a self-contained command-line tool. Launching the container requires a valid user token, obtained from an administrator and passed with -auth-token.
Run the following command to print the list of available options:

docker run --rm -it originsinfo/batch
Usage of batch.linux:
  -auth-token string
        origins token
  -data-directory string
        directory path to master reference data files (default "/originsdata")
  -data-extension string
        path to the extension file to use (default "/originsdata/extension.yaml")
  -data-geocoding string
        path to the geocoding file to use (default "/originsdata/geocoding.geoextension")
  -input-contextidentifier int
        zero based index of the context identifier column (default -1)
  -input-delimiter string
        column delimiter e.g (, or ;) (default ",")
  -input-file string
        path to the input file to process (default "/originsdata/sampleFile1.csv")
  -input-firstname int
        zero based index of the firstname column
  -input-identifier int
        zero based index of the identifier column (default 2)
  -input-lastname int
        zero based index of the lastname column (default 1)
  -output-filepath string
        destination file path for the results (default "/originsdata/sampleFile1.output.csv")
  -processing-contextprovider string
        context provider to use for this run
  -processing-coreonly int
        use the core database only
  -processing-geocoding int
        enable geocoding
  -processing-removeaccents int
        remove accents
  -processing-twopartnames int
        detect two part names
  -processing-view string
        processing view code (default "AAA")
  -processing-weight float
        the weight to use for this run (default 1.4)
  -stats-directory string
        directory path for statistics (default "/originsdata")

The container includes a default directory, "/originsdata", which holds the default data, configuration, and extension files.
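
The -input-* options listed above take zero-based column indices. As a rough sketch, the helper below looks up such an index from a header row. The function name column_index is hypothetical, and it assumes the input file starts with a header line naming its columns, which this page does not state. Check how the tool treats a header row before relying on it.

```shell
#!/bin/sh
# Hypothetical helper, not part of the Origins tooling.
# Prints the zero-based index of a named column, read from the first line
# of a delimited file. Prints nothing if the column is not found.
column_index() {
    file=$1          # path to the delimited file
    name=$2          # column name to look for in the header row
    delim=${3:-,}    # single-character delimiter, "," if not given

    head -n 1 "$file" | awk -F"$delim" -v name="$name" '{
        for (i = 1; i <= NF; i++) {
            if ($i == name) { print i - 1; exit }
        }
    }'
}

# Possible use with a semicolon-delimited file (file and column names are examples):
#   first=$(column_index inputfile.csv firstname ';')
#   last=$(column_index inputfile.csv lastname ';')
#   docker run --rm -it -v "$PWD":/data originsinfo/batch \
#       --input-delimiter ';' --input-firstname "$first" --input-lastname "$last" \
#       --input-file /data/inputfile.csv --output-filepath /data/outputfile.csv --auth-token XXX
```

Only flags shown in the usage listing are used in the commented command.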

Basic example - process a file in a folder

When running the container, you are expected to provide both an input file and an output file. Assuming an existing folder with an input file called inputfile.csv, create an empty output file (outputfile.csv) and map the folder into the batch container.

  • Assuming the current directory structure

    ls -alh
    total 8.0K
    drwxr-xr-x. 2 core core   60 May 25 21:18 .
    drwxr-xr-x. 4 core core  160 May 25 20:56 ..
    -rw-r--r--. 1 core core 6.4K May 25 20:44 inputfile.csv
    
  • Create an empty output file

    touch outputfile.csv
    
  • Launch the container and process the file

    docker run --rm -it -v $PWD:/data originsinfo/batch --input-file /data/inputfile.csv --output-filepath /data/outputfile.csv --auth-token XXX
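
The three steps above can be combined in a small wrapper. The following is only a sketch: the function name prepare_batch and the AUTH_TOKEN environment variable are assumptions introduced for illustration, and the function prints the docker command instead of running it so that it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper, not part of the Origins tooling.
# Checks the input file, creates the empty output file, and prints the
# docker command from the basic example.
prepare_batch() {
    dir=$1       # host folder that holds the input file
    input=$2     # input file name inside that folder
    output=$3    # output file name to create inside that folder

    if [ ! -f "$dir/$input" ]; then
        echo "missing input file: $dir/$input" >&2
        return 1
    fi

    # Create the empty output file, as described in the steps above.
    touch "$dir/$output" || return 1

    # AUTH_TOKEN is an assumed variable; XXX mirrors the placeholder used above.
    printf 'docker run --rm -it -v %s:/data originsinfo/batch --input-file /data/%s --output-filepath /data/%s --auth-token %s\n' \
        "$dir" "$input" "$output" "${AUTH_TOKEN:-XXX}"
}

# Possible use from the folder that contains inputfile.csv:
#   prepare_batch "$PWD" inputfile.csv outputfile.csv
```

Paths containing spaces would need extra quoting before the printed command is pasted into a shell.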
    

Override Extension - customized processing

If you have a custom extension file you wish to use, it must be mapped into the batch container and its path passed with -data-extension. Simply copy your extension file into the same folder as the input file.

  • Assuming the current directory structure

    ls -alh
    total 684K
    drwxr-xr-x. 2 core core  100 May 25 21:24 .
    drwxr-xr-x. 4 core core  140 May 25 21:24 ..
    -rw-r--r--. 1 core core 6.4K May 25 20:44 inputfile.csv
    -rw-r--r--. 1 core core 673K May 25 20:43 extension.yaml
    -rw-r--r--. 1 core core    0 May 25 21:24 outputfile.csv
    
  • Launch the container, process the file, and specify the custom extension file

    docker run --rm -it -v $PWD:/data originsinfo/batch -data-extension /data/extension.yaml --input-file /data/inputfile.csv --output-filepath /data/outputfile.csv --auth-token XXX
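
When the same script is used for folders with and without a custom extension, the option can be added conditionally. This is a minimal sketch with a hypothetical function name, extension_args. It assumes the custom file is called extension.yaml and that the folder is mapped to /data, as in the example above. When no file is present it prints nothing, so the container falls back to its default, /originsdata/extension.yaml.

```shell
#!/bin/sh
# Hypothetical helper, not part of the Origins tooling.
# Prints the -data-extension option only when the mapped folder
# contains an extension.yaml file.
extension_args() {
    dir=$1    # host folder that will be mapped to /data

    if [ -f "$dir/extension.yaml" ]; then
        printf '%s %s\n' '-data-extension' '/data/extension.yaml'
    fi
}

# Possible use (the unquoted substitution is intentional so that the
# option and its value become two separate arguments):
#   docker run --rm -it -v "$PWD":/data originsinfo/batch $(extension_args "$PWD") \
#       --input-file /data/inputfile.csv --output-filepath /data/outputfile.csv --auth-token XXX
```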
    
Owner
originsinfo