PDTB-Style Shallow Discourse Parser

Description

This repository contains a Docker image of the oslopots-parser, a PDTB-style shallow discourse parser that won the CoNLL-2016 Shared Task competition. A detailed description of the parsing system can be found in the official submission paper.

Download

To retrieve the image, install Docker on your machine and then run the following command:

docker pull oslopots/oslopots-conll-2016

Running the Parser

Once the download has finished, you should launch a container from the downloaded image:

docker run --name=oslopots -itd oslopots/oslopots-conll-2016

Then, get the id of the newly created container:

OSLOPOTS_ID="$(docker ps -qf name=oslopots)"

Finally, to parse your data, copy the folder containing it from your local machine into the running container, run the parsing command, and copy the output back to your computer. For example:

docker cp en.dev $OSLOPOTS_ID:/opt/en.dev

docker exec $OSLOPOTS_ID ./shell/pdtb_parser /opt/en.dev . /opt/en.out

docker cp $OSLOPOTS_ID:/opt/en.out en.out
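
The three commands above can be wrapped in a small shell function for repeated use. This is only a sketch: the function name run_oslopots is hypothetical and not part of the image, and it assumes a container started with --name=oslopots is already running and that the target paths under /opt do not yet exist inside it.

```shell
# Hypothetical helper (not shipped with the image).
# Usage: run_oslopots <input_dir> <output_dir>
run_oslopots() {
    local input_dir="$1" output_dir="$2"
    local in_name out_name container_id

    # Reuse the local folder names for the paths inside the container,
    # mirroring the en.dev / en.out example.
    in_name="$(basename "$input_dir")"
    out_name="$(basename "$output_dir")"

    # Look up the ID of the container started with --name=oslopots.
    container_id="$(docker ps -qf name=oslopots)"
    if [ -z "$container_id" ]; then
        echo "no running container named oslopots" >&2
        return 1
    fi

    # Copy the data in, run the parser, and copy the results back out.
    # Each step runs only if the previous one succeeded.
    docker cp "$input_dir" "$container_id:/opt/$in_name" &&
    docker exec "$container_id" ./shell/pdtb_parser "/opt/$in_name" . "/opt/$out_name" &&
    docker cp "$container_id:/opt/$out_name" "$output_dir"
}
```

With this definition loaded, run_oslopots en.dev en.out issues the same three docker commands as the example.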

The output folder will contain a JSON file called output.json, which holds the results of the parsing system, one relation per line:

head -2 en.out/output.json 
{"SentenceId": 2, "DocID": "wsj_2276", "Arg1": {"TokenList": [15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57]}, "Arg2": {"TokenList": [59, 60, 61]}, "Connective": {"TokenList": []}, "Sense": ["EntRel"], "Type": "Implicit", "ID": 1}
{"SentenceId": 4, "DocID": "wsj_2276", "Arg1": {"TokenList": [63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90]}, "Arg2": {"TokenList": [92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121]}, "Connective": {"TokenList": []}, "Sense": ["EntRel"], "Type": "Implicit", "ID": 2}
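
For a quick sanity check of the results, the relations can be counted by their "Type" field with standard shell tools. The sketch below is an illustration only: the function name count_relation_types is hypothetical, and the pattern assumes that every line is serialized exactly as in the sample above (a `"Type": "..."` pair with a single space after the colon). A real JSON parser would be more robust.

```shell
# Hypothetical helper: print "<Type><TAB><count>" for each relation type.
# Usage: count_relation_types <path/to/output.json>
count_relation_types() {
    # Extract every "Type": "..." pair, keep only the value,
    # then count how often each value occurs.
    grep -o '"Type": "[^"]*"' "$1" |
        cut -d'"' -f4 |
        sort |
        uniq -c |
        awk '{print $2 "\t" $1}'
}
```

For example, count_relation_types en.out/output.json prints one line per relation type found in the file, followed by the number of relations of that type.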

Please note that the input folder (the one you first copy into the container) must contain data in the CoNLL Shared Task format.

Contact

If you run into technical problems, feel free to contact the members of our team.

Owner
oslopots