Short Description
Annotate text with POS tags and lemma information
Full Description


This repository contains Docker images for building and shipping ready-to-use TreeTagger instances.

You no longer have to install TreeTagger manually on your system.

Detailed info here.

What it is

A tool for annotating text with part-of-speech (POS) tags and lemma information.

Supported languages

17 languages are supported: Bulgarian, Dutch, English, Estonian, Finnish, French, Galician, German, Italian, Latin, Portuguese, Polish, Russian, Slovak, Spanish, Swahili, and Mongolian (parameter file only, no scripts).

Some of them also have alternative parameter files.



Suppose you want to (tokenize and) tag an Italian text.

The script to use is tree-tagger-italian.

It expects UTF-8 encoded input files as arguments. If no files are specified, it reads from stdin.

echo 'Proviamo semplicemente a eseguire un test di prova.' | \
          docker run --rm -i leodido/treetagger tree-tagger-italian


Proviamo         VER:pres       provare
semplicemente    ADV            semplicemente
a                PRE            a
eseguire         VER:infi       eseguire
un               DET:indef      un
test             NOM            test
di               PRE            di
prova            NOM            prova
.                SENT           .
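
The lemma column is often the part needed downstream. A minimal sketch, assuming the three columns shown above are tab-separated (as TreeTagger output normally is); the function name `lemmas` is hypothetical and not part of the image:

```shell
# Print only the lemma (third column) of each tagged token.
# Assumes tab-separated "token<TAB>tag<TAB>lemma" lines on stdin;
# lines without exactly three fields are skipped.
lemmas() {
  awk -F'\t' 'NF == 3 { print $3 }'
}

# Example (requires Docker and the leodido/treetagger image):
# echo 'Proviamo semplicemente a eseguire un test di prova.' | \
#   docker run --rm -i leodido/treetagger tree-tagger-italian | lemmas
```

Filtering on the field count keeps the helper from printing partial lines if the input contains anything other than tagged tokens.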


Now, try with some Portuguese.

echo 'Qual é o seu nome?' | \
         docker run --rm -i leodido/treetagger tree-tagger-portuguese


Qual    PT0     qual
é       VM      ser
o       DA0     o
seu     DP3     seu
nome    NCMS    nome
?       Fit     ?

The same pattern works for the other supported languages.
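
Since the script name follows the language name in the two examples above, the invocation can be wrapped in a small helper. This is only a sketch under that assumption: `tt_tag` and the `DOCKER` override variable are hypothetical names, and a language whose script does not follow the `tree-tagger-<language>` pattern will not work with it.

```shell
# Run the tree-tagger-<language> script from the image, reading text on stdin.
# DOCKER may be set to another compatible CLI; it defaults to docker.
tt_tag() {
  lang=$1
  shift
  "${DOCKER:-docker}" run --rm -i leodido/treetagger "tree-tagger-$lang" "$@"
}

# Example (requires Docker and the leodido/treetagger image):
# echo 'Qual é o seu nome?' | tt_tag portuguese
```

Any extra arguments after the language are passed through to the script unchanged.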


Suppose you want to tokenize, tag and annotate a German text with nominal and verbal chunks.

echo 'Das ist ein Test.' | \
        docker run --rm -i leodido/treetagger tagger-chunker-german

Which outputs:

Das        PDS        die
ist        VAFIN      sein
ein        ART        eine
Test       NN         Test
.          $.         .
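
For a quick overview of a tagged text, the tag column can be tallied. A sketch, assuming tab-separated output; lines without exactly three fields (for example any markup lines a script may emit) are skipped. `tag_counts` is a hypothetical helper name, not something shipped with the image.

```shell
# Count how often each POS tag (second column) occurs on stdin and
# print "count tag" pairs, most frequent first (ties sorted by tag name).
tag_counts() {
  awk -F'\t' 'NF == 3 { n[$2]++ } END { for (t in n) print n[t], t }' |
    sort -k1,1nr -k2,2
}

# Example (requires Docker and the leodido/treetagger image):
# echo 'Das ist ein Test.' | \
#   docker run --rm -i leodido/treetagger tagger-chunker-german | tag_counts
```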


This image is tested, built and pushed using CircleCI.

See the repository for further information about TreeTagger, building the image manually, and running the tests.


  • Helmut Schmid, University of Stuttgart, Germany - TreeTagger.

Last update: 28/05/2015
