Docker Image for Catmandu
How to use this docker image
Starting this Catmandu image is easy:
docker run -it librecat/catmandu
Now, you should be able to run this command in the docker terminal:
catmandu@d45e783d0bca:~$ catmandu help
Upgrade an existing docker image to the latest version:
docker pull librecat/catmandu
Start docker with access to your own files. Use the first command on Windows, the second on macOS and the third on Linux:
docker run -v C:\Users\yourname:/home/catmandu/Home -it librecat/catmandu
docker run -v /Users/yourname:/home/catmandu/Home -it librecat/catmandu
docker run -v /home/yourname:/home/catmandu/Home -it librecat/catmandu
Catmandu::Introduction - a Catmandu HOW TO
Catmandu is a data processing toolkit developed as part of the LibreCat project.
Catmandu provides a command line client and a suite of tools to ease the import, storage, retrieval,
export and transformation of data. For instance, to transform a JSON file into CSV use the command:
$ catmandu convert JSON to CSV < data.json
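To try this conversion without data of your own, a small test file can be created first. The sketch below is only an illustration: the two records and the scratch directory are made up, and the conversion step is skipped when no catmandu client is found on the PATH.

```shell
# Work in a scratch directory so no existing files are overwritten
workdir="$(mktemp -d)"

# Hypothetical sample data: one JSON object per line
cat > "$workdir/data.json" <<'EOF'
{"title":"Moby Dick","year":1851}
{"title":"Walden","year":1854}
EOF

# Run the conversion only when the catmandu client is installed
if command -v catmandu >/dev/null 2>&1; then
    catmandu convert JSON to CSV < "$workdir/data.json" > "$workdir/data.csv"
    cat "$workdir/data.csv"
else
    echo "catmandu not found; skipping conversion" >&2
fi
```

Writing the result to a file and printing it afterwards keeps the CSV available for a later import step.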
Or, to store a YAML file in an ElasticSearch database, type (requires Catmandu::ElasticSearch):
$ catmandu import YAML to ElasticSearch --index_name demo < test.yml
To export all the data from a Solr search engine into JSON, type (requires Catmandu::Solr):
$ catmandu export Solr --url http://localhost:8983/solr to JSON
With Catmandu you can import OAI-PMH records into your application (requires Catmandu::OAI):
$ catmandu convert OAI --url http://biblio.ugent.be/oai --set allFtxt
and export records into formats such as JSON, YAML, CSV, XLS, RDF and many more.
Catmandu also provides a small scripting language to manipulate data, extract parts of your dataset and
transform records. For instance, rename fields with the 'move_field' command:
$ catmandu convert JSON --fix 'move_field(title,my_title)' < data.json
In the example above, we renamed every 'title' field in the dataset to 'my_title'.
One can also work on deeply nested data. E.g. create a deeply nested data structure with the 'move_field' command:
$ catmandu convert JSON --fix 'move_field(title,my.deeply.nested.title)' < data.json
In this example we moved the field 'title' into the field 'my', which contains a (sub)field 'deeply',
which contains a (sub)field 'nested'.
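The nested move can be tried on a single record. A minimal sketch, assuming a hypothetical one-record input file in a scratch directory; the conversion is skipped when catmandu is not installed.

```shell
workdir="$(mktemp -d)"

# Hypothetical input record with a top-level 'title' field
printf '%s\n' '{"title":"Walden"}' > "$workdir/book.json"

# The nested move_field fix, kept in a variable for readability
fix='move_field(title,my.deeply.nested.title)'

if command -v catmandu >/dev/null 2>&1; then
    # In the output, 'title' should sit below my -> deeply -> nested
    catmandu convert JSON --fix "$fix" < "$workdir/book.json"
else
    echo "catmandu not found; skipping conversion" >&2
fi
```

Keeping the fix in a shell variable avoids quoting mistakes when the same expression is reused in several commands.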
Catmandu was created by librarians for librarians. We process a lot of metadata, especially
library metadata in formats such as MARC, MAB2 and MODS. With the following command we can extract
data from a MARC record and store it in the title field (requires Catmandu::MARC):
$ catmandu convert MARC --fix 'marc_map(245,title)' < data.mrc
Or, in case only the 245a subfield is needed write:
$ catmandu convert MARC --fix 'marc_map(245a,title)' < data.mrc
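When only the mapped field is of interest, the raw MARC data can be dropped from the output. The helper below is a hypothetical sketch, not part of Catmandu. It assumes that several fixes may be separated by semicolons on the command line and that Catmandu::MARC keeps the original data in a field named 'record'; please check both against your installed version.

```shell
# Hypothetical helper: print the 245a subfield of each record in a MARC file
marc_titles() {
    # Refuse to run when the input file cannot be read
    [ -r "$1" ] || { echo "cannot read $1" >&2; return 1; }
    # Map 245a to 'title', then drop the raw MARC data (assumed to be in 'record')
    catmandu convert MARC --fix 'marc_map(245a,title); remove_field(record)' < "$1"
}
```

It would be called as `marc_titles data.mrc`.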
When processing data, many Fix commands may be required. It wouldn't be very practical to
type them all on the command line. Complicated data transformations can instead be collected
in a Fix script that contains all the fix commands. For instance, if the file myfixes.txt contains:
marc_map(245a,title)
marc_map(100a,author.$append)
marc_map(700a,author.$append)
marc_map(020a,isbn)
replace_all(isbn,'[^0-9-]+','')
then they can be executed on a MARC file using this command:
$ catmandu convert MARC --fix myfixes.txt < data.mrc
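The same approach works for other formats and can be tried without MARC data. A minimal sketch, assuming a hypothetical JSON record and a two-line Fix script named demo.fix; it uses only the move_field and add_field fixes mentioned in this document and skips the conversion when catmandu is not installed.

```shell
workdir="$(mktemp -d)"

# Hypothetical sample record
printf '%s\n' '{"title":"Walden","author":"Thoreau"}' > "$workdir/data.json"

# A Fix script with one fix per line
cat > "$workdir/demo.fix" <<'EOF'
move_field(title,my_title)
add_field(source,demo)
EOF

if command -v catmandu >/dev/null 2>&1; then
    # Pass the path of the script to --fix instead of an inline expression
    catmandu convert JSON --fix "$workdir/demo.fix" < "$workdir/data.json"
else
    echo "catmandu not found; skipping conversion" >&2
fi
```

Keeping fixes in a file also makes it possible to put them under version control next to the data they describe.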
Fixes can also be turned into executable scripts by adding a bash 'shebang' line at the top. E.g.
to harvest records from an OAI repository write this fix file:
#!/usr/bin/env catmandu run
do importer(OAI,url:"http://lib.ugent.be/oai")
  add_to_exporter(.,JSON)
end
Run this (on Linux) by setting the executable bit:
$ chmod 755 myfix.fix
$ ./myfix.fix
To experiment with the Fix language you can also run the catmandu Fix interpreter in an
interactive mode:
$ catmandu run
Catmandu 0.95 interactive mode
Type: \h for the command history
fix > add_field(hello,world)
---
hello: world
...
fix >
Catmandu contains many powerful fixes. Visit http://librecat.org/Catmandu/#fixes-cheat-sheet to get
an overview of what is possible.
In the winter of 2014 an Advent calendar tutorial was created to provide a day-by-day
introduction to the UNIX command line and Catmandu:
If you need extra training, our developers regularly host workshops at library
conferences and events: http://librecat.org/events.html
There are several ways to get a working version of Catmandu on your computer.
For a quick demo installation, visit our blog
where a VirtualBox image is available containing all the Catmandu modules, including
ElasticSearch and MongoDB.
On our website we provide installation instructions for:
* Debian
* Ubuntu Server
* CentOS
* openSUSE
* OpenBSD
* Windows
Catmandu software published at https://github.com/LibreCat/Catmandu is free software without warranty, liabilities
or support; you can redistribute it and/or modify it under the terms of the GNU General Public License as
published by the Free Software Foundation; either version 2 or any later version. Every contributor is free
to state her/his copyright.
Developers & Support
Catmandu has a very active international developer community. We welcome all feedback, bug reports and
Join our mailing list to receive more information:
Are you a developer and want to contribute to the project? Feel free to submit pull requests or create new
Catmandu is created in cooperation with many developers worldwide. Without them this project would not be possible.
We would like to thank our core maintainer, Nicolas Steenlant, and all contributors: Christian Pietsch,
Dave Sherohman, Friedrich Summann, Jakob Voss, Johann Rolschewski, Jorgen Eriksson, Magnus Enger,
Maria Hedberg, Mathias Loesch, Najko Jahn, Nicolas Franck, Patrick Hochstenbach, Petra Kohorst,
Snorri Briem, Upasana Shukla and Vitali Peil.