Short Description
Modified version of https://hub.docker.com/r/simleo/pyfeatures/ for testing purposes only
Full Description

pydoop-features

What

Pydoop-features is a suite of tools for extracting features from image
data. It uses Bio-Formats to read image data, Avro for
(de)serialization and WND-CHARM for feature calculation.

How

The fastest way to get a working installation is to pull the
Docker image:

docker pull imagedata/pyfeatures

Java-Python interoperability is achieved via Avro. The input dataset
can be in any format supported by Bio-Formats. For instance, download
MF-2CH-Z-T and unpack it under /tmp. The first step is to serialize
this data to Avro:

docker run -u ${UID} --rm -v /tmp:/tmp imagedata/pyfeatures \
  serialize /tmp/MF-2CH-Z-T.tif -o /tmp/

You should get one Avro container file per image series in the input
dataset. In this case:

/tmp/MF-2CH-Z-T_{0,1,2,3,4}.avro
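
To check which series containers were actually written, a small Python
helper can list them in series order. This is only a sketch: it assumes
the `<stem>_<index>.avro` naming shown above, and the function name
`series_containers` is hypothetical (it is not part of pydoop-features).

```python
import re
from pathlib import Path


def series_containers(out_dir, stem):
    """Return the per-series Avro files for `stem`, ordered by series index.

    Assumes the `<stem>_<index>.avro` naming shown in the example above.
    """
    directory = Path(out_dir)
    if not directory.is_dir():
        return []
    pattern = re.compile(re.escape(stem) + r"_(\d+)\.avro")
    found = []
    for path in directory.iterdir():
        match = pattern.fullmatch(path.name)
        if match:  # other names, e.g. <stem>_0_features.avro, are skipped
            found.append((int(match.group(1)), path))
    # Sort numerically so that series 10 comes after series 2.
    found.sort(key=lambda item: item[0])
    return [path for _, path in found]


if __name__ == "__main__":
    for path in series_containers("/tmp", "MF-2CH-Z-T"):
        print(path)
```

Sorting on the parsed integer rather than on the file name keeps the
order correct for datasets with ten or more series.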

To compute features for the first Avro container:

docker run -u ${UID} --rm -v /tmp:/tmp imagedata/pyfeatures \
  calc /tmp/MF-2CH-Z-T_0.avro -o /tmp/
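
The same command can be built for every series container from Python.
The sketch below only assembles and prints the command lines; it
assumes a POSIX system (for `os.getuid()`, standing in for `${UID}`)
and the five containers listed earlier. The helper name `calc_command`
is hypothetical.

```python
import os
import shlex


def calc_command(avro_path, out_dir="/tmp/", image="imagedata/pyfeatures"):
    """Build the `docker run ... calc` argument list for one Avro container."""
    return [
        "docker", "run",
        "-u", str(os.getuid()),   # same role as -u ${UID} in the shell command
        "--rm",
        "-v", "/tmp:/tmp",        # assumes input and output both live under /tmp
        image,
        "calc", avro_path,
        "-o", out_dir,
    ]


if __name__ == "__main__":
    for index in range(5):  # series 0..4, as in the example dataset
        cmd = calc_command("/tmp/MF-2CH-Z-T_%d.avro" % index)
        print(shlex.join(cmd))
```

To actually run a command rather than print it, pass the list to
`subprocess.run(cmd, check=True)`.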

You might want to get a cup of coffee: feature calculation takes time.

When the above finishes, you should have the following file:

/tmp/MF-2CH-Z-T_0_features.avro

which can be read from either Java or Python. For instance:

>>> from avro.datafile import DataFileReader
>>> from avro.io import DatumReader
>>> # Avro container files are binary, so open them in "rb" mode
>>> with open("/tmp/MF-2CH-Z-T_0_features.avro", "rb") as f:
...     reader = DataFileReader(f, DatumReader())
...     records = list(reader)
...
>>> len(records)
40
>>> r = records[0]
>>> r['haralick_textures']
[0.0015474594757607179, 0.00029323128834782644, ...]
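
Since each record is a plain Python dict, a quick way to get an
overview is to count the elements in every list-valued field. The
following is a minimal sketch: `vector_lengths` is a hypothetical
helper, and the sample record is a truncated placeholder (the two
values come from the output above, and `some_scalar_field` is
invented for illustration), not a real feature record.

```python
def vector_lengths(record):
    """Map each list-valued field of a record to its number of elements."""
    return {
        name: len(value)
        for name, value in record.items()
        if isinstance(value, list)
    }


if __name__ == "__main__":
    # Placeholder record: real records hold longer vectors and more fields.
    sample = {
        "haralick_textures": [0.0015474594757607179, 0.00029323128834782644],
        "some_scalar_field": 0,  # hypothetical non-list field, ignored below
    }
    print(vector_lengths(sample))
```

With real data, call `vector_lengths(records[0])` on a record read
from the features file.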
Owner
manics