Short Description
IBM Watson Speech to Text service Python client, based on the Alpine image
Full Description


This project consists of a Python client that interacts with the IBM Watson Speech To Text (STT) service through its WebSockets interface. The client streams audio to the STT service and receives recognition hypotheses in real time. It can run N simultaneous recognition sessions.
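As a rough illustration of what streaming audio means here, the sketch below reads a binary stream in fixed-size chunks, the kind of pieces a client would send one at a time over a WebSocket connection. It is not taken from the client itself: the function name iter_audio_chunks and the 2048-byte default are assumptions made for the example.

```python
import io


def iter_audio_chunks(stream, chunk_size=2048):
    """Yield successive chunks of at most chunk_size bytes from a binary stream.

    Hypothetical helper: the chunk size is an assumption for this sketch,
    not a value taken from the client.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # An empty read means the end of the stream was reached.
            break
        yield chunk


if __name__ == "__main__":
    # Stand-in for an open audio file; a real client would send each
    # chunk as a binary WebSocket message instead of printing its size.
    fake_audio = io.BytesIO(b"\x00" * 5000)
    for chunk in iter_audio_chunks(fake_audio):
        print(len(chunk))
```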


This script needs several dependencies. It is advisable to install them in a separate virtual environment, because certain packages have been observed to conflict with the script's requirements; in particular, the package nose conflicts with them. To interact with the STT service via WebSockets, install pip, then run the following command:

pip install -r requirements.txt

You may also need to run this command:

$ apt-get install build-essential python-dev

If you are creating an environment with Anaconda, use the pip command above to install the packages. Do not use conda to install the requirements, because conda will install nose as a dependency.


The example below will run the default 10 WAV files through the WebSockets interface of the Speech To Text (STT) service and will dump the recognition hypotheses to a file under the "./output" directory.

$ python ./ -credentials <username>:<password> -model en-US_BroadbandModel

The example below performs the same task much faster by opening 10 simultaneous recognition sessions (WebSocket connections) against the STT service.

$ python ./ -credentials <username>:<password> -model en-US_BroadbandModel -threads 10
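The speedup comes from overlapping the waiting time of several sessions. A minimal sketch of that pattern, assuming a placeholder recognize function in place of a real WebSocket session (the names recognize and run_sessions are hypothetical and do not come from the client):

```python
from concurrent.futures import ThreadPoolExecutor


def recognize(path):
    """Hypothetical stand-in for one recognition session.

    A real session would open a WebSocket connection, stream the audio
    file, and collect the hypotheses returned by the service.
    """
    return f"hypothesis for {path}"


def run_sessions(paths, num_threads=1, recognize_fn=recognize):
    """Run recognize_fn over paths with up to num_threads sessions at once.

    Results are returned in the same order as the input paths.
    """
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(recognize_fn, paths))


if __name__ == "__main__":
    files = [f"recordings/{i:04d}.wav" for i in range(1, 11)]
    for hypothesis in run_sessions(files, num_threads=10):
        print(hypothesis)
```

Threads suit this workload because each session spends most of its time waiting on the network rather than computing.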


To see the list of available options, type:

$ python -h


This script was created by Daniel Bolanos to facilitate and promote the use of the IBM Watson Speech To Text service.


This repository includes a Dockerfile for building a Docker image with the client and its dependencies.

$ docker run --rm -it asssaf/watson-sttclient

You can bind mount a host volume with the recordings into the container using something like:

$ docker run --rm -it -v <recordings-dir>:/work asssaf/watson-sttclient -credentials <your-api-key> -in /work/recordings.txt -out /work/output -type audio/flac

This will mount <recordings-dir> from the host as /work and generate the output in <recordings-dir>/output/.
In this case it will look for <recordings-dir>/recordings.txt with content such as:
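Assuming the list file holds one audio file path per line, written as the container sees it (under /work rather than the host directory), a file of that shape could be generated with a short script. Both the one-path-per-line format and the helper name write_recordings_list are assumptions made for this sketch, not details confirmed by the client.

```python
from pathlib import Path


def write_recordings_list(recordings_dir, container_dir="/work",
                          pattern="*.flac", list_name="recordings.txt"):
    """Write a list file naming every matching recording in recordings_dir.

    Assumption: the client expects one path per line. Paths are rewritten
    to the container-side mount point, since that is where the client runs.
    """
    host_dir = Path(recordings_dir)
    names = sorted(p.name for p in host_dir.glob(pattern) if p.is_file())
    list_path = host_dir / list_name
    list_path.write_text("".join(f"{container_dir}/{name}\n" for name in names))
    return list_path


if __name__ == "__main__":
    # Lists the FLAC files in the current directory.
    print(write_recordings_list("."))
```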
