sepia/stt-server

By sepia

Updated over 3 years ago

Speech-To-Text (STT) Server for SEPIA Framework


Open-Source SEPIA STT-Server

Intro

SEPIA Speech-To-Text (STT) Server is a WebSocket-based, full-duplex Python server for real-time automatic speech recognition (ASR) that supports multiple open-source ASR engines. It receives a stream of audio chunks over a secure WebSocket connection and returns the transcribed text almost immediately as partial and final results.
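
On the client side, streaming starts with splitting the recorded audio into small chunks. The following is a minimal sketch of that step only, assuming 16 kHz, 16-bit mono PCM input and a hypothetical chunk length of 100 ms. None of these values come from this page, and the helper names are illustrative; check the project documentation for the audio format and message protocol the server actually expects.

```python
from typing import Iterator

SAMPLE_RATE_HZ = 16000  # assumption: 16 kHz mono input
BYTES_PER_SAMPLE = 2    # assumption: 16-bit PCM samples
CHUNK_MS = 100          # hypothetical chunk length in milliseconds


def chunk_size_bytes(sample_rate_hz: int = SAMPLE_RATE_HZ,
                     bytes_per_sample: int = BYTES_PER_SAMPLE,
                     chunk_ms: int = CHUNK_MS) -> int:
    """Number of bytes that hold `chunk_ms` milliseconds of mono PCM audio."""
    return sample_rate_hz * bytes_per_sample * chunk_ms // 1000


def iter_audio_chunks(pcm: bytes, size: int) -> Iterator[bytes]:
    """Yield consecutive slices of `pcm`; the last one may be shorter."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(pcm), size):
        yield pcm[start:start + size]


# One second of silence is split into ten chunks of 3200 bytes each.
one_second = bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)
chunks = list(iter_audio_chunks(one_second, chunk_size_bytes()))
```

Each chunk would then be sent to the server as it becomes available, so that partial results can come back while the user is still speaking. The message format itself is defined by the server and is not covered by this sketch.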

One goal of this project is to offer a standardized, secure, real-time interface for all the great open-source ASR tools out there. The server works on all major platforms including single-board devices like Raspberry Pi (4).

Currently the supported engines are Vosk and Coqui. Vosk comes with small but powerful ASR models for English and German, and for Coqui a small English model (without scorer) is included for experimentation. Official and custom ASR models for many languages can be added easily.

Language model adaptation tools are included as well, so you can start building custom domain models right away, using for example ZAMIA Speech (Kaldi ASR) models as a starting point.

For more info visit: https://github.com/SEPIA-Framework/sepia-stt-server

NOTE: This is a complete rewrite (2021) of the original STT Server (2018). If you are using ZAMIA Speech custom Kaldi models built for the 2018 version, you can easily convert them to new models. Please see: https://github.com/fquirin/kaldi-adapt-lm

Getting started

Simply pull the latest image (or choose an older one from the archive):

docker pull sepia/stt-server:latest

Supported platforms:

ARM 32Bit (Raspberry Pi 4 32Bit OS)
ARM 64Bit (RPi 4 64Bit, Jetson Nano(?))
x86 64Bit Systems (Desktop PCs, Linux server etc.)

Start the server:

sudo docker run --rm --name=sepia-stt -p 20741:20741 -it sepia/stt-server:latest

Visit the test page: http://localhost:20741
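
To confirm from a script that the container is up, one option is to request the test page and check for a successful HTTP response. This is a minimal sketch using only the Python standard library. The helper name `server_is_up` and the 3-second timeout are illustrative choices, and the default URL is simply the test page address given above.

```python
import http.client
import urllib.error
import urllib.request


def server_is_up(url: str = "http://localhost:20741",
                 timeout: float = 3.0) -> bool:
    """Return True if an HTTP GET to `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Covers refused connections, timeouts and HTTP error statuses.
        return False
```

Calling `server_is_up()` right after `docker run` may return `False` for a moment while the server is still starting, so a short retry loop around it can be useful.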

