onerahmet/openai-whisper-asr-webservice
https://github.com/ahmetoner/whisper-asr-webservice
Whisper ASR Box is a general-purpose speech recognition toolkit. Whisper models are trained on a large dataset of diverse audio and are multitask models that can perform multilingual speech recognition, speech translation, and language identification.
The current release (v1.8.2) supports the following Whisper engines: openai_whisper, faster_whisper, and whisperx.
CPU
docker run -d -p 9000:9000 \
-e ASR_MODEL=base \
-e ASR_ENGINE=openai_whisper \
onerahmet/openai-whisper-asr-webservice:latest
GPU
docker run -d --gpus all -p 9000:9000 \
-e ASR_MODEL=base \
-e ASR_ENGINE=openai_whisper \
onerahmet/openai-whisper-asr-webservice:latest-gpu
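The container can take a while to answer on first start, for example while a model is still being downloaded. A minimal sketch of a readiness check, assuming curl is installed and that the root URL on port 9000 responds once the service is up. The helper name wait_for_asr, its arguments, and the ASR_URL variable are hypothetical and not part of the project:

```shell
#!/bin/sh
# wait_for_asr: poll the service until it answers or the attempts run out.
# Hypothetical helper, not part of whisper-asr-webservice.
#   $1 = maximum number of attempts (default 30)
#   $2 = seconds to sleep between attempts (default 2)
wait_for_asr() {
  url="${ASR_URL:-http://localhost:9000}"
  attempts="${1:-30}"
  delay="${2:-2}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -s silences progress output, -f treats HTTP status >= 400 as a failure
    if curl -sf -o /dev/null --max-time 5 "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  # No successful response within the allowed attempts
  return 1
}
```

For example, `wait_for_asr 60 5 && echo "service is up"` would wait up to roughly five minutes before giving up.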
Cache
To reduce container startup time by avoiding repeated downloads, you can persist the cache directory:
docker run -d -p 9000:9000 \
-v $PWD/cache:/root/.cache/ \
onerahmet/openai-whisper-asr-webservice:latest
Key configuration options:
ASR_ENGINE: Engine selection (openai_whisper, faster_whisper, whisperx)
ASR_MODEL: Model selection (tiny, base, small, medium, large-v3, etc.)
ASR_MODEL_PATH: Custom path to store/load models
ASR_DEVICE: Device selection (cuda, cpu)
MODEL_IDLE_TIMEOUT: Timeout for model unloading
For complete documentation, visit: https://ahmetoner.github.io/whisper-asr-webservice
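Several of these options can be combined in one docker run call. The sketch below wraps that call in a hypothetical start_asr function. The default values are illustrative only, the DRY_RUN switch is an addition for previewing the command, and treating MODEL_IDLE_TIMEOUT as a number of seconds is an assumption to check against the documentation. ASR_MODEL_PATH is left out because it also needs a matching volume mount:

```shell
#!/bin/sh
# start_asr: run the CPU image with a set of configuration options.
# Hypothetical wrapper; the defaults are illustrative, not recommendations.
start_asr() {
  engine="${ASR_ENGINE:-faster_whisper}"
  model="${ASR_MODEL:-small}"
  device="${ASR_DEVICE:-cpu}"
  # Assumption: the timeout is given in seconds
  idle_timeout="${MODEL_IDLE_TIMEOUT:-300}"

  # With DRY_RUN=1 the command is printed instead of executed
  if [ "${DRY_RUN:-0}" = "1" ]; then runner="echo"; else runner=""; fi

  $runner docker run -d -p 9000:9000 \
    -e ASR_ENGINE="$engine" \
    -e ASR_MODEL="$model" \
    -e ASR_DEVICE="$device" \
    -e MODEL_IDLE_TIMEOUT="$idle_timeout" \
    onerahmet/openai-whisper-asr-webservice:latest
}
```

Calling `DRY_RUN=1 ASR_MODEL=tiny start_asr` prints the resulting command without starting a container, which makes it easy to review the settings first.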
# Install poetry
pip3 install poetry
# Install dependencies
poetry install
# Run service
poetry run whisper-asr-webservice --host 0.0.0.0 --port 9000
After starting the service, visit http://localhost:9000 or http://0.0.0.0:9000 in your browser to access the Swagger UI documentation and try out the API endpoints.
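The same endpoints can be called from the command line. The sketch below is a rough example under stated assumptions: that the service exposes a POST /asr endpoint, that it accepts the query parameters task, language, and output, and that the audio is uploaded as the multipart form field audio_file. None of these names appear on this page, so confirm them in the Swagger UI before relying on them. The functions asr_url and transcribe and the ASR_URL variable are hypothetical helpers:

```shell
#!/bin/sh
# asr_url: build the request URL for a transcription call.
# Assumptions (verify in the Swagger UI): the endpoint is /asr and it
# accepts the query parameters task, language and output.
#   $1 = language code (default en)
#   $2 = output format (default json)
asr_url() {
  base="${ASR_URL:-http://localhost:9000}"
  language="${1:-en}"
  output="${2:-json}"
  printf '%s/asr?task=transcribe&language=%s&output=%s\n' "$base" "$language" "$output"
}

# transcribe: upload an audio file and print the response body.
# Assumption: the file is sent as the multipart form field audio_file.
#   $1 = path to the audio file, $2 = language, $3 = output format
transcribe() {
  file="$1"
  # -f makes curl exit non-zero on HTTP status >= 400
  curl -sf -X POST "$(asr_url "${2:-en}" "${3:-json}")" \
    -F "audio_file=@${file}"
}
```

For example, `transcribe sample.mp3 en json` would upload a local file named sample.mp3 (a placeholder) and print whatever the service returns.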
docker pull onerahmet/openai-whisper-asr-webservice