ai/chat-demo-model
Runtime environment for AI models deployed with Ollama, based on ollama/ollama:0.4.0-rc8
This Docker image is based on ollama/ollama:0.4.0-rc8 and serves as a runtime environment for AI models deployed with Ollama. The image supports real-time streaming chat responses and is pre-configured with tools for API health checks and model management.
This image is a component of the full AI Chat Application Demo (ai/chat-demo). More information about how to run the whole demo can be found on the ai/chat-demo image page.
MODEL: Specify the model to load, e.g., mistral:latest.

Run the Container
Use MODEL to specify the AI model (default is mistral:latest):
docker run -e MODEL=mistral:latest -p 11434:11434 ai/chat-demo-model:latest
The container automatically pulls and loads the specified model if it’s not already available.
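Once the container is up, you can confirm that the server is responding and that the model has been loaded by querying Ollama's /api/tags endpoint, which lists locally available models. A minimal check (the host and port assume the docker run command above):

import httpx

# GET /api/tags lists the models currently available in the running container.
response = httpx.get("http://localhost:11434/api/tags")
response.raise_for_status()
models = [m["name"] for m in response.json()["models"]]
print(models)  # e.g., ['mistral:latest'] once the pull has finished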
Health Check and Model Management
The container includes a health check that waits for the Ollama server to start and verifies that the specified model is present. If the model isn't found, it is pulled automatically.
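The startup script itself is internal to the image, but the wait-then-pull behavior it describes can be sketched against Ollama's public API roughly like this (the retry count and sleep interval here are illustrative assumptions, not the image's actual values):

import time
import httpx

MODEL = "mistral:latest"  # in the image this comes from the MODEL environment variable
HOST = "http://localhost:11434"

# Wait until the Ollama server answers, then pull the model if it is missing.
for _ in range(30):
    try:
        tags = httpx.get(f"{HOST}/api/tags").json()
        break
    except httpx.TransportError:
        time.sleep(2)  # server not up yet; retry
else:
    raise RuntimeError("Ollama server did not start in time")

if MODEL not in [m["name"] for m in tags["models"]]:
    # POST /api/pull downloads the model; this can take a while on first run.
    httpx.post(f"{HOST}/api/pull", json={"name": MODEL, "stream": False}, timeout=None)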
The image is designed to integrate with applications that communicate with Ollama’s API for real-time AI interactions.
Endpoint: /chat/stream
Description: Streams chat responses in real time. Each message is concise and focused on coding and technical topics. The example below consumes such a stream by calling Ollama's /api/chat endpoint directly:
import asyncio
import httpx

async def chat_stream():
    model_host = "http://localhost:11434"
    request_data = {
        "model": "mistral",
        "messages": [{"role": "user", "content": "How can I create a Dockerfile for a FastAPI app?"}],
        "stream": True,
        "options": {"temperature": 0.7},  # sampling parameters belong under "options" in Ollama's API
    }
    async with httpx.AsyncClient(timeout=None) as client:
        async with client.stream("POST", f"{model_host}/api/chat", json=request_data) as response:
            async for line in response.aiter_lines():
                print(line)  # each non-empty line is one JSON chunk of the streaming response

asyncio.run(chat_stream())
Note: The Ollama model is optimized for code and technical responses, with concise answers and practical examples.
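Each streamed line is a standalone JSON object carrying an incremental message fragment and a done flag. A variant of the example above that decodes the chunks and reassembles the full reply (the function name and prompt are illustrative):

import asyncio
import json
import httpx

async def chat_text(prompt: str) -> str:
    # Same request as above, but the streamed chunks are decoded and joined.
    request_data = {
        "model": "mistral",
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
    }
    reply = ""
    async with httpx.AsyncClient(timeout=None) as client:
        async with client.stream("POST", "http://localhost:11434/api/chat", json=request_data) as response:
            async for line in response.aiter_lines():
                if not line.strip():
                    continue  # skip blank keep-alive lines
                chunk = json.loads(line)
                reply += chunk.get("message", {}).get("content", "")
                if chunk.get("done"):  # the final chunk carries stats instead of text
                    break
    return reply

print(asyncio.run(chat_text("How can I create a Dockerfile for a FastAPI app?")))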
Environment Variables
MODEL: Model name to use (default is mistral:latest), e.g., MODEL=llama3:latest.

Troubleshooting
Model not loading: Ensure the MODEL environment variable is set correctly. The startup script will pull the model if it's unavailable.
Server unreachable: Verify that the Ollama API is accessible at http://localhost:11434 (a quick check is sketched below).
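A reachability check against the server's version endpoint (assuming the default port mapping) can separate connection problems from model problems, since it answers even before any model has been pulled:

import httpx

# GET /api/version responds as soon as the Ollama server is up.
print(httpx.get("http://localhost:11434/api/version").json())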