jasonacox/chatbot

By jasonacox


TinyLLM Web Based Chatbot


The TinyLLM Chatbot is a web-based Python FastAPI app that lets you chat with an LLM using the OpenAI API.

The intent of this project is to build and interact with a locally hosted LLM using consumer-grade hardware. With the Chatbot, we explore maintaining context across conversational threads, rendering responses via real-time token streaming from the LLM, and using external data to ground the LLM's responses (Retrieval Augmented Generation). With the Document Manager, we explore uploading documents to a vector database for use in retrieval augmented generation, allowing the Chatbot to produce answers grounded in knowledge that we provide.
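As a rough illustration of the context-stitching step, the sketch below builds the JSON body a client would POST to an OpenAI-compatible /v1/chat/completions endpoint. The helper name and model name are hypothetical; the actual server.py implementation may differ.

```python
# Hypothetical sketch of conversational context stitching; the real
# server.py implementation may differ.

def build_chat_payload(history, user_text, model="tinyllm", stream=True):
    """Build the body POSTed to an OpenAI-compatible /v1/chat/completions."""
    # Prior turns are replayed on every request so the model keeps context.
    messages = history + [{"role": "user", "content": user_text}]
    return {"model": model, "messages": messages, "stream": stream}

history = [{"role": "system", "content": "You are a helpful assistant."}]
payload = build_chat_payload(history, "Hello!")
# With stream=True, an OpenAI-compatible server returns tokens incrementally,
# which is what lets the Chatbot render responses in real time.
```

After each completed turn, the assistant's reply is appended to `history` so the next request carries the full thread.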

Below are steps to get the Chatbot running.

Chatbot

The Chatbot can be launched as a Docker container or via command line.

Docker
# Create placeholder prompts.json
touch prompts.json

# Run Chatbot via Container - see run.sh for additional settings
docker run \
    -d \
    -p 5000:5000 \
    -e PORT=5000 \
    -e OPENAI_API_BASE="http://localhost:8000/v1" \
    -e TZ="America/Los_Angeles" \
    -v $PWD/.tinyllm:/app/.tinyllm \
    --name chatbot \
    --restart unless-stopped \
    jasonacox/chatbot
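Once the container is up, a quick reachability check can confirm the Chatbot is listening. This is a minimal sketch assuming the port 5000 mapping from the docker run command above:

```python
import socket

def chatbot_is_up(host="localhost", port=5000, timeout=2.0):
    """Return True if something is accepting TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If this returns False, check `docker logs chatbot` for startup errors.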
Command Line
# Install required packages
pip install fastapi uvicorn python-socketio jinja2 openai bs4 pypdf requests lxml aiohttp weaviate-client

# Run the chatbot web server - change the base URL to be where you host your llmserver
OPENAI_API_BASE="http://localhost:8000/v1" python3 server.py
Chat Commands and Retrieval Augmented Generation (RAG)

Some RAG (Retrieval Augmented Generation) features include:

  • Summarizing external websites and PDFs (paste a URL in chat window)
  • If a Weaviate host is specified, the chatbot can use the vector database to inform its responses. See the RAG documentation for details on how to set up Weaviate.
  • Commands - informational commands are available using the / prefix:
/reset                                  # Reset session
/version                                # Display chatbot version
/sessions                               # Display current sessions
/news                                   # List top 10 headlines from current news
/stock [company]                        # Display stock symbol and current price
/weather [location]                     # Provide current weather conditions
/rag on {library} {opt:number}          # Activate RAG auto-query using library
/rag off                                # Disable RAG auto-query
/rag [library] [opt:number] [prompt]    # Answer single prompt based on response from RAG
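The / commands above could be routed with a small dispatcher along these lines. This is a hypothetical sketch; the handler bodies are placeholders, not the project's actual code.

```python
def handle_command(text, session):
    """Route a /-prefixed chat message; return None for normal prompts."""
    if not text.startswith("/"):
        return None  # Plain text goes to the LLM as a normal prompt.
    parts = text[1:].split()
    if not parts:
        return None  # A bare "/" is not a command.
    cmd, args = parts[0], parts[1:]
    if cmd == "reset":
        session.clear()  # Drop the stored conversation history.
        return "Session reset."
    if cmd == "stock":
        company = " ".join(args) or "unknown"
        return f"Looking up stock price for {company}"
    return f"Unknown command: /{cmd}"
```

The return value of None signals "not a command", so the message falls through to the normal LLM prompt path.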

See the RAG documentation for more details.

Example Session

The examples below use a Llama 2 7B model served by the OpenAI API compatible server on an Intel i5 system with an Nvidia GeForce GTX 1060 GPU.

Chatbot

Open http://127.0.0.1:5000 - Example session:

image

Read URL

If a URL is pasted in the text box, the chatbot will read and summarize it.

image
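Under the hood, the pasted page has to be reduced to plain text before the LLM can summarize it. Below is a stdlib-only sketch of that step; the project actually installs bs4 for this, and the names here are illustrative.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect the visible text nodes of an HTML page."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Keep only non-whitespace text between tags.
        if data.strip():
            self.chunks.append(data.strip())

def page_to_prompt(html):
    """Turn raw HTML into a summarization prompt for the LLM."""
    parser = TextExtractor()
    parser.feed(html)
    text = " ".join(parser.chunks)
    return f"Summarize the following web page:\n\n{text}"

prompt = page_to_prompt("<html><body><h1>Title</h1><p>Body text.</p></body></html>")
```

The resulting prompt text is then sent to the LLM like any other chat message.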

Current News

The /news command will fetch the latest news and have the LLM summarize the top ten headlines. It will store the raw feed in the context prompt to allow follow-up questions.

image
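Storing the raw feed in the context prompt can be as simple as appending it as an extra system message, so follow-up questions can reference any headline. This is a hypothetical sketch with placeholder feed text:

```python
def add_news_to_context(history, raw_feed):
    """Keep the raw headline feed in context for follow-up questions."""
    history.append({
        "role": "system",
        "content": f"Current headlines (raw feed):\n{raw_feed}",
    })
    return history

history = [{"role": "system", "content": "You are a helpful assistant."}]
add_news_to_context(history, "1. Headline one\n2. Headline two")
```

Because the feed rides along in the message history, a follow-up like "tell me more about the second headline" works without refetching.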

Docker Pull Command

docker pull jasonacox/chatbot