Open-WebUI: Self-Hosted ChatGPT UI + Ollama
Open-WebUI is an open-source ChatGPT-style interface that runs locally and connects to Ollama, OpenAI, or other LLM backends. This guide covers installation with Ollama and fixes for issues commonly reported on the project's GitHub.
Quick Overview
- Open-WebUI: ChatGPT-like web interface, self-hosted
- Ollama: Local LLM runtime (Llama 3, Mistral, etc.)
- Requirements: Docker + 8GB+ RAM (16GB for 7B models)
- Benefits: Privacy, no API costs, full control
Quick Start with Docker
The fastest way to get running is Docker Compose with both Open-WebUI and Ollama:
```bash
# Create docker-compose.yml
cat > docker-compose.yml << 'EOF'
version: "3.8"
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
      - WEBUI_SECRET_KEY=your-secret-key-here
    volumes:
      - ./open-webui/data:/app/backend/data
    restart: unless-stopped
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ./ollama/data:/root/.ollama
    restart: unless-stopped
EOF

# Start everything
docker-compose up -d

# Pull a model (optional, downloads on first use)
docker exec -it ollama ollama pull llama3

# Open http://localhost:3000
```
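The compose file above sets `WEBUI_SECRET_KEY` to a placeholder; Open-WebUI uses this value to sign auth tokens, so it should be a long random string. One quick way to generate one (any source of randomness works):

```python
import secrets

# 32 random bytes, hex-encoded: a 64-character value for WEBUI_SECRET_KEY
key = secrets.token_hex(32)
print(f"WEBUI_SECRET_KEY={key}")
```

Paste the output into `docker-compose.yml` (or an `.env` file) in place of the placeholder.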
Connecting Ollama Models
Once running, Open-WebUI auto-discovers Ollama. Here's how to pull and use models:
```bash
# List available models
docker exec -it ollama ollama list

# Pull popular models
docker exec -it ollama ollama pull llama3     # 4.7GB - Meta's Llama 3 8B
docker exec -it ollama ollama pull mistral    # 4.1GB - Mistral 7B
docker exec -it ollama ollama pull codellama  # 3.8GB - Code specialist
docker exec -it ollama ollama pull phi3       # 2.2GB - Microsoft's 3.8B
docker exec -it ollama ollama pull gemma:2b   # 1.6GB - Google's Gemma 2B

# Refresh Open-WebUI to see new models
# Click "Settings" → "Models" → Select from dropdown
```
Common Problems & Solutions
Problem: Open-WebUI shows "Error connecting to Ollama" even though Ollama is running.
What I Tried: Restarting containers and checking ports; Ollama worked via the CLI but not from Open-WebUI.
Actual Fix: It's a Docker networking issue. When both run in Docker, `localhost` inside the Open-WebUI container refers to that container itself, not to Ollama. Use the service name instead:
```yaml
# WRONG:
environment:
  - OLLAMA_BASE_URL=http://localhost:11434

# RIGHT:
environment:
  - OLLAMA_BASE_URL=http://ollama:11434

# Or, if Ollama runs on the host (not in Docker):
environment:
  - OLLAMA_BASE_URL=http://host.docker.internal:11434
extra_hosts:
  - "host.docker.internal:host-gateway"
```
The `ollama` in the URL matches the service name in docker-compose.yml; it is not `localhost`.
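Service-name resolution works because Compose places both containers on a shared default network with built-in DNS. If you prefer to make that explicit, here's a sketch (the network name `llm-net` is arbitrary):

```yaml
services:
  open-webui:
    networks: [llm-net]
  ollama:
    networks: [llm-net]
networks:
  llm-net:
    driver: bridge
```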
Problem: Pulled models in Ollama, but Open-WebUI shows "No models available" in the dropdown.
What I Tried: Refreshing the page and restarting Open-WebUI; models were still missing.
Actual Fix: Open-WebUI caches the model list. Trigger a refresh via the API or manual config:
```bash
# Solution 1: Trigger a refresh via the API
curl -X POST http://localhost:3000/api/config/updates/models \
  -H "Content-Type: application/json"

# Solution 2: Restart Open-WebUI after pulling models
docker-compose restart open-webui

# Solution 3: Add models manually in Settings
# Go to Settings → Providers → Ollama → Add Model
# Enter: "llama3", "mistral", etc.

# Solution 4: Check Ollama is actually reachable
docker exec -it open-webui curl http://ollama:11434/api/tags
# Should return JSON with a models list
```
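Solution 4's `/api/tags` endpoint returns JSON listing the installed models. In recent Ollama versions the shape looks roughly like the sample below (trimmed to the relevant fields; exact fields vary by version). A short sketch of pulling model names out of such a response:

```python
import json

# Sample /api/tags response, trimmed to the fields that matter here
raw = '''
{"models": [
  {"name": "llama3:latest", "size": 4661224676},
  {"name": "mistral:latest", "size": 4109865159}
]}
'''

# These are the names Open-WebUI should show in its model dropdown
names = [m["name"] for m in json.loads(raw)["models"]]
print(names)  # ['llama3:latest', 'mistral:latest']
```

If this list is non-empty but the dropdown is still blank, the problem is on the Open-WebUI side (cache or provider config), not Ollama.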
Problem: Updated Open-WebUI, now it asks for login but never created an admin password. Can't access the UI.
What I Tried: Reinstalling and clearing data; it kept asking for authentication.
Actual Fix: Newer versions enable auth by default. Create the first user via the CLI or disable auth:
```yaml
# Option 1: Disable authentication (development only)
environment:
  - WEBUI_AUTH=false  # Disables the login requirement
```

```bash
# Option 2: Create an admin user via the CLI
# NOTE: internal module paths and the User model change between
# Open-WebUI versions - check the source of the version you're running
docker exec -it open-webui python -c "
from backend.models import User
from backend.database import DB

db = DB()
user = User(
    username='admin',
    email='admin@example.com',
    password='your-password',
    role='admin',
    is_active=True,
)
db.add(user)
db.commit()
"
```
```bash
# Option 3: Reset auth and create a new user
# Stop the container
docker-compose down
# Remove the auth database
rm -rf ./open-webui/data/webui.db
# Start again and create a new account on first visit
docker-compose up -d
```
Problem: Responses take 10-20 seconds even for simple queries. CPU usage is low, not a hardware bottleneck.
What I Tried: Switching models and increasing Ollama's memory; minimal improvement.
Actual Fix: Two issues: the context window was larger than needed, and Ollama wasn't using GPU acceleration:
```bash
# 1. Enable GPU in Ollama (if available) - GPU use is set at the container
#    level, not per-run: give the container GPU access with
#    `docker run --gpus=all` or the compose `deploy` block shown under
#    GPU Acceleration below

# 2. Reduce context size for faster responses
#    In Open-WebUI: Settings → Models → llama3 → Parameters
#    Set "Max Tokens" to 2048 (default is 4096)

# 3. Enable streaming
#    environment:
#      - ENABLE_STREAMING=true

# 4. Use a faster quantization
docker exec -it ollama ollama pull llama3:8b-instruct-q4_K_M
# q4_K_M is faster than q8_0 with minimal quality loss

# 5. Set num_thread to your CPU core count - it's a model parameter,
#    not a CLI flag; inside an `ollama run llama3` session, type:
#    /set parameter num_thread 8
```
With these changes, response time dropped from 15s to 3-4s on my system.
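The context and thread settings above can also be baked into a custom model with an Ollama Modelfile, so they apply every time without touching the UI (the model name `llama3-fast` below is just an example):

```
# Modelfile
FROM llama3
# smaller context window -> faster responses
PARAMETER num_ctx 2048
# match your physical core count
PARAMETER num_thread 8
```

Build it with `docker exec -it ollama ollama create llama3-fast -f /path/to/Modelfile` (the file must be visible inside the container, e.g. copied in with `docker cp`), then select `llama3-fast` in Open-WebUI.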
Problem: Chat history disappears when refreshing page or restarting container.
What I Tried: Checking volume mounts; they looked correct, but data wasn't persisting.
Actual Fix: The data directory path changed in newer versions. Update volume mount and check permissions:
```yaml
# OLD path (doesn't work in 0.3.x+):
volumes:
  - ./open-webui:/app/backend/data

# NEW path (0.3.x+):
volumes:
  - ./open-webui/data:/app/backend/data
```

```bash
# Also ensure the directory exists with correct permissions:
mkdir -p ./open-webui/data
chmod 777 ./open-webui/data

# Check inside the container:
docker exec -it open-webui ls -la /app/backend/data
# Should show the webui.db file
```
If you're still losing data, check whether the SQLite database is being written to /tmp instead of the mounted volume.
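The chmod 777 workaround above is needed because bind mounts inherit host ownership. A Docker named volume sidesteps the permission problem entirely; a sketch (the volume name is arbitrary):

```yaml
services:
  open-webui:
    volumes:
      - open-webui-data:/app/backend/data
volumes:
  open-webui-data:
```

Named volumes are managed by Docker, so inspect them with `docker volume inspect open-webui-data` rather than looking in the project directory.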
Connecting OpenAI API
You can mix local models with OpenAI for GPT-4 access when needed:
```yaml
# Add OpenAI as a fallback provider
environment:
  - OPENAI_API_KEY=sk-your-key-here
  - OPENAI_API_BASE=https://api.openai.com/v1
  # Or use a proxy:
  # - OPENAI_API_BASE=https://your-proxy.com/v1
```
In the UI, select "OpenAI" from the model dropdown to use GPT-4, or "Ollama" for local models.
Model Selection Guide
| Model | Size | Best For | RAM |
|---|---|---|---|
| llama3 | 4.7GB | General purpose | 8GB |
| mistral | 4.1GB | Chat, reasoning | 8GB |
| codellama | 3.8GB | Code generation | 8GB |
| phi3 | 2.2GB | Fast, lightweight | 4GB |
| gemma:2b | 1.6GB | Smallest viable | 3GB |
Performance Tips
GPU Acceleration
```yaml
# Give the Ollama container GPU access
# (requires the NVIDIA Container Toolkit on the host)
services:
  ollama:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```
Quantization Trade-offs
```bash
# Smaller quantization = faster, less RAM, more quality loss
docker exec -it ollama ollama pull llama3:8b-instruct-q4_K_M  # Fast, good quality
docker exec -it ollama ollama pull llama3:8b-instruct-q8_0    # Slower, best quality
docker exec -it ollama ollama pull llama3:8b-instruct-q2_K    # Fastest, most quality loss
```
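A rough rule of thumb for the sizes above: model size ≈ parameters × bits per weight ÷ 8, with real files running a bit larger due to embeddings and metadata. For the 8B Llama 3 at ~4.5 bits/weight (about what q4_K_M averages):

```python
def approx_size_gb(params_b: float, bits_per_weight: float) -> float:
    """Rough quantized model size in GB: params (billions) * bits / 8.
    Real files run somewhat larger (embeddings, metadata)."""
    return params_b * bits_per_weight / 8

print(approx_size_gb(8, 4.5))  # 4.5 -> close to the 4.7GB llama3 download
print(approx_size_gb(8, 8.5))  # 8.5 -> roughly what q8_0 needs
```

The same formula gives a quick RAM estimate: the model weights must fit in memory, plus headroom for the KV cache and the OS.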
Reverse Proxy with HTTPS
```nginx
# nginx.conf
server {
    listen 443 ssl;
    server_name chat.yourdomain.com;

    ssl_certificate     /etc/nginx/ssl/cert.pem;
    ssl_certificate_key /etc/nginx/ssl/key.pem;

    location / {
        proxy_pass http://localhost:3000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
    }
}
```
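LLM responses stream for a long time, and nginx's defaults can interfere: `proxy_read_timeout` (60s by default) can cut long generations off mid-stream, and response buffering delays tokens. Additions worth considering inside the `location /` block above (values are illustrative):

```nginx
location / {
    # ...proxy_pass settings from above...
    proxy_read_timeout 300s;   # long generations outlast the 60s default
    proxy_buffering off;       # deliver tokens as they arrive
    client_max_body_size 20m;  # allow larger uploads (e.g. documents)
}
```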