
What are the deployment and best practices for Ollama in production environments?


When deploying Ollama in production environments, consider the following key aspects:

1. System Requirements:

Hardware Requirements:

  • CPU: Modern processor with AVX2 instruction set support
  • Memory: at least 8GB RAM; 16GB or more recommended
  • Storage: SSD storage, 4-20GB per model
  • GPU (Optional): NVIDIA GPU (CUDA 11.0+) or Apple Silicon (M1/M2/M3)

Operating System:

  • Linux (Ubuntu 20.04+ recommended)
  • macOS 11+
  • Windows 10/11
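
On Linux, these requirements can be checked quickly before installing (`nvidia-smi` only applies to machines with an NVIDIA GPU):

```bash
# Check CPU instruction set support (should print avx2)
grep -o 'avx2' /proc/cpuinfo | sort -u

# Check available memory and disk space
free -h
df -h /

# Check the NVIDIA driver and CUDA version, if a GPU is present
nvidia-smi
```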

2. Deployment Architecture:

Single Machine Deployment:

```bash
# Install (Linux), then start the service
curl -fsSL https://ollama.com/install.sh | sh
ollama serve   # listens on 127.0.0.1:11434 by default; set OLLAMA_HOST=0.0.0.0 to expose it remotely
```
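
On a systemd install (the Linux install script sets one up), environment variables such as OLLAMA_HOST are persisted through a service override rather than the shell:

```bash
# Open an override file for the service
sudo systemctl edit ollama
# In the editor, add:
#   [Service]
#   Environment="OLLAMA_HOST=0.0.0.0"

# Apply the change
sudo systemctl daemon-reload
sudo systemctl restart ollama
```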

Docker Deployment:

```dockerfile
FROM ollama/ollama

# Copy custom models into the image
COPY my-model.gguf /root/.ollama/models/

# The base image's entrypoint is already /bin/ollama, so only the subcommand is needed
CMD ["serve"]
```

```bash
# Run the container with GPU access and a persistent model volume
docker run -d -v ollama:/root/.ollama -p 11434:11434 --gpus all ollama/ollama
```
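
Models live on the mounted volume, so they survive container restarts and only need to be pulled once (this assumes the container was started with `--name ollama`):

```bash
# Pull a model inside the running container; it lands on the persistent volume
docker exec -it ollama ollama pull llama3.1

# Confirm the API answers from the host
curl http://localhost:11434/api/tags
```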

3. Load Balancing:

Using Nginx as a reverse proxy to balance requests across multiple Ollama instances:

```nginx
upstream ollama_backend {
    server 192.168.1.10:11434;
    server 192.168.1.11:11434;
    server 192.168.1.12:11434;
}

server {
    listen 80;
    server_name ollama.example.com;

    location / {
        proxy_pass http://ollama_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```
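
Note that each backend node keeps its own model store, so every model must be pulled on every node behind the balancer. A quick end-to-end test through the proxy (assuming llama3.1 is available on all three nodes):

```bash
# Non-streaming generation request routed through the load balancer
curl http://ollama.example.com/api/generate -d '{
  "model": "llama3.1",
  "prompt": "Say hello in one sentence.",
  "stream": false
}'
```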

4. Monitoring and Logging:

Health Check:

```bash
# /api/tags returns the installed models and doubles as a liveness check
curl http://localhost:11434/api/tags
```
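
Beyond a liveness check, the /api/ps endpoint reports which models are currently loaded and their memory footprint, which is useful input for dashboards:

```bash
# List currently loaded models and their memory use
curl http://localhost:11434/api/ps
```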

Log Management:

```bash
# Follow service logs on a systemd install
journalctl -u ollama -f

# Enable verbose debug logging (restart the service afterwards)
export OLLAMA_DEBUG=1
```

5. Security Configuration:

API Authentication: Ollama has no built-in authentication, so add it at the reverse proxy:

```nginx
location /api/ {
    auth_basic "Restricted";
    auth_basic_user_file /etc/nginx/.htpasswd;
    proxy_pass http://localhost:11434/api/;
}
```
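
The credentials file referenced above can be created with the htpasswd tool (from apache2-utils on Debian/Ubuntu); the username `ollama_user` is only an example:

```bash
# Create the credentials file (-c creates it; omit -c when adding more users)
sudo htpasswd -c /etc/nginx/.htpasswd ollama_user

# Clients then authenticate with HTTP basic auth
curl -u ollama_user http://ollama.example.com/api/tags
```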

Firewall Configuration:

```bash
# Block the port by default, then allow only the internal subnet
ufw default deny incoming
ufw allow from 192.168.1.0/24 to any port 11434
```

6. Performance Optimization:

Model Preloading:

```bash
# Preload a model by sending an empty generate request
curl http://localhost:11434/api/generate -d '{"model": "llama3.1"}'

# Keep models in memory longer than the 5-minute default
export OLLAMA_KEEP_ALIVE=24h
```
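
keep_alive can also be set per request: a negative value keeps the model loaded indefinitely, and 0 unloads it immediately:

```bash
# Pin the model in memory until the server restarts
curl http://localhost:11434/api/generate -d '{"model": "llama3.1", "keep_alive": -1}'
```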

Concurrent Processing:

```bash
# Concurrency is configured on the server via environment variables, not in a Modelfile
export OLLAMA_NUM_PARALLEL=4        # parallel requests per loaded model
export OLLAMA_MAX_LOADED_MODELS=2   # models kept in memory at once
```
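
A simple way to exercise the parallel slots is to fire several requests at once (a sketch, assuming llama3.1 is already loaded):

```bash
# Send four concurrent generation requests, then wait for all of them
for i in 1 2 3 4; do
  curl -s http://localhost:11434/api/generate \
    -d '{"model": "llama3.1", "prompt": "Count to three.", "stream": false}' &
done
wait
```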

7. Backup and Recovery:

```bash
# Back up the model store and configuration
tar -czf ollama-backup.tar.gz ~/.ollama/

# Restore
tar -xzf ollama-backup.tar.gz -C ~/
```
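
For regular backups, a cron entry works; /backups is an example destination, and note that % must be escaped inside a crontab:

```bash
# Nightly backup at 02:00 (add via crontab -e)
0 2 * * * tar -czf /backups/ollama-$(date +\%F).tar.gz ~/.ollama/
```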