Ollama exposes a RESTful HTTP API that listens by default on http://localhost:11434. The main endpoints are:
1. Generate Text (POST /api/generate):
```bash
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Hello, how are you?",
  "stream": false
}'
```
2. Chat (POST /api/chat):
```bash
curl http://localhost:11434/api/chat -d '{
  "model": "llama2",
  "messages": [
    { "role": "user", "content": "Hello!" },
    { "role": "assistant", "content": "Hi there!" },
    { "role": "user", "content": "How are you?" }
  ]
}'
```
3. List Models (GET /api/tags):
```bash
curl http://localhost:11434/api/tags
```
4. Show Model Info (POST /api/show):
```bash
curl http://localhost:11434/api/show -d '{ "name": "llama2" }'
```
5. Copy Model (POST /api/copy):
```bash
curl http://localhost:11434/api/copy -d '{ "source": "llama2", "destination": "my-llama2" }'
```
6. Delete Model (DELETE /api/delete):
```bash
curl -X DELETE http://localhost:11434/api/delete -d '{ "name": "llama2" }'
```
7. Pull Model (POST /api/pull):
```bash
curl http://localhost:11434/api/pull -d '{ "name": "llama2" }'
```
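Pulling a model can take a while, and the pull endpoint streams newline-delimited JSON status objects as the download progresses. The sketch below shows one way to watch that progress from Python; the model name is a placeholder, and only the "status" field is relied on here.

```python
import json
import requests

# Stream the pull progress; each line of the response is a JSON status object
with requests.post(
    'http://localhost:11434/api/pull',
    json={'name': 'llama2'},
    stream=True,
) as response:
    for line in response.iter_lines():
        if not line:
            continue
        status = json.loads(line)
        # Status objects carry a "status" field, e.g. "pulling manifest" or "success"
        print(status.get('status'))
```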
Streaming Response:
Set "stream": true to get streaming responses, suitable for real-time display of generated content.
Python Integration Example:
```python
import requests

# Non-streaming request: the full completion is returned in the 'response' field
response = requests.post(
    'http://localhost:11434/api/generate',
    json={
        'model': 'llama2',
        'prompt': 'Tell me a joke',
        'stream': False,
    },
)
print(response.json()['response'])
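```

The chat endpoint can be called the same way; in a non-streaming call the assistant's reply comes back under the "message" field. A minimal sketch, again using the requests library with a placeholder model and message:

```python
import requests

# Non-streaming chat request; the reply text is in message.content
response = requests.post(
    'http://localhost:11434/api/chat',
    json={
        'model': 'llama2',
        'messages': [{'role': 'user', 'content': 'How are you?'}],
        'stream': False,
    },
)
print(response.json()['message']['content'])
```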