Ollama Python Library

The Ollama Python library provides the easiest way to integrate Python 3.8+ projects with Ollama.

Prerequisites

  • Ollama should be installed and running
  • Pull a model to use with the library: ollama pull <model>, e.g. ollama pull llama3.2
    • See Ollama.com for more information on the models available.

Install

pip install ollama

Usage

from ollama import chat
from ollama import ChatResponse

response: ChatResponse = chat(model='llama3.2', messages=[
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
])
print(response['message']['content'])
# or access fields directly from the response object
print(response.message.content)

See _types.py for more information on the response types.

Streaming responses

Response streaming can be enabled by setting stream=True.

from ollama import chat

stream = chat(
    model='llama3.2',
    messages=[{'role': 'user', 'content': 'Why is the sky blue?'}],
    stream=True,
)

for chunk in stream:
  print(chunk['message']['content'], end='', flush=True)
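
The same loop can also collect the complete reply while it streams; a minimal sketch (full_reply is an illustrative name, not part of the API):

full_reply = ''
for chunk in stream:
  content = chunk['message']['content']
  full_reply += content  # accumulate the complete message
  print(content, end='', flush=True)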

Custom client

A custom client can be created by instantiating Client or AsyncClient from ollama.

All extra keyword arguments are passed into the httpx.Client.

from ollama import Client
client = Client(
  host='http://localhost:11434',
  headers={'x-some-header': 'some-value'}
)
response = client.chat(model='llama3.2', messages=[
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
])
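
Because extra keyword arguments are forwarded to httpx.Client, standard httpx options such as timeout can be set the same way; a minimal sketch:

from ollama import Client

# timeout is an httpx.Client option, forwarded through the Ollama client
client = Client(
  host='http://localhost:11434',
  timeout=30.0,
)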

Async client

The AsyncClient class is used to make asynchronous requests. It can be configured with the same fields as the Client class.

import asyncio
from ollama import AsyncClient

async def chat():
  message = {'role': 'user', 'content': 'Why is the sky blue?'}
  response = await AsyncClient().chat(model='llama3.2', messages=[message])
  print(response.message.content)

asyncio.run(chat())

Setting stream=True makes the functions return a Python asynchronous generator:

import asyncio
from ollama import AsyncClient

async def chat():
  message = {'role': 'user', 'content': 'Why is the sky blue?'}
  async for part in await AsyncClient().chat(model='llama3.2', messages=[message], stream=True):
    print(part['message']['content'], end='', flush=True)

asyncio.run(chat())

API

The Ollama Python library's API is designed around the Ollama REST API.

Chat

ollama.chat(model='llama3.2', messages=[{'role': 'user', 'content': 'Why is the sky blue?'}])

Generate

ollama.generate(model='llama3.2', prompt='Why is the sky blue?')
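
The completion is returned in a single response; a sketch, assuming the response field described in _types.py:

import ollama

result = ollama.generate(model='llama3.2', prompt='Why is the sky blue?')
print(result.response)  # the generated text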

List

ollama.list()
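
The listing can be iterated like the other typed responses; a sketch, assuming the models field described in _types.py:

import ollama

# Print the name of every locally available model.
for model in ollama.list().models:
  print(model.model)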

Show

ollama.show('llama3.2')

Create

ollama.create(model='example', from_='llama3.2', system="You are Mario from Super Mario Bros.")

Copy

ollama.copy('llama3.2', 'user/llama3.2')

Delete

ollama.delete('llama3.2')

Pull

ollama.pull('llama3.2')
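
Like chat, pull accepts stream=True to report download progress; a minimal sketch, assuming the status, completed, and total fields of the streamed progress responses:

import ollama

# Stream progress updates while the model downloads.
for progress in ollama.pull('llama3.2', stream=True):
  print(progress.status, progress.completed, progress.total)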

Push

ollama.push('user/llama3.2')

Embed

ollama.embed(model='llama3.2', input='The sky is blue because of Rayleigh scattering')

Embed (batch)

ollama.embed(model='llama3.2', input=['The sky is blue because of Rayleigh scattering', 'Grass is green because of chlorophyll'])
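
Each input produces one vector in the response; a minimal sketch, assuming the embeddings field described in _types.py:

import ollama

response = ollama.embed(
  model='llama3.2',
  input=['The sky is blue because of Rayleigh scattering', 'Grass is green because of chlorophyll'],
)
# One embedding vector per input, in order.
for vector in response.embeddings:
  print(len(vector))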

Ps

ollama.ps()

Errors

Errors are raised if requests return an error status or if an error is detected while streaming.

import ollama

model = 'does-not-yet-exist'

try:
  ollama.chat(model)
except ollama.ResponseError as e:
  print('Error:', e.error)
  if e.status_code == 404:
    ollama.pull(model)