Ollama Python Library

The Ollama Python library provides the easiest way to integrate your Python 3 project with Ollama.

Getting Started

Requires Python 3.8 or higher.

pip install ollama

A global default client is provided for convenience; the top-level functions can be used in the same way as the synchronous client's methods.

import ollama
response = ollama.chat(model='llama2', messages=[{'role': 'user', 'content': 'Why is the sky blue?'}])
print(response['message']['content'])

Streaming also works through the global client by setting stream=True, which yields the reply incrementally:

import ollama
message = {'role': 'user', 'content': 'Why is the sky blue?'}
for part in ollama.chat(model='llama2', messages=[message], stream=True):
  print(part['message']['content'], end='', flush=True)
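
The global client exposes the rest of the API in the same way, for example ollama.list and ollama.pull. A short sketch; the response field names here ('models', 'name', 'status', 'completed', 'total') are assumed to follow the underlying Ollama REST API responses:

import ollama

# list the models available locally (mirrors the /api/tags response)
for model in ollama.list()['models']:
  print(model['name'])

# pull a model, printing progress parts as they stream in
for part in ollama.pull('llama2', stream=True):
  if part.get('completed') and part.get('total'):
    print(f"{part['status']}: {part['completed']}/{part['total']} bytes")
  else:
    print(part.get('status', ''))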

Using the Synchronous Client

from ollama import Client
message = {'role': 'user', 'content': 'Why is the sky blue?'}
response = Client().chat(model='llama2', messages=[message])
print(response['message']['content'])

Response streaming can be enabled by setting stream=True, which makes the call return a Python generator where each part is one object in the stream.

from ollama import Client
message = {'role': 'user', 'content': 'Why is the sky blue?'}
for part in Client().chat(model='llama2', messages=[message], stream=True):
  print(part['message']['content'], end='', flush=True)
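
By default the client talks to a local Ollama server. A different server can be targeted by constructing the client with a host argument; a minimal sketch, assuming Ollama's default port of 11434:

from ollama import Client

# point the client at a specific Ollama server instead of the default localhost
client = Client(host='http://localhost:11434')

message = {'role': 'user', 'content': 'Why is the sky blue?'}
response = client.chat(model='llama2', messages=[message])
print(response['message']['content'])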

Using the Asynchronous Client

import asyncio
from ollama import AsyncClient

async def chat():
  message = {'role': 'user', 'content': 'Why is the sky blue?'}
  response = await AsyncClient().chat(model='llama2', messages=[message])
  print(response['message']['content'])

asyncio.run(chat())

Similar to the synchronous client, setting stream=True modifies the function to return a Python asynchronous generator.

import asyncio
from ollama import AsyncClient

async def chat():
  message = {'role': 'user', 'content': 'Why is the sky blue?'}
  async for part in await AsyncClient().chat(model='llama2', messages=[message], stream=True):
    print(part['message']['content'], end='', flush=True)

asyncio.run(chat())
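
Because the asynchronous client returns awaitables, several requests can be issued concurrently from one event loop. A sketch using asyncio.gather; the prompts and the helper function ask are just placeholders:

import asyncio
from ollama import AsyncClient

async def ask(client, prompt):
  message = {'role': 'user', 'content': prompt}
  response = await client.chat(model='llama2', messages=[message])
  return response['message']['content']

async def main():
  client = AsyncClient()
  # run two chat requests concurrently and wait for both answers
  answers = await asyncio.gather(
    ask(client, 'Why is the sky blue?'),
    ask(client, 'Why is grass green?'),
  )
  for answer in answers:
    print(answer)

asyncio.run(main())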

Handling Errors

Errors are raised if requests return an error status or if an error is detected while streaming.

model = 'does-not-yet-exist'

try:
  ollama.chat(model)
except ollama.ResponseError as e:
  print('Error:', e.error)
  if e.status_code == 404:
    # the model is not available locally, so download it
    ollama.pull(model)
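
Streaming requests can be guarded in the same way. A minimal sketch, assuming a mid-stream error is also raised as ollama.ResponseError, so the loop goes inside the same try/except:

import ollama

message = {'role': 'user', 'content': 'Why is the sky blue?'}
try:
  for part in ollama.chat(model='llama2', messages=[message], stream=True):
    print(part['message']['content'], end='', flush=True)
except ollama.ResponseError as e:
  print()
  print('Error while streaming:', e.error)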