Compare commits

..

14 Commits

Author SHA1 Message Date
nicole pardal 8dd1d7cb02 removed comments 2025-09-24 11:49:19 -07:00
nicole pardal d3afa37b11 remove unused import 2025-09-24 11:44:55 -07:00
nicole pardal cfbb0cef7b formatting fix 2025-09-24 11:43:33 -07:00
nicole pardal 80279e95ab cleaned up code 2025-09-23 21:34:22 -07:00
nicole pardal 0ecfb1f6cf api key fix added 2025-09-23 18:07:53 -07:00
nicole pardal 404672570f lint 2025-09-23 18:02:02 -07:00
nicole pardal 15ec61dbcb lint 2025-09-23 17:58:08 -07:00
nicole pardal 799ae1f07c can lint pls work 2025-09-23 17:49:59 -07:00
nicole pardal ae333084b9 renamed + added functionality 2025-09-23 17:44:18 -07:00
nicole pardal 1c6afe4316 lint formatting 2025-09-23 15:46:50 -07:00
nicole pardal b9d435fad5 fixed nits 2025-09-23 15:46:50 -07:00
nicole pardal 10955d52ee lint fix hopefully 2025-09-23 15:46:50 -07:00
nicole pardal 67f19a33e2 fix for failing lint check 2025-09-23 15:46:50 -07:00
nicole pardal 4d83af13d8 Added python browser tool 2025-09-23 15:46:50 -07:00
18 changed files with 445 additions and 714 deletions
+1 -1
View File
@@ -13,7 +13,7 @@ jobs:
id-token: write
contents: write
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v5
- uses: actions/setup-python@v6
- uses: astral-sh/setup-uv@v5
with:
+2 -2
View File
@@ -10,7 +10,7 @@ jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v5
- uses: astral-sh/setup-uv@v5
with:
enable-cache: true
@@ -19,7 +19,7 @@ jobs:
lint:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v5
- uses: actions/setup-python@v6
- uses: astral-sh/setup-uv@v5
with:
+1 -76
View File
@@ -50,82 +50,6 @@ for chunk in stream:
print(chunk['message']['content'], end='', flush=True)
```
## Cloud Models
Run larger models by offloading to Ollamas cloud while keeping your local workflow.
- Supported models: `deepseek-v3.1:671b-cloud`, `gpt-oss:20b-cloud`, `gpt-oss:120b-cloud`, `kimi-k2:1t-cloud`, `qwen3-coder:480b-cloud`, `kimi-k2-thinking` See [Ollama Models - Cloud](https://ollama.com/search?c=cloud) for more information
### Run via local Ollama
1) Sign in (one-time):
```
ollama signin
```
2) Pull a cloud model:
```
ollama pull gpt-oss:120b-cloud
```
3) Make a request:
```python
from ollama import Client
client = Client()
messages = [
{
'role': 'user',
'content': 'Why is the sky blue?',
},
]
for part in client.chat('gpt-oss:120b-cloud', messages=messages, stream=True):
print(part.message.content, end='', flush=True)
```
### Cloud API (ollama.com)
Access cloud models directly by pointing the client at `https://ollama.com`.
1) Create an API key from [ollama.com](https://ollama.com/settings/keys) , then set:
```
export OLLAMA_API_KEY=your_api_key
```
2) (Optional) List models available via the API:
```
curl https://ollama.com/api/tags
```
3) Generate a response via the cloud API:
```python
import os
from ollama import Client
client = Client(
host='https://ollama.com',
headers={'Authorization': 'Bearer ' + os.environ.get('OLLAMA_API_KEY')}
)
messages = [
{
'role': 'user',
'content': 'Why is the sky blue?',
},
]
for part in client.chat('gpt-oss:120b', messages=messages, stream=True):
print(part.message.content, end='', flush=True)
```
## Custom client
A custom client can be created by instantiating `Client` or `AsyncClient` from `ollama`.
@@ -250,6 +174,7 @@ ollama.embed(model='gemma3', input=['The sky is blue because of rayleigh scatter
ollama.ps()
```
## Errors
Errors are raised if requests return an error status or if an error is detected while streaming.
+13 -62
View File
@@ -1,129 +1,80 @@
# Running Examples
Run the examples in this directory with:
```sh
# Run example
python3 examples/<example>.py
# or with uv
uv run examples/<example>.py
```
See [ollama/docs/api.md](https://github.com/ollama/ollama/blob/main/docs/api.md) for full API documentation
### Chat - Chat with a model
- [chat.py](chat.py)
- [async-chat.py](async-chat.py)
- [chat-stream.py](chat-stream.py) - Streamed outputs
- [chat-with-history.py](chat-with-history.py) - Chat with model and maintain history of the conversation
### Generate - Generate text with a model
### Generate - Generate text with a model
- [generate.py](generate.py)
- [async-generate.py](async-generate.py)
- [generate-stream.py](generate-stream.py) - Streamed outputs
- [fill-in-middle.py](fill-in-middle.py) - Given a prefix and suffix, fill in the middle
### Tools/Function Calling - Call a function with a model
### Tools/Function Calling - Call a function with a model
- [tools.py](tools.py) - Simple example of Tools/Function Calling
- [async-tools.py](async-tools.py)
- [multi-tool.py](multi-tool.py) - Using multiple tools, with thinking enabled
#### gpt-oss
#### gpt-oss
- [gpt-oss-tools.py](gpt-oss-tools.py)
- [gpt-oss-tools-stream.py](gpt-oss-tools-stream.py)
- [gpt-oss-tools-stream.py](gpt-oss-tools-stream.py)
- [gpt-oss-tools-browser.py](gpt-oss-tools-browser.py) - Using browser research tools with gpt-oss
- [gpt-oss-tools-browser-stream.py](gpt-oss-tools-browser-stream.py) - Using browser research tools with gpt-oss, with streaming enabled
### Web search
An API key from Ollama's cloud service is required. You can create one [here](https://ollama.com/settings/keys).
```shell
export OLLAMA_API_KEY="your_api_key_here"
```
- [web-search.py](web-search.py)
- [web-search-gpt-oss.py](web-search-gpt-oss.py) - Using browser research tools with gpt-oss
#### MCP server
The MCP server can be used with an MCP client like Cursor, Cline, Codex, Open WebUI, Goose, and more.
```sh
uv run examples/web-search-mcp.py
```
Configuration to use with an MCP client:
```json
{
"mcpServers": {
"web_search": {
"type": "stdio",
"command": "uv",
"args": ["run", "path/to/ollama-python/examples/web-search-mcp.py"],
"env": { "OLLAMA_API_KEY": "your_api_key_here" }
}
}
}
```
- [web-search-mcp.py](web-search-mcp.py)
### Multimodal with Images - Chat with a multimodal (image chat) model
- [multimodal-chat.py](multimodal-chat.py)
- [multimodal-generate.py](multimodal-generate.py)
### Image Generation (Experimental) - Generate images with a model
> **Note:** Image generation is experimental and currently only available on macOS.
- [generate-image.py](generate-image.py)
### Structured Outputs - Generate structured outputs with a model
- [structured-outputs.py](structured-outputs.py)
- [async-structured-outputs.py](async-structured-outputs.py)
- [structured-outputs-image.py](structured-outputs-image.py)
### Ollama List - List all downloaded models and their properties
### Ollama List - List all downloaded models and their properties
- [list.py](list.py)
### Ollama Show - Display model properties and capabilities
### Ollama Show - Display model properties and capabilities
- [show.py](show.py)
### Ollama ps - Show model status with CPU/GPU usage
### Ollama ps - Show model status with CPU/GPU usage
- [ps.py](ps.py)
### Ollama Pull - Pull a model from Ollama
Requirement: `pip install tqdm`
- [pull.py](pull.py)
- [pull.py](pull.py)
### Ollama Create - Create a model from a Modelfile
- [create.py](create.py)
- [create.py](create.py)
### Ollama Embed - Generate embeddings with a model
- [embed.py](embed.py)
### Thinking - Enable thinking mode for a model
### Thinking - Enable thinking mode for a model
- [thinking.py](thinking.py)
### Thinking (generate) - Enable thinking mode for a model
- [thinking-generate.py](thinking-generate.py)
### Thinking (levels) - Choose the thinking level
- [thinking-levels.py](thinking-levels.py)
-31
View File
@@ -1,31 +0,0 @@
from typing import Iterable
import ollama
def print_logprobs(logprobs: Iterable[dict], label: str) -> None:
print(f'\n{label}:')
for entry in logprobs:
token = entry.get('token', '')
logprob = entry.get('logprob')
print(f' token={token!r:<12} logprob={logprob:.3f}')
for alt in entry.get('top_logprobs', []):
if alt['token'] != token:
print(f' alt -> {alt["token"]!r:<12} ({alt["logprob"]:.3f})')
messages = [
{
'role': 'user',
'content': 'hi! be concise.',
},
]
response = ollama.chat(
model='gemma3',
messages=messages,
logprobs=True,
top_logprobs=3,
)
print('Chat response:', response['message']['content'])
print_logprobs(response.get('logprobs', []), 'chat logprobs')
+1 -2
View File
@@ -15,8 +15,7 @@ messages = [
},
{
'role': 'assistant',
'content': """The weather in Tokyo is typically warm and humid during the summer months, with temperatures often exceeding 30°C (86°F). The city experiences a rainy season from June to September, with heavy rainfall and occasional typhoons. Winter is mild, with temperatures
rarely dropping below freezing. The city is known for its high-tech and vibrant culture, with many popular tourist attractions such as the Tokyo Tower, Senso-ji Temple, and the bustling Shibuya district.""",
'content': 'The weather in Tokyo is typically warm and humid during the summer months, with temperatures often exceeding 30°C (86°F). The city experiences a rainy season from June to September, with heavy rainfall and occasional typhoons. Winter is mild, with temperatures rarely dropping below freezing. The city is known for its high-tech and vibrant culture, with many popular tourist attractions such as the Tokyo Tower, Senso-ji Temple, and the bustling Shibuya district.',
},
]
-18
View File
@@ -1,18 +0,0 @@
# Image generation is experimental and currently only available on macOS
import base64
from ollama import generate
prompt = 'a sunset over mountains'
print(f'Prompt: {prompt}')
for response in generate(model='x/z-image-turbo', prompt=prompt, stream=True):
if response.image:
# Final response contains the image
with open('output.png', 'wb') as f:
f.write(base64.b64decode(response.image))
print('\nImage saved to output.png')
elif response.total:
# Progress update
print(f'Progress: {response.completed or 0}/{response.total}', end='\r')
-24
View File
@@ -1,24 +0,0 @@
from typing import Iterable
import ollama
def print_logprobs(logprobs: Iterable[dict], label: str) -> None:
print(f'\n{label}:')
for entry in logprobs:
token = entry.get('token', '')
logprob = entry.get('logprob')
print(f' token={token!r:<12} logprob={logprob:.3f}')
for alt in entry.get('top_logprobs', []):
if alt['token'] != token:
print(f' alt -> {alt["token"]!r:<12} ({alt["logprob"]:.3f})')
response = ollama.generate(
model='gemma3',
prompt='hi! be concise.',
logprobs=True,
top_logprobs=3,
)
print('Generate response:', response['response'])
print_logprobs(response.get('logprobs', []), 'generate logprobs')
@@ -1,12 +1,8 @@
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "ollama",
# ]
# ///
from __future__ import annotations
from typing import Any, Dict, List
from web_search_gpt_oss_helper import Browser
from gpt_oss_browser_tool_helper import Browser
from ollama import Client
@@ -15,6 +11,52 @@ def main() -> None:
client = Client()
browser = Browser(initial_state=None, client=client)
browser_search_schema = {
'type': 'function',
'function': {
'name': 'browser.search',
'parameters': {
'type': 'object',
'properties': {
'query': {'type': 'string'},
'topn': {'type': 'integer'},
},
'required': ['query'],
},
},
}
browser_open_schema = {
'type': 'function',
'function': {
'name': 'browser.open',
'parameters': {
'type': 'object',
'properties': {
'id': {'anyOf': [{'type': 'integer'}, {'type': 'string'}]},
'cursor': {'type': 'integer'},
'loc': {'type': 'integer'},
'num_lines': {'type': 'integer'},
},
},
},
}
browser_find_schema = {
'type': 'function',
'function': {
'name': 'browser.find',
'parameters': {
'type': 'object',
'properties': {
'pattern': {'type': 'string'},
'cursor': {'type': 'integer'},
},
'required': ['pattern'],
},
},
}
def browser_search(query: str, topn: int = 10) -> str:
return browser.search(query=query, topn=topn)['pageText']
@@ -24,41 +66,19 @@ def main() -> None:
def browser_find(pattern: str, cursor: int = -1, **_: Any) -> str:
return browser.find(pattern=pattern, cursor=cursor)['pageText']
browser_search_schema = {
'type': 'function',
'function': {
'name': 'browser.search',
},
}
browser_open_schema = {
'type': 'function',
'function': {
'name': 'browser.open',
},
}
browser_find_schema = {
'type': 'function',
'function': {
'name': 'browser.find',
},
}
available_tools = {
'browser.search': browser_search,
'browser.open': browser_open,
'browser.find': browser_find,
}
query = "what is ollama's new engine"
query = 'What is Ollama.com?'
print('Prompt:', query, '\n')
messages: List[Dict[str, Any]] = [{'role': 'user', 'content': query}]
while True:
resp = client.chat(
model='gpt-oss:120b-cloud',
model='gpt-oss',
messages=messages,
tools=[browser_search_schema, browser_open_schema, browser_find_schema],
think=True,
@@ -80,7 +100,6 @@ def main() -> None:
for tc in resp.message.tool_calls:
tool_name = tc.function.name
args = tc.function.arguments or {}
print(f'Tool name: {tool_name}, args: {args}')
fn = available_tools.get(tool_name)
if not fn:
messages.append({'role': 'tool', 'content': f'Tool {tool_name} not found', 'tool_name': tool_name})
@@ -88,7 +107,6 @@ def main() -> None:
try:
result_text = fn(**args)
print('Result: ', result_text[:200] + '...')
except Exception as e:
result_text = f'Error from {tool_name}: {e}'
+198
View File
@@ -0,0 +1,198 @@
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "gpt-oss",
# "ollama",
# "rich",
# ]
# ///
import asyncio
import json
from typing import Iterator, Optional
from gpt_oss.tools.simple_browser import ExaBackend, SimpleBrowserTool
from openai_harmony import Author, Role, TextContent
from openai_harmony import Message as HarmonyMessage
from rich import print
from ollama import Client
from ollama._types import ChatResponse
_backend = ExaBackend(source='web')
_browser_tool = SimpleBrowserTool(backend=_backend)
def heading(text):
print(text)
print('=' * (len(text) + 3))
async def _browser_search_async(query: str, topn: int = 10, source: str | None = None) -> str:
# map Ollama message to Harmony format
harmony_message = HarmonyMessage(
author=Author(role=Role.USER),
content=[TextContent(text=json.dumps({'query': query, 'topn': topn}))],
recipient='browser.search',
)
result_text: str = ''
async for response in _browser_tool._process(harmony_message):
if response.content:
for content in response.content:
if isinstance(content, TextContent):
result_text += content.text
return result_text or f'No results for query: {query}'
async def _browser_open_async(id: int | str = -1, cursor: int = -1, loc: int = -1, num_lines: int = -1, *, view_source: bool = False, source: str | None = None) -> str:
payload = {'id': id, 'cursor': cursor, 'loc': loc, 'num_lines': num_lines, 'view_source': view_source, 'source': source}
harmony_message = HarmonyMessage(
author=Author(role=Role.USER),
content=[TextContent(text=json.dumps(payload))],
recipient='browser.open',
)
result_text: str = ''
async for response in _browser_tool._process(harmony_message):
if response.content:
for content in response.content:
if isinstance(content, TextContent):
result_text += content.text
return result_text or f'Could not open: {id}'
async def _browser_find_async(pattern: str, cursor: int = -1) -> str:
payload = {'pattern': pattern, 'cursor': cursor}
harmony_message = HarmonyMessage(
author=Author(role=Role.USER),
content=[TextContent(text=json.dumps(payload))],
recipient='browser.find',
)
result_text: str = ''
async for response in _browser_tool._process(harmony_message):
if response.content:
for content in response.content:
if isinstance(content, TextContent):
result_text += content.text
return result_text or f'Pattern not found: {pattern}'
def browser_search(query: str, topn: int = 10, source: Optional[str] = None) -> str:
return asyncio.run(_browser_search_async(query=query, topn=topn, source=source))
def browser_open(id: int | str = -1, cursor: int = -1, loc: int = -1, num_lines: int = -1, *, view_source: bool = False, source: Optional[str] = None) -> str:
return asyncio.run(_browser_open_async(id=id, cursor=cursor, loc=loc, num_lines=num_lines, view_source=view_source, source=source))
def browser_find(pattern: str, cursor: int = -1) -> str:
return asyncio.run(_browser_find_async(pattern=pattern, cursor=cursor))
# Schema definitions for each browser tool
browser_search_schema = {
'type': 'function',
'function': {
'name': 'browser.search',
},
}
browser_open_schema = {
'type': 'function',
'function': {
'name': 'browser.open',
},
}
browser_find_schema = {
'type': 'function',
'function': {
'name': 'browser.find',
},
}
available_tools = {
'browser.search': browser_search,
'browser.open': browser_open,
'browser.find': browser_find,
}
model = 'gpt-oss:20b'
print('Model: ', model, '\n')
prompt = 'What is Ollama?'
print('You: ', prompt, '\n')
messages = [{'role': 'user', 'content': prompt}]
client = Client()
# gpt-oss can call tools while "thinking"
# a loop is needed to call the tools and get the results
final = True
while True:
response_stream: Iterator[ChatResponse] = client.chat(
model=model,
messages=messages,
tools=[browser_search_schema, browser_open_schema, browser_find_schema],
options={'num_ctx': 8192}, # 8192 is the recommended lower limit for the context window
stream=True,
)
tool_calls = []
thinking = ''
content = ''
for chunk in response_stream:
if chunk.message.tool_calls:
tool_calls.extend(chunk.message.tool_calls)
if chunk.message.content:
if not (chunk.message.thinking or chunk.message.thinking == '') and final:
heading('\n\nFinal result: ')
final = False
print(chunk.message.content, end='', flush=True)
if chunk.message.thinking:
thinking += chunk.message.thinking
print(chunk.message.thinking, end='', flush=True)
if thinking != '':
messages.append({'role': 'assistant', 'content': thinking, 'tool_calls': tool_calls})
print()
if tool_calls:
for tool_call in tool_calls:
tool_name = tool_call.function.name
args = tool_call.function.arguments or {}
function_to_call = available_tools.get(tool_name)
if function_to_call:
heading(f'\nCalling tool: {tool_name}')
if args:
print(f'Arguments: {args}')
try:
result = function_to_call(**args)
print(f'Tool result: {result[:200]}')
if len(result) > 200:
heading('... [truncated]')
print()
result_message = {'role': 'tool', 'content': result, 'tool_name': tool_name}
messages.append(result_message)
except Exception as e:
err = f'Error from {tool_name}: {e}'
print(err)
messages.append({'role': 'tool', 'content': err, 'tool_name': tool_name})
else:
print(f'Tool {tool_name} not found')
else:
# no more tool calls, we can stop the loop
break
+175
View File
@@ -0,0 +1,175 @@
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "gpt-oss",
# "ollama",
# "rich",
# ]
# ///
import asyncio
import json
from typing import Optional
from gpt_oss.tools.simple_browser import ExaBackend, SimpleBrowserTool
from openai_harmony import Author, Role, TextContent
from openai_harmony import Message as HarmonyMessage
from ollama import Client
_backend = ExaBackend(source='web')
_browser_tool = SimpleBrowserTool(backend=_backend)
def heading(text):
print(text)
print('=' * (len(text) + 3))
async def _browser_search_async(query: str, topn: int = 10, source: str | None = None) -> str:
# map Ollama message to Harmony format
harmony_message = HarmonyMessage(
author=Author(role=Role.USER),
content=[TextContent(text=json.dumps({'query': query, 'topn': topn}))],
recipient='browser.search',
)
result_text: str = ''
async for response in _browser_tool._process(harmony_message):
if response.content:
for content in response.content:
if isinstance(content, TextContent):
result_text += content.text
return result_text or f'No results for query: {query}'
async def _browser_open_async(id: int | str = -1, cursor: int = -1, loc: int = -1, num_lines: int = -1, *, view_source: bool = False, source: str | None = None) -> str:
payload = {'id': id, 'cursor': cursor, 'loc': loc, 'num_lines': num_lines, 'view_source': view_source, 'source': source}
harmony_message = HarmonyMessage(
author=Author(role=Role.USER),
content=[TextContent(text=json.dumps(payload))],
recipient='browser.open',
)
result_text: str = ''
async for response in _browser_tool._process(harmony_message):
if response.content:
for content in response.content:
if isinstance(content, TextContent):
result_text += content.text
return result_text or f'Could not open: {id}'
async def _browser_find_async(pattern: str, cursor: int = -1) -> str:
payload = {'pattern': pattern, 'cursor': cursor}
harmony_message = HarmonyMessage(
author=Author(role=Role.USER),
content=[TextContent(text=json.dumps(payload))],
recipient='browser.find',
)
result_text: str = ''
async for response in _browser_tool._process(harmony_message):
if response.content:
for content in response.content:
if isinstance(content, TextContent):
result_text += content.text
return result_text or f'Pattern not found: {pattern}'
def browser_search(query: str, topn: int = 10, source: Optional[str] = None) -> str:
return asyncio.run(_browser_search_async(query=query, topn=topn, source=source))
def browser_open(id: int | str = -1, cursor: int = -1, loc: int = -1, num_lines: int = -1, *, view_source: bool = False, source: Optional[str] = None) -> str:
return asyncio.run(_browser_open_async(id=id, cursor=cursor, loc=loc, num_lines=num_lines, view_source=view_source, source=source))
def browser_find(pattern: str, cursor: int = -1) -> str:
return asyncio.run(_browser_find_async(pattern=pattern, cursor=cursor))
# Schema definitions for each browser tool
browser_search_schema = {
'type': 'function',
'function': {
'name': 'browser.search',
},
}
browser_open_schema = {
'type': 'function',
'function': {
'name': 'browser.open',
},
}
browser_find_schema = {
'type': 'function',
'function': {
'name': 'browser.find',
},
}
available_tools = {
'browser.search': browser_search,
'browser.open': browser_open,
'browser.find': browser_find,
}
model = 'gpt-oss:20b'
print('Model: ', model, '\n')
prompt = 'What is Ollama?'
print('You: ', prompt, '\n')
messages = [{'role': 'user', 'content': prompt}]
client = Client()
while True:
response = client.chat(
model=model,
messages=messages,
tools=[browser_search_schema, browser_open_schema, browser_find_schema],
options={'num_ctx': 8192}, # 8192 is the recommended lower limit for the context window
)
if hasattr(response.message, 'thinking') and response.message.thinking:
heading('Thinking')
print(response.message.thinking.strip() + '\n')
if hasattr(response.message, 'content') and response.message.content:
heading('Assistant')
print(response.message.content.strip() + '\n')
# add message to chat history
messages.append(response.message)
if response.message.tool_calls:
for tool_call in response.message.tool_calls:
tool_name = tool_call.function.name
args = tool_call.function.arguments or {}
function_to_call = available_tools.get(tool_name)
if not function_to_call:
print(f'Unknown tool: {tool_name}')
continue
try:
result = function_to_call(**args)
heading(f'Tool: {tool_name}')
if args:
print(f'Arguments: {args}')
print(result[:200])
if len(result) > 200:
print('... [truncated]')
print()
messages.append({'role': 'tool', 'content': result, 'tool_name': tool_name})
except Exception as e:
err = f'Error from {tool_name}: {e}'
print(err)
messages.append({'role': 'tool', 'content': err, 'tool_name': tool_name})
else:
# break on no more tool calls
break
@@ -41,13 +41,9 @@ class CrawlClient(Protocol):
def crawl(self, urls: List[str]): ...
# ---- Constants ---------------------------------------------------------------
DEFAULT_VIEW_TOKENS = 1024
CAPPED_TOOL_CONTENT_LEN = 8000
# ---- Helpers ----------------------------------------------------------------
def cap_tool_content(text: str) -> str:
if not text:
@@ -68,9 +64,6 @@ def _safe_domain(u: str) -> str:
return u
# ---- BrowserState ------------------------------------------------------------
class BrowserState:
def __init__(self, initial_state: Optional[BrowserStateData] = None):
self._data = initial_state or BrowserStateData(view_tokens=DEFAULT_VIEW_TOKENS)
@@ -82,9 +75,6 @@ class BrowserState:
self._data = data
# ---- Browser ----------------------------------------------------------------
class Browser:
def __init__(
self,
@@ -203,8 +193,6 @@ class Browser:
return header + '\n'.join(body_lines)
# ---- page builders ----
def _build_search_results_page_collection(self, query: str, results: Dict[str, Any]) -> Page:
page = Page(
url=f'search_results_{query}',
@@ -338,8 +326,6 @@ class Browser:
find_page.lines = self._wrap_lines(find_page.text, 80)
return find_page
# ---- public API: search / open / find ------------------------------------
def search(self, *, query: str, topn: int = 5) -> Dict[str, Any]:
if not self._client:
raise RuntimeError('Client not provided')
-116
View File
@@ -1,116 +0,0 @@
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "mcp",
# "rich",
# "ollama",
# ]
# ///
"""
MCP stdio server exposing Ollama web_search and web_fetch as tools.
Environment:
- OLLAMA_API_KEY (required): if set, will be used as Authorization header.
"""
from __future__ import annotations
import asyncio
from typing import Any, Dict
from ollama import Client
try:
# Preferred high-level API (if available)
from mcp.server.fastmcp import FastMCP # type: ignore
_FASTMCP_AVAILABLE = True
except Exception:
_FASTMCP_AVAILABLE = False
if not _FASTMCP_AVAILABLE:
# Fallback to the low-level stdio server API
from mcp.server import Server # type: ignore
from mcp.server.stdio import stdio_server # type: ignore
client = Client()
def _web_search_impl(query: str, max_results: int = 3) -> Dict[str, Any]:
res = client.web_search(query=query, max_results=max_results)
return res.model_dump()
def _web_fetch_impl(url: str) -> Dict[str, Any]:
res = client.web_fetch(url=url)
return res.model_dump()
if _FASTMCP_AVAILABLE:
app = FastMCP('ollama-search-fetch')
@app.tool()
def web_search(query: str, max_results: int = 3) -> Dict[str, Any]:
"""
Perform a web search using Ollama's hosted search API.
Args:
query: The search query to run.
max_results: Maximum results to return (default: 3).
Returns:
JSON-serializable dict matching ollama.WebSearchResponse.model_dump()
"""
return _web_search_impl(query=query, max_results=max_results)
@app.tool()
def web_fetch(url: str) -> Dict[str, Any]:
"""
Fetch the content of a web page for the provided URL.
Args:
url: The absolute URL to fetch.
Returns:
JSON-serializable dict matching ollama.WebFetchResponse.model_dump()
"""
return _web_fetch_impl(url=url)
if __name__ == '__main__':
app.run()
else:
server = Server('ollama-search-fetch') # type: ignore[name-defined]
@server.tool() # type: ignore[attr-defined]
async def web_search(query: str, max_results: int = 3) -> Dict[str, Any]:
"""
Perform a web search using Ollama's hosted search API.
Args:
query: The search query to run.
max_results: Maximum results to return (default: 3).
"""
return await asyncio.to_thread(_web_search_impl, query, max_results)
@server.tool() # type: ignore[attr-defined]
async def web_fetch(url: str) -> Dict[str, Any]:
"""
Fetch the content of a web page for the provided URL.
Args:
url: The absolute URL to fetch.
"""
return await asyncio.to_thread(_web_fetch_impl, url)
async def _main() -> None:
async with stdio_server() as (read, write): # type: ignore[name-defined]
await server.run(read, write) # type: ignore[attr-defined]
if __name__ == '__main__':
asyncio.run(_main())
+1 -70
View File
@@ -1,4 +1,3 @@
import contextlib
import ipaddress
import json
import os
@@ -76,7 +75,7 @@ from ollama._types import (
T = TypeVar('T')
class BaseClient(contextlib.AbstractContextManager, contextlib.AbstractAsyncContextManager):
class BaseClient:
def __init__(
self,
client,
@@ -117,12 +116,6 @@ class BaseClient(contextlib.AbstractContextManager, contextlib.AbstractAsyncCont
**kwargs,
)
def __exit__(self, exc_type, exc_val, exc_tb):
self.close()
async def __aexit__(self, exc_type, exc_val, exc_tb):
await self.close()
CONNECTION_ERROR_MESSAGE = 'Failed to connect to Ollama. Please check that Ollama is downloaded, running and accessible. https://ollama.com/download'
@@ -131,9 +124,6 @@ class Client(BaseClient):
def __init__(self, host: Optional[str] = None, **kwargs) -> None:
super().__init__(httpx.Client, host, **kwargs)
def close(self):
self._client.close()
def _request_raw(self, *args, **kwargs):
try:
r = self._client.request(*args, **kwargs)
@@ -210,16 +200,11 @@ class Client(BaseClient):
context: Optional[Sequence[int]] = None,
stream: Literal[False] = False,
think: Optional[bool] = None,
logprobs: Optional[bool] = None,
top_logprobs: Optional[int] = None,
raw: bool = False,
format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
images: Optional[Sequence[Union[str, bytes, Image]]] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
width: Optional[int] = None,
height: Optional[int] = None,
steps: Optional[int] = None,
) -> GenerateResponse: ...
@overload
@@ -234,16 +219,11 @@ class Client(BaseClient):
context: Optional[Sequence[int]] = None,
stream: Literal[True] = True,
think: Optional[bool] = None,
logprobs: Optional[bool] = None,
top_logprobs: Optional[int] = None,
raw: bool = False,
format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
images: Optional[Sequence[Union[str, bytes, Image]]] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
width: Optional[int] = None,
height: Optional[int] = None,
steps: Optional[int] = None,
) -> Iterator[GenerateResponse]: ...
def generate(
@@ -257,16 +237,11 @@ class Client(BaseClient):
context: Optional[Sequence[int]] = None,
stream: bool = False,
think: Optional[bool] = None,
logprobs: Optional[bool] = None,
top_logprobs: Optional[int] = None,
raw: Optional[bool] = None,
format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
images: Optional[Sequence[Union[str, bytes, Image]]] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
width: Optional[int] = None,
height: Optional[int] = None,
steps: Optional[int] = None,
) -> Union[GenerateResponse, Iterator[GenerateResponse]]:
"""
Create a response using the requested model.
@@ -291,16 +266,11 @@ class Client(BaseClient):
context=context,
stream=stream,
think=think,
logprobs=logprobs,
top_logprobs=top_logprobs,
raw=raw,
format=format,
images=list(_copy_images(images)) if images else None,
options=options,
keep_alive=keep_alive,
width=width,
height=height,
steps=steps,
).model_dump(exclude_none=True),
stream=stream,
)
@@ -314,8 +284,6 @@ class Client(BaseClient):
tools: Optional[Sequence[Union[Mapping[str, Any], Tool, Callable]]] = None,
stream: Literal[False] = False,
think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
logprobs: Optional[bool] = None,
top_logprobs: Optional[int] = None,
format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
@@ -330,8 +298,6 @@ class Client(BaseClient):
tools: Optional[Sequence[Union[Mapping[str, Any], Tool, Callable]]] = None,
stream: Literal[True] = True,
think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
logprobs: Optional[bool] = None,
top_logprobs: Optional[int] = None,
format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
@@ -345,8 +311,6 @@ class Client(BaseClient):
tools: Optional[Sequence[Union[Mapping[str, Any], Tool, Callable]]] = None,
stream: bool = False,
think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
logprobs: Optional[bool] = None,
top_logprobs: Optional[int] = None,
format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
@@ -394,8 +358,6 @@ class Client(BaseClient):
tools=list(_copy_tools(tools)),
stream=stream,
think=think,
logprobs=logprobs,
top_logprobs=top_logprobs,
format=format,
options=options,
keep_alive=keep_alive,
@@ -724,9 +686,6 @@ class AsyncClient(BaseClient):
def __init__(self, host: Optional[str] = None, **kwargs) -> None:
super().__init__(httpx.AsyncClient, host, **kwargs)
async def close(self):
await self._client.aclose()
async def _request_raw(self, *args, **kwargs):
try:
r = await self._client.request(*args, **kwargs)
@@ -843,16 +802,11 @@ class AsyncClient(BaseClient):
context: Optional[Sequence[int]] = None,
stream: Literal[False] = False,
think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
logprobs: Optional[bool] = None,
top_logprobs: Optional[int] = None,
raw: bool = False,
format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
images: Optional[Sequence[Union[str, bytes, Image]]] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
width: Optional[int] = None,
height: Optional[int] = None,
steps: Optional[int] = None,
) -> GenerateResponse: ...
@overload
@@ -867,16 +821,11 @@ class AsyncClient(BaseClient):
context: Optional[Sequence[int]] = None,
stream: Literal[True] = True,
think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
logprobs: Optional[bool] = None,
top_logprobs: Optional[int] = None,
raw: bool = False,
format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
images: Optional[Sequence[Union[str, bytes, Image]]] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
width: Optional[int] = None,
height: Optional[int] = None,
steps: Optional[int] = None,
) -> AsyncIterator[GenerateResponse]: ...
async def generate(
@@ -890,16 +839,11 @@ class AsyncClient(BaseClient):
context: Optional[Sequence[int]] = None,
stream: bool = False,
think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
logprobs: Optional[bool] = None,
top_logprobs: Optional[int] = None,
raw: Optional[bool] = None,
format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
images: Optional[Sequence[Union[str, bytes, Image]]] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
width: Optional[int] = None,
height: Optional[int] = None,
steps: Optional[int] = None,
) -> Union[GenerateResponse, AsyncIterator[GenerateResponse]]:
"""
Create a response using the requested model.
@@ -923,16 +867,11 @@ class AsyncClient(BaseClient):
context=context,
stream=stream,
think=think,
logprobs=logprobs,
top_logprobs=top_logprobs,
raw=raw,
format=format,
images=list(_copy_images(images)) if images else None,
options=options,
keep_alive=keep_alive,
width=width,
height=height,
steps=steps,
).model_dump(exclude_none=True),
stream=stream,
)
@@ -946,8 +885,6 @@ class AsyncClient(BaseClient):
tools: Optional[Sequence[Union[Mapping[str, Any], Tool, Callable]]] = None,
stream: Literal[False] = False,
think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
logprobs: Optional[bool] = None,
top_logprobs: Optional[int] = None,
format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
@@ -962,8 +899,6 @@ class AsyncClient(BaseClient):
tools: Optional[Sequence[Union[Mapping[str, Any], Tool, Callable]]] = None,
stream: Literal[True] = True,
think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
logprobs: Optional[bool] = None,
top_logprobs: Optional[int] = None,
format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
@@ -977,8 +912,6 @@ class AsyncClient(BaseClient):
tools: Optional[Sequence[Union[Mapping[str, Any], Tool, Callable]]] = None,
stream: bool = False,
think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
logprobs: Optional[bool] = None,
top_logprobs: Optional[int] = None,
format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
options: Optional[Union[Mapping[str, Any], Options]] = None,
keep_alive: Optional[Union[float, str]] = None,
@@ -1027,8 +960,6 @@ class AsyncClient(BaseClient):
tools=list(_copy_tools(tools)),
stream=stream,
think=think,
logprobs=logprobs,
top_logprobs=top_logprobs,
format=format,
options=options,
keep_alive=keep_alive,
+1 -53
View File
@@ -210,22 +210,6 @@ class GenerateRequest(BaseGenerateRequest):
think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None
'Enable thinking mode (for thinking models).'
logprobs: Optional[bool] = None
'Return log probabilities for generated tokens.'
top_logprobs: Optional[int] = None
'Number of alternative tokens and log probabilities to include per position (0-20).'
# Experimental image generation parameters
width: Optional[int] = None
'Width of the generated image in pixels (for image generation models).'
height: Optional[int] = None
'Height of the generated image in pixels (for image generation models).'
steps: Optional[int] = None
'Number of diffusion steps (for image generation models).'
class BaseGenerateResponse(SubscriptableBaseModel):
model: Optional[str] = None
@@ -259,25 +243,12 @@ class BaseGenerateResponse(SubscriptableBaseModel):
'Duration of evaluating inference in nanoseconds.'
class TokenLogprob(SubscriptableBaseModel):
token: str
'Token text.'
logprob: float
'Log probability for the token.'
class Logprob(TokenLogprob):
top_logprobs: Optional[Sequence[TokenLogprob]] = None
'Most likely tokens and their log probabilities.'
class GenerateResponse(BaseGenerateResponse):
"""
Response returned by generate requests.
"""
response: Optional[str] = None
response: str
'Response content. When streaming, this contains a fragment of the response.'
thinking: Optional[str] = None
@@ -286,20 +257,6 @@ class GenerateResponse(BaseGenerateResponse):
context: Optional[Sequence[int]] = None
'Tokenized history up to the point of the response.'
logprobs: Optional[Sequence[Logprob]] = None
'Log probabilities for generated tokens.'
# Image generation response fields
image: Optional[str] = None
'Base64-encoded generated image data (for image generation models).'
# Streaming progress fields (for image generation)
completed: Optional[int] = None
'Number of completed steps (for image generation streaming).'
total: Optional[int] = None
'Total number of steps (for image generation streaming).'
class Message(SubscriptableBaseModel):
"""
@@ -403,12 +360,6 @@ class ChatRequest(BaseGenerateRequest):
think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None
'Enable thinking mode (for thinking models).'
logprobs: Optional[bool] = None
'Return log probabilities for generated tokens.'
top_logprobs: Optional[int] = None
'Number of alternative tokens and log probabilities to include per position (0-20).'
class ChatResponse(BaseGenerateResponse):
"""
@@ -418,9 +369,6 @@ class ChatResponse(BaseGenerateResponse):
message: Message
'Response message.'
logprobs: Optional[Sequence[Logprob]] = None
'Log probabilities for generated tokens if requested.'
class EmbedRequest(BaseRequest):
input: Union[str, Sequence[str]]
+1 -1
View File
@@ -37,7 +37,7 @@ dependencies = [ 'ruff>=0.9.1' ]
config-path = 'none'
[tool.ruff]
line-length = 320
line-length = 999
indent-width = 2
[tool.ruff.format]
-211
View File
@@ -61,44 +61,6 @@ def test_client_chat(httpserver: HTTPServer):
assert response['message']['content'] == "I don't know."
def test_client_chat_with_logprobs(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/chat',
method='POST',
json={
'model': 'dummy',
'messages': [{'role': 'user', 'content': 'Hi'}],
'tools': [],
'stream': False,
'logprobs': True,
'top_logprobs': 3,
},
).respond_with_json(
{
'model': 'dummy',
'message': {
'role': 'assistant',
'content': 'Hello',
},
'logprobs': [
{
'token': 'Hello',
'logprob': -0.1,
'top_logprobs': [
{'token': 'Hello', 'logprob': -0.1},
{'token': 'Hi', 'logprob': -1.0},
],
}
],
}
)
client = Client(httpserver.url_for('/'))
response = client.chat('dummy', messages=[{'role': 'user', 'content': 'Hi'}], logprobs=True, top_logprobs=3)
assert response['logprobs'][0]['token'] == 'Hello'
assert response['logprobs'][0]['top_logprobs'][1]['token'] == 'Hi'
def test_client_chat_stream(httpserver: HTTPServer):
def stream_handler(_: Request):
def generate():
@@ -332,40 +294,6 @@ def test_client_generate(httpserver: HTTPServer):
assert response['response'] == 'Because it is.'
def test_client_generate_with_logprobs(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/generate',
method='POST',
json={
'model': 'dummy',
'prompt': 'Why',
'stream': False,
'logprobs': True,
'top_logprobs': 2,
},
).respond_with_json(
{
'model': 'dummy',
'response': 'Hello',
'logprobs': [
{
'token': 'Hello',
'logprob': -0.2,
'top_logprobs': [
{'token': 'Hello', 'logprob': -0.2},
{'token': 'Hi', 'logprob': -1.5},
],
}
],
}
)
client = Client(httpserver.url_for('/'))
response = client.generate('dummy', 'Why', logprobs=True, top_logprobs=2)
assert response['logprobs'][0]['token'] == 'Hello'
assert response['logprobs'][0]['top_logprobs'][1]['token'] == 'Hi'
def test_client_generate_with_image_type(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/generate',
@@ -568,115 +496,6 @@ async def test_async_client_generate_format_pydantic(httpserver: HTTPServer):
assert response['response'] == '{"answer": "Because of Rayleigh scattering", "confidence": 0.95}'
def test_client_generate_image(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/generate',
method='POST',
json={
'model': 'dummy-image',
'prompt': 'a sunset over mountains',
'stream': False,
'width': 1024,
'height': 768,
'steps': 20,
},
).respond_with_json(
{
'model': 'dummy-image',
'image': PNG_BASE64,
'done': True,
'done_reason': 'stop',
}
)
client = Client(httpserver.url_for('/'))
response = client.generate('dummy-image', 'a sunset over mountains', width=1024, height=768, steps=20)
assert response['model'] == 'dummy-image'
assert response['image'] == PNG_BASE64
assert response['done'] is True
def test_client_generate_image_stream(httpserver: HTTPServer):
def stream_handler(_: Request):
def generate():
# Progress updates
for i in range(1, 4):
yield (
json.dumps(
{
'model': 'dummy-image',
'completed': i,
'total': 3,
'done': False,
}
)
+ '\n'
)
# Final response with image
yield (
json.dumps(
{
'model': 'dummy-image',
'image': PNG_BASE64,
'done': True,
'done_reason': 'stop',
}
)
+ '\n'
)
return Response(generate())
httpserver.expect_ordered_request(
'/api/generate',
method='POST',
json={
'model': 'dummy-image',
'prompt': 'a sunset over mountains',
'stream': True,
'width': 512,
'height': 512,
},
).respond_with_handler(stream_handler)
client = Client(httpserver.url_for('/'))
response = client.generate('dummy-image', 'a sunset over mountains', stream=True, width=512, height=512)
parts = list(response)
# Check progress updates
assert parts[0]['completed'] == 1
assert parts[0]['total'] == 3
assert parts[0]['done'] is False
# Check final response
assert parts[-1]['image'] == PNG_BASE64
assert parts[-1]['done'] is True
async def test_async_client_generate_image(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/generate',
method='POST',
json={
'model': 'dummy-image',
'prompt': 'a robot painting',
'stream': False,
'width': 1024,
'height': 1024,
},
).respond_with_json(
{
'model': 'dummy-image',
'image': PNG_BASE64,
'done': True,
}
)
client = AsyncClient(httpserver.url_for('/'))
response = await client.generate('dummy-image', 'a robot painting', width=1024, height=1024)
assert response['model'] == 'dummy-image'
assert response['image'] == PNG_BASE64
def test_client_pull(httpserver: HTTPServer):
httpserver.expect_ordered_request(
'/api/pull',
@@ -1456,33 +1275,3 @@ def test_client_explicit_bearer_header_overrides_env(monkeypatch: pytest.MonkeyP
client = Client(headers={'Authorization': 'Bearer explicit-token'})
assert client._client.headers['authorization'] == 'Bearer explicit-token'
client.web_search('override check')
def test_client_close():
client = Client()
client.close()
assert client._client.is_closed
@pytest.mark.anyio
async def test_async_client_close():
client = AsyncClient()
await client.close()
assert client._client.is_closed
def test_client_context_manager():
with Client() as client:
assert isinstance(client, Client)
assert not client._client.is_closed
assert client._client.is_closed
@pytest.mark.anyio
async def test_async_client_context_manager():
async with AsyncClient() as client:
assert isinstance(client, AsyncClient)
assert not client._client.is_closed
assert client._client.is_closed