removed comments

remove unused import
formatting fix
2026-06-15 12:44:50 +00:00 · 2025-09-24 11:49:19 -07:00 · 2025-09-24 11:44:55 -07:00 · 2025-09-24 11:43:33 -07:00 · 2025-09-23 21:34:22 -07:00 · 2025-09-23 18:07:53 -07:00
18 changed files with 445 additions and 714 deletions
@@ -13,7 +13,7 @@ jobs:
      id-token: write
      contents: write
    steps:
-      - uses: actions/checkout@v6
+      - uses: actions/checkout@v5
      - uses: actions/setup-python@v6
      - uses: astral-sh/setup-uv@v5
        with:
@@ -10,7 +10,7 @@ jobs:
  test:
    runs-on: ubuntu-latest
    steps:
-      - uses: actions/checkout@v6
+      - uses: actions/checkout@v5
      - uses: astral-sh/setup-uv@v5
        with:
          enable-cache: true
@@ -19,7 +19,7 @@ jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
-      - uses: actions/checkout@v6
+      - uses: actions/checkout@v5
      - uses: actions/setup-python@v6
      - uses: astral-sh/setup-uv@v5
        with:
@@ -50,82 +50,6 @@ for chunk in stream:
  print(chunk['message']['content'], end='', flush=True)
 ```

-## Cloud Models
-
-Run larger models by offloading to Ollama’s cloud while keeping your local workflow.
-
- Supported models: `deepseek-v3.1:671b-cloud`, `gpt-oss:20b-cloud`, `gpt-oss:120b-cloud`, `kimi-k2:1t-cloud`, `qwen3-coder:480b-cloud`, `kimi-k2-thinking` See [Ollama Models - Cloud](https://ollama.com/search?c=cloud) for more information
-
-### Run via local Ollama
-
-1) Sign in (one-time):
-
-```
-ollama signin
-```
-
-2) Pull a cloud model:
-
-```
-ollama pull gpt-oss:120b-cloud
-```
-
-3) Make a request:
-
-```python
-from ollama import Client
-
-client = Client()
-
-messages = [
-  {
-    'role': 'user',
-    'content': 'Why is the sky blue?',
-  },
-]
-
-for part in client.chat('gpt-oss:120b-cloud', messages=messages, stream=True):
-  print(part.message.content, end='', flush=True)
-```
-
-### Cloud API (ollama.com)
-
-Access cloud models directly by pointing the client at `https://ollama.com`.
-
-1) Create an API key from [ollama.com](https://ollama.com/settings/keys) , then set:
-
-```
-export OLLAMA_API_KEY=your_api_key
-```
-
-2) (Optional) List models available via the API:
-
-```
-curl https://ollama.com/api/tags
-```
-
-3) Generate a response via the cloud API:
-
-```python
-import os
-from ollama import Client
-
-client = Client(
-    host='https://ollama.com',
-    headers={'Authorization': 'Bearer ' + os.environ.get('OLLAMA_API_KEY')}
-)
-
-messages = [
-  {
-    'role': 'user',
-    'content': 'Why is the sky blue?',
-  },
-]
-
-for part in client.chat('gpt-oss:120b', messages=messages, stream=True):
-  print(part.message.content, end='', flush=True)
-```
-
 ## Custom client
 A custom client can be created by instantiating `Client` or `AsyncClient` from `ollama`.

@@ -250,6 +174,7 @@ ollama.embed(model='gemma3', input=['The sky is blue because of rayleigh scatter
 ollama.ps()
 ```

+
 ## Errors

 Errors are raised if requests return an error status or if an error is detected while streaming.
@@ -1,129 +1,80 @@
 # Running Examples

 Run the examples in this directory with:
-
 ```sh
 # Run example
 python3 examples/<example>.py
-
-# or with uv
-uv run examples/<example>.py
 ```

 See [ollama/docs/api.md](https://github.com/ollama/ollama/blob/main/docs/api.md) for full API documentation

 ### Chat - Chat with a model
-
 - [chat.py](chat.py)
 - [async-chat.py](async-chat.py)
 - [chat-stream.py](chat-stream.py) - Streamed outputs
 - [chat-with-history.py](chat-with-history.py) - Chat with model and maintain history of the conversation

-### Generate - Generate text with a model

+### Generate - Generate text with a model
 - [generate.py](generate.py)
 - [async-generate.py](async-generate.py)
 - [generate-stream.py](generate-stream.py) - Streamed outputs
 - [fill-in-middle.py](fill-in-middle.py) - Given a prefix and suffix, fill in the middle

-### Tools/Function Calling - Call a function with a model

+### Tools/Function Calling - Call a function with a model
 - [tools.py](tools.py) - Simple example of Tools/Function Calling
 - [async-tools.py](async-tools.py)
 - [multi-tool.py](multi-tool.py) - Using multiple tools, with thinking enabled

-#### gpt-oss
-
+#### gpt-oss
 - [gpt-oss-tools.py](gpt-oss-tools.py)
- [gpt-oss-tools-stream.py](gpt-oss-tools-stream.py)
+- [gpt-oss-tools-stream.py](gpt-oss-tools-stream.py) 
+- [gpt-oss-tools-browser.py](gpt-oss-tools-browser.py) - Using browser research tools with gpt-oss
+- [gpt-oss-tools-browser-stream.py](gpt-oss-tools-browser-stream.py) - Using browser research tools with gpt-oss, with streaming enabled

-### Web search
-
-An API key from Ollama's cloud service is required. You can create one [here](https://ollama.com/settings/keys).
-
-```shell
-export OLLAMA_API_KEY="your_api_key_here"
-```
-
- [web-search.py](web-search.py)
- [web-search-gpt-oss.py](web-search-gpt-oss.py) - Using browser research tools with gpt-oss
-
-#### MCP server
-
-The MCP server can be used with an MCP client like Cursor, Cline, Codex, Open WebUI, Goose, and more.
-
-```sh
-uv run examples/web-search-mcp.py
-```
-
-Configuration to use with an MCP client:
-
-```json
-{
-  "mcpServers": {
-    "web_search": {
-      "type": "stdio",
-      "command": "uv",
-      "args": ["run", "path/to/ollama-python/examples/web-search-mcp.py"],
-      "env": { "OLLAMA_API_KEY": "your_api_key_here" }
-    }
-  }
-}
-```
-
- [web-search-mcp.py](web-search-mcp.py)

 ### Multimodal with Images - Chat with a multimodal (image chat) model
-
 - [multimodal-chat.py](multimodal-chat.py)
 - [multimodal-generate.py](multimodal-generate.py)

-### Image Generation (Experimental) - Generate images with a model
-
-> **Note:** Image generation is experimental and currently only available on macOS.
-
- [generate-image.py](generate-image.py)

 ### Structured Outputs - Generate structured outputs with a model
-
 - [structured-outputs.py](structured-outputs.py)
 - [async-structured-outputs.py](async-structured-outputs.py)
 - [structured-outputs-image.py](structured-outputs-image.py)

-### Ollama List - List all downloaded models and their properties

+### Ollama List - List all downloaded models and their properties
 - [list.py](list.py)

-### Ollama Show - Display model properties and capabilities

+### Ollama Show - Display model properties and capabilities
 - [show.py](show.py)

-### Ollama ps - Show model status with CPU/GPU usage

+### Ollama ps - Show model status with CPU/GPU usage
 - [ps.py](ps.py)

+
 ### Ollama Pull - Pull a model from Ollama
-
 Requirement: `pip install tqdm`
+- [pull.py](pull.py) 

- [pull.py](pull.py)

 ### Ollama Create - Create a model from a Modelfile
+- [create.py](create.py) 

- [create.py](create.py)

 ### Ollama Embed - Generate embeddings with a model
-
 - [embed.py](embed.py)

-### Thinking - Enable thinking mode for a model

+### Thinking - Enable thinking mode for a model
 - [thinking.py](thinking.py)

 ### Thinking (generate) - Enable thinking mode for a model
-
 - [thinking-generate.py](thinking-generate.py)

 ### Thinking (levels) - Choose the thinking level
-
 - [thinking-levels.py](thinking-levels.py)
@@ -1,31 +0,0 @@
-from typing import Iterable
-
-import ollama
-
-
-def print_logprobs(logprobs: Iterable[dict], label: str) -> None:
-  print(f'\n{label}:')
-  for entry in logprobs:
-    token = entry.get('token', '')
-    logprob = entry.get('logprob')
-    print(f'  token={token!r:<12} logprob={logprob:.3f}')
-    for alt in entry.get('top_logprobs', []):
-      if alt['token'] != token:
-        print(f'    alt -> {alt["token"]!r:<12} ({alt["logprob"]:.3f})')
-
-
-messages = [
-  {
-    'role': 'user',
-    'content': 'hi! be concise.',
-  },
-]
-
-response = ollama.chat(
-  model='gemma3',
-  messages=messages,
-  logprobs=True,
-  top_logprobs=3,
-)
-print('Chat response:', response['message']['content'])
-print_logprobs(response.get('logprobs', []), 'chat logprobs')
@@ -15,8 +15,7 @@ messages = [
  },
  {
    'role': 'assistant',
-    'content': """The weather in Tokyo is typically warm and humid during the summer months, with temperatures often exceeding 30°C (86°F). The city experiences a rainy season from June to September, with heavy rainfall and occasional typhoons. Winter is mild, with temperatures
-    rarely dropping below freezing. The city is known for its high-tech and vibrant culture, with many popular tourist attractions such as the Tokyo Tower, Senso-ji Temple, and the bustling Shibuya district.""",
+    'content': 'The weather in Tokyo is typically warm and humid during the summer months, with temperatures often exceeding 30°C (86°F). The city experiences a rainy season from June to September, with heavy rainfall and occasional typhoons. Winter is mild, with temperatures rarely dropping below freezing. The city is known for its high-tech and vibrant culture, with many popular tourist attractions such as the Tokyo Tower, Senso-ji Temple, and the bustling Shibuya district.',
  },
 ]

@@ -1,18 +0,0 @@
-# Image generation is experimental and currently only available on macOS
-
-import base64
-
-from ollama import generate
-
-prompt = 'a sunset over mountains'
-print(f'Prompt: {prompt}')
-
-for response in generate(model='x/z-image-turbo', prompt=prompt, stream=True):
-  if response.image:
-    # Final response contains the image
-    with open('output.png', 'wb') as f:
-      f.write(base64.b64decode(response.image))
-    print('\nImage saved to output.png')
-  elif response.total:
-    # Progress update
-    print(f'Progress: {response.completed or 0}/{response.total}', end='\r')
@@ -1,24 +0,0 @@
-from typing import Iterable
-
-import ollama
-
-
-def print_logprobs(logprobs: Iterable[dict], label: str) -> None:
-  print(f'\n{label}:')
-  for entry in logprobs:
-    token = entry.get('token', '')
-    logprob = entry.get('logprob')
-    print(f'  token={token!r:<12} logprob={logprob:.3f}')
-    for alt in entry.get('top_logprobs', []):
-      if alt['token'] != token:
-        print(f'    alt -> {alt["token"]!r:<12} ({alt["logprob"]:.3f})')
-
-
-response = ollama.generate(
-  model='gemma3',
-  prompt='hi! be concise.',
-  logprobs=True,
-  top_logprobs=3,
-)
-print('Generate response:', response['response'])
-print_logprobs(response.get('logprobs', []), 'generate logprobs')
@@ -1,12 +1,8 @@
-# /// script
-# requires-python = ">=3.11"
-# dependencies = [
-#     "ollama",
-# ]
-# ///
+from __future__ import annotations
+
 from typing import Any, Dict, List

-from web_search_gpt_oss_helper import Browser
+from gpt_oss_browser_tool_helper import Browser

 from ollama import Client

@@ -15,6 +11,52 @@ def main() -> None:
  client = Client()
  browser = Browser(initial_state=None, client=client)

+  browser_search_schema = {
+    'type': 'function',
+    'function': {
+      'name': 'browser.search',
+      'parameters': {
+        'type': 'object',
+        'properties': {
+          'query': {'type': 'string'},
+          'topn': {'type': 'integer'},
+        },
+        'required': ['query'],
+      },
+    },
+  }
+
+  browser_open_schema = {
+    'type': 'function',
+    'function': {
+      'name': 'browser.open',
+      'parameters': {
+        'type': 'object',
+        'properties': {
+          'id': {'anyOf': [{'type': 'integer'}, {'type': 'string'}]},
+          'cursor': {'type': 'integer'},
+          'loc': {'type': 'integer'},
+          'num_lines': {'type': 'integer'},
+        },
+      },
+    },
+  }
+
+  browser_find_schema = {
+    'type': 'function',
+    'function': {
+      'name': 'browser.find',
+      'parameters': {
+        'type': 'object',
+        'properties': {
+          'pattern': {'type': 'string'},
+          'cursor': {'type': 'integer'},
+        },
+        'required': ['pattern'],
+      },
+    },
+  }
+
  def browser_search(query: str, topn: int = 10) -> str:
    return browser.search(query=query, topn=topn)['pageText']

@@ -24,41 +66,19 @@ def main() -> None:
  def browser_find(pattern: str, cursor: int = -1, **_: Any) -> str:
    return browser.find(pattern=pattern, cursor=cursor)['pageText']

-  browser_search_schema = {
-    'type': 'function',
-    'function': {
-      'name': 'browser.search',
-    },
-  }
-
-  browser_open_schema = {
-    'type': 'function',
-    'function': {
-      'name': 'browser.open',
-    },
-  }
-
-  browser_find_schema = {
-    'type': 'function',
-    'function': {
-      'name': 'browser.find',
-    },
-  }
-
  available_tools = {
    'browser.search': browser_search,
    'browser.open': browser_open,
    'browser.find': browser_find,
  }
-
-  query = "what is ollama's new engine"
+  query = 'What is Ollama.com?'
  print('Prompt:', query, '\n')

  messages: List[Dict[str, Any]] = [{'role': 'user', 'content': query}]

  while True:
    resp = client.chat(
-      model='gpt-oss:120b-cloud',
+      model='gpt-oss',
      messages=messages,
      tools=[browser_search_schema, browser_open_schema, browser_find_schema],
      think=True,
@@ -80,7 +100,6 @@ def main() -> None:
    for tc in resp.message.tool_calls:
      tool_name = tc.function.name
      args = tc.function.arguments or {}
-      print(f'Tool name: {tool_name}, args: {args}')
      fn = available_tools.get(tool_name)
      if not fn:
        messages.append({'role': 'tool', 'content': f'Tool {tool_name} not found', 'tool_name': tool_name})
@@ -88,7 +107,6 @@ def main() -> None:

      try:
        result_text = fn(**args)
-        print('Result: ', result_text[:200] + '...')
      except Exception as e:
        result_text = f'Error from {tool_name}: {e}'

@@ -0,0 +1,198 @@
+# /// script
+# requires-python = ">=3.11"
+# dependencies = [
+#     "gpt-oss",
+#     "ollama",
+#     "rich",
+# ]
+# ///
+
+import asyncio
+import json
+from typing import Iterator, Optional
+
+from gpt_oss.tools.simple_browser import ExaBackend, SimpleBrowserTool
+from openai_harmony import Author, Role, TextContent
+from openai_harmony import Message as HarmonyMessage
+from rich import print
+
+from ollama import Client
+from ollama._types import ChatResponse
+
+_backend = ExaBackend(source='web')
+_browser_tool = SimpleBrowserTool(backend=_backend)
+
+
+def heading(text):
+  print(text)
+  print('=' * (len(text) + 3))
+
+
+async def _browser_search_async(query: str, topn: int = 10, source: str | None = None) -> str:
+  # map Ollama message to Harmony format
+  harmony_message = HarmonyMessage(
+    author=Author(role=Role.USER),
+    content=[TextContent(text=json.dumps({'query': query, 'topn': topn}))],
+    recipient='browser.search',
+  )
+
+  result_text: str = ''
+  async for response in _browser_tool._process(harmony_message):
+    if response.content:
+      for content in response.content:
+        if isinstance(content, TextContent):
+          result_text += content.text
+  return result_text or f'No results for query: {query}'
+
+
+async def _browser_open_async(id: int | str = -1, cursor: int = -1, loc: int = -1, num_lines: int = -1, *, view_source: bool = False, source: str | None = None) -> str:
+  payload = {'id': id, 'cursor': cursor, 'loc': loc, 'num_lines': num_lines, 'view_source': view_source, 'source': source}
+
+  harmony_message = HarmonyMessage(
+    author=Author(role=Role.USER),
+    content=[TextContent(text=json.dumps(payload))],
+    recipient='browser.open',
+  )
+
+  result_text: str = ''
+  async for response in _browser_tool._process(harmony_message):
+    if response.content:
+      for content in response.content:
+        if isinstance(content, TextContent):
+          result_text += content.text
+  return result_text or f'Could not open: {id}'
+
+
+async def _browser_find_async(pattern: str, cursor: int = -1) -> str:
+  payload = {'pattern': pattern, 'cursor': cursor}
+
+  harmony_message = HarmonyMessage(
+    author=Author(role=Role.USER),
+    content=[TextContent(text=json.dumps(payload))],
+    recipient='browser.find',
+  )
+
+  result_text: str = ''
+  async for response in _browser_tool._process(harmony_message):
+    if response.content:
+      for content in response.content:
+        if isinstance(content, TextContent):
+          result_text += content.text
+  return result_text or f'Pattern not found: {pattern}'
+
+
+def browser_search(query: str, topn: int = 10, source: Optional[str] = None) -> str:
+  return asyncio.run(_browser_search_async(query=query, topn=topn, source=source))
+
+
+def browser_open(id: int | str = -1, cursor: int = -1, loc: int = -1, num_lines: int = -1, *, view_source: bool = False, source: Optional[str] = None) -> str:
+  return asyncio.run(_browser_open_async(id=id, cursor=cursor, loc=loc, num_lines=num_lines, view_source=view_source, source=source))
+
+
+def browser_find(pattern: str, cursor: int = -1) -> str:
+  return asyncio.run(_browser_find_async(pattern=pattern, cursor=cursor))
+
+
+# Schema definitions for each browser tool
+browser_search_schema = {
+  'type': 'function',
+  'function': {
+    'name': 'browser.search',
+  },
+}
+
+browser_open_schema = {
+  'type': 'function',
+  'function': {
+    'name': 'browser.open',
+  },
+}
+
+browser_find_schema = {
+  'type': 'function',
+  'function': {
+    'name': 'browser.find',
+  },
+}
+
+available_tools = {
+  'browser.search': browser_search,
+  'browser.open': browser_open,
+  'browser.find': browser_find,
+}
+
+
+model = 'gpt-oss:20b'
+print('Model: ', model, '\n')
+
+prompt = 'What is Ollama?'
+print('You: ', prompt, '\n')
+messages = [{'role': 'user', 'content': prompt}]
+
+client = Client()
+
+# gpt-oss can call tools while "thinking"
+# a loop is needed to call the tools and get the results
+final = True
+while True:
+  response_stream: Iterator[ChatResponse] = client.chat(
+    model=model,
+    messages=messages,
+    tools=[browser_search_schema, browser_open_schema, browser_find_schema],
+    options={'num_ctx': 8192},  # 8192 is the recommended lower limit for the context window
+    stream=True,
+  )
+
+  tool_calls = []
+  thinking = ''
+  content = ''
+
+  for chunk in response_stream:
+    if chunk.message.tool_calls:
+      tool_calls.extend(chunk.message.tool_calls)
+
+    if chunk.message.content:
+      if not (chunk.message.thinking or chunk.message.thinking == '') and final:
+        heading('\n\nFinal result: ')
+        final = False
+      content += chunk.message.content
+      print(chunk.message.content, end='', flush=True)
+
+    if chunk.message.thinking:
+      thinking += chunk.message.thinking
+      print(chunk.message.thinking, end='', flush=True)
+
+  if thinking != '' or content != '' or tool_calls:
+    messages.append({'role': 'assistant', 'content': content, 'thinking': thinking, 'tool_calls': tool_calls})
+
+  print()
+
+  if tool_calls:
+    for tool_call in tool_calls:
+      tool_name = tool_call.function.name
+      args = tool_call.function.arguments or {}
+      function_to_call = available_tools.get(tool_name)
+
+      if function_to_call:
+        heading(f'\nCalling tool: {tool_name}')
+        if args:
+          print(f'Arguments: {args}')
+
+        try:
+          result = function_to_call(**args)
+          print(f'Tool result: {result[:200]}')
+          if len(result) > 200:
+            heading('... [truncated]')
+          print()
+
+          result_message = {'role': 'tool', 'content': result, 'tool_name': tool_name}
+          messages.append(result_message)
+
+        except Exception as e:
+          err = f'Error from {tool_name}: {e}'
+          print(err)
+          messages.append({'role': 'tool', 'content': err, 'tool_name': tool_name})
+      else:
+        print(f'Tool {tool_name} not found')
+  else:
+    # no more tool calls, we can stop the loop
+    break
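
Note on the streaming example above: each chunk carries only a fragment of the assistant turn, so the fragments have to be merged before the turn is appended to `messages`. The following is a minimal, self-contained sketch of that accumulation step. It does not import `ollama`; `FakeMessage` and `accumulate` are hypothetical names used only for illustration, and storing the reasoning text under a separate `thinking` key in the history entry is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List


@dataclass
class FakeMessage:
  # Hypothetical stand-in for the message carried by one streamed chunk
  content: str = ''
  thinking: str = ''
  tool_calls: List[Dict[str, Any]] = field(default_factory=list)


def accumulate(chunks: Iterable[FakeMessage]) -> Dict[str, Any]:
  """Merge streamed fragments into a single assistant message for the chat history."""
  content = ''
  thinking = ''
  tool_calls: List[Dict[str, Any]] = []
  for chunk in chunks:
    content += chunk.content
    thinking += chunk.thinking
    tool_calls.extend(chunk.tool_calls)
  return {'role': 'assistant', 'content': content, 'thinking': thinking, 'tool_calls': tool_calls}


# Simulated stream: reasoning first, then a tool call, then the answer text
stream = [
  FakeMessage(thinking='Need to search. '),
  FakeMessage(tool_calls=[{'function': {'name': 'browser.search', 'arguments': {'query': 'Ollama'}}}]),
  FakeMessage(content='Ollama runs '),
  FakeMessage(content='models locally.'),
]
merged = accumulate(stream)
print(merged['content'])
```

Keeping answer text and reasoning text in separate fields means a turn that contains only tool calls, or only answer text, is still recorded in the history.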
@@ -0,0 +1,175 @@
+# /// script
+# requires-python = ">=3.11"
+# dependencies = [
+#     "gpt-oss",
+#     "ollama",
+#     "rich",
+# ]
+# ///
+
+import asyncio
+import json
+from typing import Optional
+
+from gpt_oss.tools.simple_browser import ExaBackend, SimpleBrowserTool
+from openai_harmony import Author, Role, TextContent
+from openai_harmony import Message as HarmonyMessage
+
+from ollama import Client
+
+_backend = ExaBackend(source='web')
+_browser_tool = SimpleBrowserTool(backend=_backend)
+
+
+def heading(text):
+  print(text)
+  print('=' * (len(text) + 3))
+
+
+async def _browser_search_async(query: str, topn: int = 10, source: str | None = None) -> str:
+  # map Ollama message to Harmony format
+  harmony_message = HarmonyMessage(
+    author=Author(role=Role.USER),
+    content=[TextContent(text=json.dumps({'query': query, 'topn': topn}))],
+    recipient='browser.search',
+  )
+
+  result_text: str = ''
+  async for response in _browser_tool._process(harmony_message):
+    if response.content:
+      for content in response.content:
+        if isinstance(content, TextContent):
+          result_text += content.text
+  return result_text or f'No results for query: {query}'
+
+
+async def _browser_open_async(id: int | str = -1, cursor: int = -1, loc: int = -1, num_lines: int = -1, *, view_source: bool = False, source: str | None = None) -> str:
+  payload = {'id': id, 'cursor': cursor, 'loc': loc, 'num_lines': num_lines, 'view_source': view_source, 'source': source}
+
+  harmony_message = HarmonyMessage(
+    author=Author(role=Role.USER),
+    content=[TextContent(text=json.dumps(payload))],
+    recipient='browser.open',
+  )
+
+  result_text: str = ''
+  async for response in _browser_tool._process(harmony_message):
+    if response.content:
+      for content in response.content:
+        if isinstance(content, TextContent):
+          result_text += content.text
+  return result_text or f'Could not open: {id}'
+
+
+async def _browser_find_async(pattern: str, cursor: int = -1) -> str:
+  payload = {'pattern': pattern, 'cursor': cursor}
+
+  harmony_message = HarmonyMessage(
+    author=Author(role=Role.USER),
+    content=[TextContent(text=json.dumps(payload))],
+    recipient='browser.find',
+  )
+
+  result_text: str = ''
+  async for response in _browser_tool._process(harmony_message):
+    if response.content:
+      for content in response.content:
+        if isinstance(content, TextContent):
+          result_text += content.text
+  return result_text or f'Pattern not found: {pattern}'
+
+
+def browser_search(query: str, topn: int = 10, source: Optional[str] = None) -> str:
+  return asyncio.run(_browser_search_async(query=query, topn=topn, source=source))
+
+
+def browser_open(id: int | str = -1, cursor: int = -1, loc: int = -1, num_lines: int = -1, *, view_source: bool = False, source: Optional[str] = None) -> str:
+  return asyncio.run(_browser_open_async(id=id, cursor=cursor, loc=loc, num_lines=num_lines, view_source=view_source, source=source))
+
+
+def browser_find(pattern: str, cursor: int = -1) -> str:
+  return asyncio.run(_browser_find_async(pattern=pattern, cursor=cursor))
+
+
+# Schema definitions for each browser tool
+browser_search_schema = {
+  'type': 'function',
+  'function': {
+    'name': 'browser.search',
+  },
+}
+
+browser_open_schema = {
+  'type': 'function',
+  'function': {
+    'name': 'browser.open',
+  },
+}
+
+browser_find_schema = {
+  'type': 'function',
+  'function': {
+    'name': 'browser.find',
+  },
+}
+
+available_tools = {
+  'browser.search': browser_search,
+  'browser.open': browser_open,
+  'browser.find': browser_find,
+}
+
+
+model = 'gpt-oss:20b'
+print('Model: ', model, '\n')
+
+prompt = 'What is Ollama?'
+print('You: ', prompt, '\n')
+messages = [{'role': 'user', 'content': prompt}]
+
+client = Client()
+while True:
+  response = client.chat(
+    model=model,
+    messages=messages,
+    tools=[browser_search_schema, browser_open_schema, browser_find_schema],
+    options={'num_ctx': 8192},  # 8192 is the recommended lower limit for the context window
+  )
+
+  if hasattr(response.message, 'thinking') and response.message.thinking:
+    heading('Thinking')
+    print(response.message.thinking.strip() + '\n')
+
+  if hasattr(response.message, 'content') and response.message.content:
+    heading('Assistant')
+    print(response.message.content.strip() + '\n')
+
+  # add message to chat history
+  messages.append(response.message)
+
+  if response.message.tool_calls:
+    for tool_call in response.message.tool_calls:
+      tool_name = tool_call.function.name
+      args = tool_call.function.arguments or {}
+      function_to_call = available_tools.get(tool_name)
+      if not function_to_call:
+        print(f'Unknown tool: {tool_name}')
+        continue
+
+      try:
+        result = function_to_call(**args)
+        heading(f'Tool: {tool_name}')
+        if args:
+          print(f'Arguments: {args}')
+        print(result[:200])
+        if len(result) > 200:
+          print('... [truncated]')
+        print()
+        messages.append({'role': 'tool', 'content': result, 'tool_name': tool_name})
+      except Exception as e:
+        err = f'Error from {tool_name}: {e}'
+        print(err)
+        messages.append({'role': 'tool', 'content': err, 'tool_name': tool_name})
+  else:
+    # break on no more tool calls
+    break
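
Note on the tool loop above: the lookup, call, and error handling for each tool call can be isolated into one helper that always returns a `tool` message. The following is a minimal sketch under that assumption. `run_tool` and `fake_search` are hypothetical names that do not appear in the example files, and the sketch does not contact a model or a browser backend.

```python
from typing import Any, Callable, Dict


def run_tool(tools: Dict[str, Callable[..., str]], name: str, args: Dict[str, Any]) -> Dict[str, str]:
  """Look up a tool by name, call it, and wrap the outcome as a 'tool' message."""
  fn = tools.get(name)
  if fn is None:
    # Report unknown tools back to the model instead of dropping the call silently
    return {'role': 'tool', 'content': f'Tool {name} not found', 'tool_name': name}
  try:
    content = fn(**args)
  except Exception as e:
    # Bad arguments or backend failures become text the model can react to
    content = f'Error from {name}: {e}'
  return {'role': 'tool', 'content': content, 'tool_name': name}


def fake_search(query: str, topn: int = 10) -> str:
  # Hypothetical stand-in for a real browser.search implementation
  return f'{topn} results for {query}'


tools = {'browser.search': fake_search}
ok = run_tool(tools, 'browser.search', {'query': 'Ollama', 'topn': 3})
missing = run_tool(tools, 'browser.open', {})
bad = run_tool(tools, 'browser.search', {'pattern': 'x'})
print(ok['content'])
```

Returning a message in every branch keeps the history aligned with the tool calls the model issued, including calls to tools that are not registered.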
@@ -41,13 +41,9 @@ class CrawlClient(Protocol):
  def crawl(self, urls: List[str]): ...


-# ---- Constants ---------------------------------------------------------------
-
 DEFAULT_VIEW_TOKENS = 1024
 CAPPED_TOOL_CONTENT_LEN = 8000

-# ---- Helpers ----------------------------------------------------------------
-

 def cap_tool_content(text: str) -> str:
  if not text:
@@ -68,9 +64,6 @@ def _safe_domain(u: str) -> str:
    return u


-# ---- BrowserState ------------------------------------------------------------
-
-
 class BrowserState:
  def __init__(self, initial_state: Optional[BrowserStateData] = None):
    self._data = initial_state or BrowserStateData(view_tokens=DEFAULT_VIEW_TOKENS)
@@ -82,9 +75,6 @@ class BrowserState:
    self._data = data


-# ---- Browser ----------------------------------------------------------------
-
-
 class Browser:
  def __init__(
    self,
@@ -203,8 +193,6 @@ class Browser:

    return header + '\n'.join(body_lines)

-  # ---- page builders ----
-
  def _build_search_results_page_collection(self, query: str, results: Dict[str, Any]) -> Page:
    page = Page(
      url=f'search_results_{query}',
@@ -338,8 +326,6 @@ class Browser:
    find_page.lines = self._wrap_lines(find_page.text, 80)
    return find_page

-  # ---- public API: search / open / find ------------------------------------
-
  def search(self, *, query: str, topn: int = 5) -> Dict[str, Any]:
    if not self._client:
      raise RuntimeError('Client not provided')
@@ -1,116 +0,0 @@
-# /// script
-# requires-python = ">=3.11"
-# dependencies = [
-#   "mcp",
-#   "rich",
-#   "ollama",
-# ]
-# ///
-"""
-MCP stdio server exposing Ollama web_search and web_fetch as tools.
-
-Environment:
- OLLAMA_API_KEY (required): if set, will be used as Authorization header.
-"""
-
-from __future__ import annotations
-
-import asyncio
-from typing import Any, Dict
-
-from ollama import Client
-
-try:
-  # Preferred high-level API (if available)
-  from mcp.server.fastmcp import FastMCP  # type: ignore
-
-  _FASTMCP_AVAILABLE = True
-except Exception:
-  _FASTMCP_AVAILABLE = False
-
-if not _FASTMCP_AVAILABLE:
-  # Fallback to the low-level stdio server API
-  from mcp.server import Server  # type: ignore
-  from mcp.server.stdio import stdio_server  # type: ignore
-
-
-client = Client()
-
-
-def _web_search_impl(query: str, max_results: int = 3) -> Dict[str, Any]:
-  res = client.web_search(query=query, max_results=max_results)
-  return res.model_dump()
-
-
-def _web_fetch_impl(url: str) -> Dict[str, Any]:
-  res = client.web_fetch(url=url)
-  return res.model_dump()
-
-
-if _FASTMCP_AVAILABLE:
-  app = FastMCP('ollama-search-fetch')
-
-  @app.tool()
-  def web_search(query: str, max_results: int = 3) -> Dict[str, Any]:
-    """
-    Perform a web search using Ollama's hosted search API.
-
-    Args:
-      query: The search query to run.
-      max_results: Maximum results to return (default: 3).
-
-    Returns:
-      JSON-serializable dict matching ollama.WebSearchResponse.model_dump()
-    """
-
-    return _web_search_impl(query=query, max_results=max_results)
-
-  @app.tool()
-  def web_fetch(url: str) -> Dict[str, Any]:
-    """
-    Fetch the content of a web page for the provided URL.
-
-    Args:
-      url: The absolute URL to fetch.
-
-    Returns:
-      JSON-serializable dict matching ollama.WebFetchResponse.model_dump()
-    """
-
-    return _web_fetch_impl(url=url)
-
-  if __name__ == '__main__':
-    app.run()
-
-else:
-  server = Server('ollama-search-fetch')  # type: ignore[name-defined]
-
-  @server.tool()  # type: ignore[attr-defined]
-  async def web_search(query: str, max_results: int = 3) -> Dict[str, Any]:
-    """
-    Perform a web search using Ollama's hosted search API.
-
-    Args:
-      query: The search query to run.
-      max_results: Maximum results to return (default: 3).
-    """
-
-    return await asyncio.to_thread(_web_search_impl, query, max_results)
-
-  @server.tool()  # type: ignore[attr-defined]
-  async def web_fetch(url: str) -> Dict[str, Any]:
-    """
-    Fetch the content of a web page for the provided URL.
-
-    Args:
-      url: The absolute URL to fetch.
-    """
-
-    return await asyncio.to_thread(_web_fetch_impl, url)
-
-  async def _main() -> None:
-    async with stdio_server() as (read, write):  # type: ignore[name-defined]
-      await server.run(read, write)  # type: ignore[attr-defined]
-
-  if __name__ == '__main__':
-    asyncio.run(_main())
@@ -1,4 +1,3 @@
-import contextlib
 import ipaddress
 import json
 import os
@@ -76,7 +75,7 @@ from ollama._types import (
 T = TypeVar('T')


-class BaseClient(contextlib.AbstractContextManager, contextlib.AbstractAsyncContextManager):
+class BaseClient:
  def __init__(
    self,
    client,
@@ -117,12 +116,6 @@ class BaseClient(contextlib.AbstractContextManager, contextlib.AbstractAsyncCont
      **kwargs,
    )

-  def __exit__(self, exc_type, exc_val, exc_tb):
-    self.close()
-
-  async def __aexit__(self, exc_type, exc_val, exc_tb):
-    await self.close()
-

 CONNECTION_ERROR_MESSAGE = 'Failed to connect to Ollama. Please check that Ollama is downloaded, running and accessible. https://ollama.com/download'

@@ -131,9 +124,6 @@ class Client(BaseClient):
  def __init__(self, host: Optional[str] = None, **kwargs) -> None:
    super().__init__(httpx.Client, host, **kwargs)

-  def close(self):
-    self._client.close()
-
  def _request_raw(self, *args, **kwargs):
    try:
      r = self._client.request(*args, **kwargs)
@@ -210,16 +200,11 @@ class Client(BaseClient):
    context: Optional[Sequence[int]] = None,
    stream: Literal[False] = False,
    think: Optional[bool] = None,
-    logprobs: Optional[bool] = None,
-    top_logprobs: Optional[int] = None,
    raw: bool = False,
    format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
    images: Optional[Sequence[Union[str, bytes, Image]]] = None,
    options: Optional[Union[Mapping[str, Any], Options]] = None,
    keep_alive: Optional[Union[float, str]] = None,
-    width: Optional[int] = None,
-    height: Optional[int] = None,
-    steps: Optional[int] = None,
  ) -> GenerateResponse: ...

  @overload
@@ -234,16 +219,11 @@ class Client(BaseClient):
    context: Optional[Sequence[int]] = None,
    stream: Literal[True] = True,
    think: Optional[bool] = None,
-    logprobs: Optional[bool] = None,
-    top_logprobs: Optional[int] = None,
    raw: bool = False,
    format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
    images: Optional[Sequence[Union[str, bytes, Image]]] = None,
    options: Optional[Union[Mapping[str, Any], Options]] = None,
    keep_alive: Optional[Union[float, str]] = None,
-    width: Optional[int] = None,
-    height: Optional[int] = None,
-    steps: Optional[int] = None,
  ) -> Iterator[GenerateResponse]: ...

  def generate(
@@ -257,16 +237,11 @@ class Client(BaseClient):
    context: Optional[Sequence[int]] = None,
    stream: bool = False,
    think: Optional[bool] = None,
-    logprobs: Optional[bool] = None,
-    top_logprobs: Optional[int] = None,
    raw: Optional[bool] = None,
    format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
    images: Optional[Sequence[Union[str, bytes, Image]]] = None,
    options: Optional[Union[Mapping[str, Any], Options]] = None,
    keep_alive: Optional[Union[float, str]] = None,
-    width: Optional[int] = None,
-    height: Optional[int] = None,
-    steps: Optional[int] = None,
  ) -> Union[GenerateResponse, Iterator[GenerateResponse]]:
    """
    Create a response using the requested model.
@@ -291,16 +266,11 @@ class Client(BaseClient):
        context=context,
        stream=stream,
        think=think,
-        logprobs=logprobs,
-        top_logprobs=top_logprobs,
        raw=raw,
        format=format,
        images=list(_copy_images(images)) if images else None,
        options=options,
        keep_alive=keep_alive,
-        width=width,
-        height=height,
-        steps=steps,
      ).model_dump(exclude_none=True),
      stream=stream,
    )
@@ -314,8 +284,6 @@ class Client(BaseClient):
    tools: Optional[Sequence[Union[Mapping[str, Any], Tool, Callable]]] = None,
    stream: Literal[False] = False,
    think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
-    logprobs: Optional[bool] = None,
-    top_logprobs: Optional[int] = None,
    format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
    options: Optional[Union[Mapping[str, Any], Options]] = None,
    keep_alive: Optional[Union[float, str]] = None,
@@ -330,8 +298,6 @@ class Client(BaseClient):
    tools: Optional[Sequence[Union[Mapping[str, Any], Tool, Callable]]] = None,
    stream: Literal[True] = True,
    think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
-    logprobs: Optional[bool] = None,
-    top_logprobs: Optional[int] = None,
    format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
    options: Optional[Union[Mapping[str, Any], Options]] = None,
    keep_alive: Optional[Union[float, str]] = None,
@@ -345,8 +311,6 @@ class Client(BaseClient):
    tools: Optional[Sequence[Union[Mapping[str, Any], Tool, Callable]]] = None,
    stream: bool = False,
    think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
-    logprobs: Optional[bool] = None,
-    top_logprobs: Optional[int] = None,
    format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
    options: Optional[Union[Mapping[str, Any], Options]] = None,
    keep_alive: Optional[Union[float, str]] = None,
@@ -394,8 +358,6 @@ class Client(BaseClient):
        tools=list(_copy_tools(tools)),
        stream=stream,
        think=think,
-        logprobs=logprobs,
-        top_logprobs=top_logprobs,
        format=format,
        options=options,
        keep_alive=keep_alive,
@@ -724,9 +686,6 @@ class AsyncClient(BaseClient):
  def __init__(self, host: Optional[str] = None, **kwargs) -> None:
    super().__init__(httpx.AsyncClient, host, **kwargs)

-  async def close(self):
-    await self._client.aclose()
-
  async def _request_raw(self, *args, **kwargs):
    try:
      r = await self._client.request(*args, **kwargs)
@@ -843,16 +802,11 @@ class AsyncClient(BaseClient):
    context: Optional[Sequence[int]] = None,
    stream: Literal[False] = False,
    think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
-    logprobs: Optional[bool] = None,
-    top_logprobs: Optional[int] = None,
    raw: bool = False,
    format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
    images: Optional[Sequence[Union[str, bytes, Image]]] = None,
    options: Optional[Union[Mapping[str, Any], Options]] = None,
    keep_alive: Optional[Union[float, str]] = None,
-    width: Optional[int] = None,
-    height: Optional[int] = None,
-    steps: Optional[int] = None,
  ) -> GenerateResponse: ...

  @overload
@@ -867,16 +821,11 @@ class AsyncClient(BaseClient):
    context: Optional[Sequence[int]] = None,
    stream: Literal[True] = True,
    think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
-    logprobs: Optional[bool] = None,
-    top_logprobs: Optional[int] = None,
    raw: bool = False,
    format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
    images: Optional[Sequence[Union[str, bytes, Image]]] = None,
    options: Optional[Union[Mapping[str, Any], Options]] = None,
    keep_alive: Optional[Union[float, str]] = None,
-    width: Optional[int] = None,
-    height: Optional[int] = None,
-    steps: Optional[int] = None,
  ) -> AsyncIterator[GenerateResponse]: ...

  async def generate(
@@ -890,16 +839,11 @@ class AsyncClient(BaseClient):
    context: Optional[Sequence[int]] = None,
    stream: bool = False,
    think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
-    logprobs: Optional[bool] = None,
-    top_logprobs: Optional[int] = None,
    raw: Optional[bool] = None,
    format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
    images: Optional[Sequence[Union[str, bytes, Image]]] = None,
    options: Optional[Union[Mapping[str, Any], Options]] = None,
    keep_alive: Optional[Union[float, str]] = None,
-    width: Optional[int] = None,
-    height: Optional[int] = None,
-    steps: Optional[int] = None,
  ) -> Union[GenerateResponse, AsyncIterator[GenerateResponse]]:
    """
    Create a response using the requested model.
@@ -923,16 +867,11 @@ class AsyncClient(BaseClient):
        context=context,
        stream=stream,
        think=think,
-        logprobs=logprobs,
-        top_logprobs=top_logprobs,
        raw=raw,
        format=format,
        images=list(_copy_images(images)) if images else None,
        options=options,
        keep_alive=keep_alive,
-        width=width,
-        height=height,
-        steps=steps,
      ).model_dump(exclude_none=True),
      stream=stream,
    )
@@ -946,8 +885,6 @@ class AsyncClient(BaseClient):
    tools: Optional[Sequence[Union[Mapping[str, Any], Tool, Callable]]] = None,
    stream: Literal[False] = False,
    think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
-    logprobs: Optional[bool] = None,
-    top_logprobs: Optional[int] = None,
    format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
    options: Optional[Union[Mapping[str, Any], Options]] = None,
    keep_alive: Optional[Union[float, str]] = None,
@@ -962,8 +899,6 @@ class AsyncClient(BaseClient):
    tools: Optional[Sequence[Union[Mapping[str, Any], Tool, Callable]]] = None,
    stream: Literal[True] = True,
    think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
-    logprobs: Optional[bool] = None,
-    top_logprobs: Optional[int] = None,
    format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
    options: Optional[Union[Mapping[str, Any], Options]] = None,
    keep_alive: Optional[Union[float, str]] = None,
@@ -977,8 +912,6 @@ class AsyncClient(BaseClient):
    tools: Optional[Sequence[Union[Mapping[str, Any], Tool, Callable]]] = None,
    stream: bool = False,
    think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None,
-    logprobs: Optional[bool] = None,
-    top_logprobs: Optional[int] = None,
    format: Optional[Union[Literal['', 'json'], JsonSchemaValue]] = None,
    options: Optional[Union[Mapping[str, Any], Options]] = None,
    keep_alive: Optional[Union[float, str]] = None,
@@ -1027,8 +960,6 @@ class AsyncClient(BaseClient):
        tools=list(_copy_tools(tools)),
        stream=stream,
        think=think,
-        logprobs=logprobs,
-        top_logprobs=top_logprobs,
        format=format,
        options=options,
        keep_alive=keep_alive,
@@ -210,22 +210,6 @@ class GenerateRequest(BaseGenerateRequest):
  think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None
  'Enable thinking mode (for thinking models).'

-  logprobs: Optional[bool] = None
-  'Return log probabilities for generated tokens.'
-
-  top_logprobs: Optional[int] = None
-  'Number of alternative tokens and log probabilities to include per position (0-20).'
-
-  # Experimental image generation parameters
-  width: Optional[int] = None
-  'Width of the generated image in pixels (for image generation models).'
-
-  height: Optional[int] = None
-  'Height of the generated image in pixels (for image generation models).'
-
-  steps: Optional[int] = None
-  'Number of diffusion steps (for image generation models).'
-

 class BaseGenerateResponse(SubscriptableBaseModel):
  model: Optional[str] = None
@@ -259,25 +243,12 @@ class BaseGenerateResponse(SubscriptableBaseModel):
  'Duration of evaluating inference in nanoseconds.'


-class TokenLogprob(SubscriptableBaseModel):
-  token: str
-  'Token text.'
-
-  logprob: float
-  'Log probability for the token.'
-
-
-class Logprob(TokenLogprob):
-  top_logprobs: Optional[Sequence[TokenLogprob]] = None
-  'Most likely tokens and their log probabilities.'
-
-
 class GenerateResponse(BaseGenerateResponse):
  """
  Response returned by generate requests.
  """

-  response: Optional[str] = None
+  response: str
  'Response content. When streaming, this contains a fragment of the response.'

  thinking: Optional[str] = None
@@ -286,20 +257,6 @@ class GenerateResponse(BaseGenerateResponse):
  context: Optional[Sequence[int]] = None
  'Tokenized history up to the point of the response.'

-  logprobs: Optional[Sequence[Logprob]] = None
-  'Log probabilities for generated tokens.'
-
-  # Image generation response fields
-  image: Optional[str] = None
-  'Base64-encoded generated image data (for image generation models).'
-
-  # Streaming progress fields (for image generation)
-  completed: Optional[int] = None
-  'Number of completed steps (for image generation streaming).'
-
-  total: Optional[int] = None
-  'Total number of steps (for image generation streaming).'
-

 class Message(SubscriptableBaseModel):
  """
@@ -403,12 +360,6 @@ class ChatRequest(BaseGenerateRequest):
  think: Optional[Union[bool, Literal['low', 'medium', 'high']]] = None
  'Enable thinking mode (for thinking models).'

-  logprobs: Optional[bool] = None
-  'Return log probabilities for generated tokens.'
-
-  top_logprobs: Optional[int] = None
-  'Number of alternative tokens and log probabilities to include per position (0-20).'
-

 class ChatResponse(BaseGenerateResponse):
  """
@@ -418,9 +369,6 @@ class ChatResponse(BaseGenerateResponse):
  message: Message
  'Response message.'

-  logprobs: Optional[Sequence[Logprob]] = None
-  'Log probabilities for generated tokens if requested.'
-

 class EmbedRequest(BaseRequest):
  input: Union[str, Sequence[str]]
@@ -37,7 +37,7 @@ dependencies = [ 'ruff>=0.9.1' ]
 config-path = 'none'

 [tool.ruff]
-line-length = 320 
+line-length = 999
 indent-width = 2

 [tool.ruff.format]
@@ -61,44 +61,6 @@ def test_client_chat(httpserver: HTTPServer):
  assert response['message']['content'] == "I don't know."


-def test_client_chat_with_logprobs(httpserver: HTTPServer):
-  httpserver.expect_ordered_request(
-    '/api/chat',
-    method='POST',
-    json={
-      'model': 'dummy',
-      'messages': [{'role': 'user', 'content': 'Hi'}],
-      'tools': [],
-      'stream': False,
-      'logprobs': True,
-      'top_logprobs': 3,
-    },
-  ).respond_with_json(
-    {
-      'model': 'dummy',
-      'message': {
-        'role': 'assistant',
-        'content': 'Hello',
-      },
-      'logprobs': [
-        {
-          'token': 'Hello',
-          'logprob': -0.1,
-          'top_logprobs': [
-            {'token': 'Hello', 'logprob': -0.1},
-            {'token': 'Hi', 'logprob': -1.0},
-          ],
-        }
-      ],
-    }
-  )
-
-  client = Client(httpserver.url_for('/'))
-  response = client.chat('dummy', messages=[{'role': 'user', 'content': 'Hi'}], logprobs=True, top_logprobs=3)
-  assert response['logprobs'][0]['token'] == 'Hello'
-  assert response['logprobs'][0]['top_logprobs'][1]['token'] == 'Hi'
-
-
 def test_client_chat_stream(httpserver: HTTPServer):
  def stream_handler(_: Request):
    def generate():
@@ -332,40 +294,6 @@ def test_client_generate(httpserver: HTTPServer):
  assert response['response'] == 'Because it is.'


-def test_client_generate_with_logprobs(httpserver: HTTPServer):
-  httpserver.expect_ordered_request(
-    '/api/generate',
-    method='POST',
-    json={
-      'model': 'dummy',
-      'prompt': 'Why',
-      'stream': False,
-      'logprobs': True,
-      'top_logprobs': 2,
-    },
-  ).respond_with_json(
-    {
-      'model': 'dummy',
-      'response': 'Hello',
-      'logprobs': [
-        {
-          'token': 'Hello',
-          'logprob': -0.2,
-          'top_logprobs': [
-            {'token': 'Hello', 'logprob': -0.2},
-            {'token': 'Hi', 'logprob': -1.5},
-          ],
-        }
-      ],
-    }
-  )
-
-  client = Client(httpserver.url_for('/'))
-  response = client.generate('dummy', 'Why', logprobs=True, top_logprobs=2)
-  assert response['logprobs'][0]['token'] == 'Hello'
-  assert response['logprobs'][0]['top_logprobs'][1]['token'] == 'Hi'
-
-
 def test_client_generate_with_image_type(httpserver: HTTPServer):
  httpserver.expect_ordered_request(
    '/api/generate',
@@ -568,115 +496,6 @@ async def test_async_client_generate_format_pydantic(httpserver: HTTPServer):
  assert response['response'] == '{"answer": "Because of Rayleigh scattering", "confidence": 0.95}'


-def test_client_generate_image(httpserver: HTTPServer):
-  httpserver.expect_ordered_request(
-    '/api/generate',
-    method='POST',
-    json={
-      'model': 'dummy-image',
-      'prompt': 'a sunset over mountains',
-      'stream': False,
-      'width': 1024,
-      'height': 768,
-      'steps': 20,
-    },
-  ).respond_with_json(
-    {
-      'model': 'dummy-image',
-      'image': PNG_BASE64,
-      'done': True,
-      'done_reason': 'stop',
-    }
-  )
-
-  client = Client(httpserver.url_for('/'))
-  response = client.generate('dummy-image', 'a sunset over mountains', width=1024, height=768, steps=20)
-  assert response['model'] == 'dummy-image'
-  assert response['image'] == PNG_BASE64
-  assert response['done'] is True
-
-
-def test_client_generate_image_stream(httpserver: HTTPServer):
-  def stream_handler(_: Request):
-    def generate():
-      # Progress updates
-      for i in range(1, 4):
-        yield (
-          json.dumps(
-            {
-              'model': 'dummy-image',
-              'completed': i,
-              'total': 3,
-              'done': False,
-            }
-          )
-          + '\n'
-        )
-      # Final response with image
-      yield (
-        json.dumps(
-          {
-            'model': 'dummy-image',
-            'image': PNG_BASE64,
-            'done': True,
-            'done_reason': 'stop',
-          }
-        )
-        + '\n'
-      )
-
-    return Response(generate())
-
-  httpserver.expect_ordered_request(
-    '/api/generate',
-    method='POST',
-    json={
-      'model': 'dummy-image',
-      'prompt': 'a sunset over mountains',
-      'stream': True,
-      'width': 512,
-      'height': 512,
-    },
-  ).respond_with_handler(stream_handler)
-
-  client = Client(httpserver.url_for('/'))
-  response = client.generate('dummy-image', 'a sunset over mountains', stream=True, width=512, height=512)
-
-  parts = list(response)
-  # Check progress updates
-  assert parts[0]['completed'] == 1
-  assert parts[0]['total'] == 3
-  assert parts[0]['done'] is False
-  # Check final response
-  assert parts[-1]['image'] == PNG_BASE64
-  assert parts[-1]['done'] is True
-
-
-async def test_async_client_generate_image(httpserver: HTTPServer):
-  httpserver.expect_ordered_request(
-    '/api/generate',
-    method='POST',
-    json={
-      'model': 'dummy-image',
-      'prompt': 'a robot painting',
-      'stream': False,
-      'width': 1024,
-      'height': 1024,
-    },
-  ).respond_with_json(
-    {
-      'model': 'dummy-image',
-      'image': PNG_BASE64,
-      'done': True,
-    }
-  )
-
-  client = AsyncClient(httpserver.url_for('/'))
-  response = await client.generate('dummy-image', 'a robot painting', width=1024, height=1024)
-  assert response['model'] == 'dummy-image'
-  assert response['image'] == PNG_BASE64
-
-
 def test_client_pull(httpserver: HTTPServer):
  httpserver.expect_ordered_request(
    '/api/pull',
@@ -1456,33 +1275,3 @@ def test_client_explicit_bearer_header_overrides_env(monkeypatch: pytest.MonkeyP
  client = Client(headers={'Authorization': 'Bearer explicit-token'})
  assert client._client.headers['authorization'] == 'Bearer explicit-token'
  client.web_search('override check')
-
-
-def test_client_close():
-  client = Client()
-  client.close()
-  assert client._client.is_closed
-
-
-@pytest.mark.anyio
-async def test_async_client_close():
-  client = AsyncClient()
-  await client.close()
-  assert client._client.is_closed
-
-
-def test_client_context_manager():
-  with Client() as client:
-    assert isinstance(client, Client)
-    assert not client._client.is_closed
-
-  assert client._client.is_closed
-
-
-@pytest.mark.anyio
-async def test_async_client_context_manager():
-  async with AsyncClient() as client:
-    assert isinstance(client, AsyncClient)
-    assert not client._client.is_closed
-
-  assert client._client.is_closed
Author	SHA1	Message	Date
nicole pardal	8dd1d7cb02	removed comments	2025-09-24 11:49:19 -07:00
nicole pardal	d3afa37b11	remove unused import	2025-09-24 11:44:55 -07:00
nicole pardal	cfbb0cef7b	formatting fix	2025-09-24 11:43:33 -07:00
nicole pardal	80279e95ab	cleaned up code	2025-09-23 21:34:22 -07:00
nicole pardal	0ecfb1f6cf	api key fix added	2025-09-23 18:07:53 -07:00
nicole pardal	404672570f	lint	2025-09-23 18:02:02 -07:00
nicole pardal	15ec61dbcb	lint	2025-09-23 17:58:08 -07:00
nicole pardal	799ae1f07c	can lint pls work	2025-09-23 17:49:59 -07:00
nicole pardal	ae333084b9	renamed + added functionality	2025-09-23 17:44:18 -07:00
nicole pardal	1c6afe4316	lint formatting	2025-09-23 15:46:50 -07:00
nicole pardal	b9d435fad5	fixed nits	2025-09-23 15:46:50 -07:00
nicole pardal	10955d52ee	lint fix hopefully	2025-09-23 15:46:50 -07:00
nicole pardal	67f19a33e2	fix for failing lint check	2025-09-23 15:46:50 -07:00
nicole pardal	4d83af13d8	Added python browser tool	2025-09-23 15:46:50 -07:00
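
Note on the removed logprobs tests: the payload shape they exercised (a `logprobs` list whose entries carry `token`, `logprob`, and optional `top_logprobs`) can still be read from a plain dict. The sketch below is only an illustration under that assumption. The `token_probabilities` helper is hypothetical and not part of the library, and the sample payload is copied from the removed `test_client_chat_with_logprobs` test.

```python
import math

# Payload shape copied from the removed test_client_chat_with_logprobs test.
response = {
  'model': 'dummy',
  'message': {'role': 'assistant', 'content': 'Hello'},
  'logprobs': [
    {
      'token': 'Hello',
      'logprob': -0.1,
      'top_logprobs': [
        {'token': 'Hello', 'logprob': -0.1},
        {'token': 'Hi', 'logprob': -1.0},
      ],
    }
  ],
}


def token_probabilities(payload):
  """Hypothetical helper: pair each generated token with exp(logprob)."""
  # A missing or null 'logprobs' key yields an empty list rather than an error.
  entries = payload.get('logprobs') or []
  return [(entry['token'], math.exp(entry['logprob'])) for entry in entries]


probs = token_probabilities(response)
for token, prob in probs:
  print(f'{token}: {prob:.3f}')
```

Because the helper works on plain dicts, it does not depend on whether the client's `ChatResponse` model declares a `logprobs` field on a given side of this diff.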