graphrag/graphrag/query/question_gen/base.py
Nathan Evans ac8a7f5eef
Housekeeping (#2086)
* Add deprecation warnings for fnllm and multi-search

* Fix dangling token_encoder refs

* Fix local_search notebook

* Fix global search dynamic notebook

* Fix global search notebook

* Fix drift notebook

* Switch example notebooks to use LiteLLM config

* Properly annotate dev deps as a group

* Semver

* Remove --extra dev

* Remove llm_model variable

* Ignore ruff ASYNC240

* Add note about expected broken notebook in docs

* Fix custom vector store notebook

* Push tokenizer throughout
2025-10-07 16:21:24 -07:00


# Copyright (c) 2024 Microsoft Corporation.
# Licensed under the MIT License

"""Base classes for generating questions based on previously asked questions and most recent context data."""

from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any

from graphrag.language_model.protocol.base import ChatModel
from graphrag.query.context_builder.builders import (
    GlobalContextBuilder,
    LocalContextBuilder,
)
from graphrag.tokenizer.get_tokenizer import get_tokenizer
from graphrag.tokenizer.tokenizer import Tokenizer


@dataclass
class QuestionResult:
    """A structured question result."""

    response: list[str]
    context_data: str | dict[str, Any]
    completion_time: float
    llm_calls: int
    prompt_tokens: int


class BaseQuestionGen(ABC):
    """The base question-generation implementation."""

    def __init__(
        self,
        model: ChatModel,
        context_builder: GlobalContextBuilder | LocalContextBuilder,
        tokenizer: Tokenizer | None = None,
        model_params: dict[str, Any] | None = None,
        context_builder_params: dict[str, Any] | None = None,
    ):
        self.model = model
        self.context_builder = context_builder
        self.tokenizer = tokenizer or get_tokenizer(model.config)
        self.model_params = model_params or {}
        self.context_builder_params = context_builder_params or {}

    @abstractmethod
    async def generate(
        self,
        question_history: list[str],
        context_data: str | None,
        question_count: int,
        **kwargs,
    ) -> QuestionResult:
        """Generate questions."""

    @abstractmethod
    async def agenerate(
        self,
        question_history: list[str],
        context_data: str | None,
        question_count: int,
        **kwargs,
    ) -> QuestionResult:
        """Generate questions asynchronously."""