graphrag/docs/query/question_generation.md
Nathan Evans ae1f5e1811
Some checks failed
Python Build and Type Check / python-ci (ubuntu-latest, 3.11) (push) Has been cancelled
Python Build and Type Check / python-ci (ubuntu-latest, 3.12) (push) Has been cancelled
Python Build and Type Check / python-ci (windows-latest, 3.11) (push) Has been cancelled
Python Build and Type Check / python-ci (windows-latest, 3.12) (push) Has been cancelled
Python Integration Tests / python-ci (ubuntu-latest, 3.12) (push) Has been cancelled
Python Integration Tests / python-ci (windows-latest, 3.12) (push) Has been cancelled
Python Notebook Tests / python-ci (ubuntu-latest, 3.12) (push) Has been cancelled
Python Notebook Tests / python-ci (windows-latest, 3.12) (push) Has been cancelled
Python Smoke Tests / python-ci (ubuntu-latest, 3.12) (push) Has been cancelled
Python Smoke Tests / python-ci (windows-latest, 3.12) (push) Has been cancelled
Python Unit Tests / python-ci (ubuntu-latest, 3.12) (push) Has been cancelled
Python Unit Tests / python-ci (windows-latest, 3.12) (push) Has been cancelled
Nov 2025 housekeeping (#2120)
* Remove gensim sideload

* Split CI build/type checks from unit tests

* Thorough review of docs to align with v3

* Format

* Fix version

* Fix type
2025-11-06 10:03:22 -08:00

24 lines
2.3 KiB
Markdown

# Question Generation ❔
## Entity-based Question Generation
The [question generation](https://github.com/microsoft/graphrag/blob/main//graphrag/query/question_gen/) method combines structured data from the knowledge graph with unstructured data from the input documents to generate candidate questions related to specific entities.
## Methodology
Given a list of prior user questions, the question generation method uses the same context-building approach employed in [local search](local_search.md) to extract and prioritize relevant structured and unstructured data, including entities, relationships, covariates, community reports and raw text chunks. These data records are then fitted into a single LLM prompt to generate candidate follow-up questions that represent the most important or urgent information content or themes in the data.
## Configuration
Below are the key parameters of the [Question Generation class](https://github.com/microsoft/graphrag/blob/main//graphrag/query/question_gen/local_gen.py):
* `model`: Language model chat completion object to be used for response generation
* `context_builder`: [context builder](https://github.com/microsoft/graphrag/blob/main//graphrag/query/structured_search/local_search/mixed_context.py) object to be used for preparing context data from collections of knowledge model objects, using the same context builder class as in local search
* `system_prompt`: prompt template used to generate candidate questions. Default template can be found at [system_prompt](https://github.com/microsoft/graphrag/blob/main//graphrag/prompts/query/question_gen_system_prompt.py)
* `llm_params`: a dictionary of additional parameters (e.g., temperature, max_tokens) to be passed to the LLM call
* `context_builder_params`: a dictionary of additional parameters to be passed to the [`context_builder`](https://github.com/microsoft/graphrag/blob/main//graphrag/query/structured_search/local_search/mixed_context.py) object when building context for the question generation prompt
* `callbacks`: optional callback functions, can be used to provide custom event handlers for LLM's completion streaming events
## How to Use
An example of the question generation function can be found in the following [notebook](../examples_notebooks/local_search.ipynb).