kanshan/graphrag

mirror of https://github.com/microsoft/graphrag.git synced 2026-01-14 09:07:20 +08:00

Nathan Evans ae1f5e1811

Nov 2025 housekeeping (#2120)

* Remove gensim sideload

* Split CI build/type checks from unit tests

* Thorough review of docs to align with v3

* Format

* Fix version

* Fix type

2025-11-06 10:03:22 -08:00

GraphRAG Indexing 🤖

The GraphRAG indexing package is a data pipeline and transformation suite designed to extract meaningful, structured data from unstructured text using LLMs.

Indexing Pipelines are configurable. They are composed of workflows, standard and custom steps, prompt templates, and input/output adapters. Our standard pipeline is designed to:

* extract entities, relationships, and claims from raw text
* perform community detection on the extracted entities
* generate community summaries and reports at multiple levels of granularity
* embed text into a vector space

The outputs of the pipeline are stored as Parquet tables by default, and embeddings are written to your configured vector store.
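As a quick way to see which tables a run produced, the following minimal sketch lists the Parquet files in an output directory using only the standard library. The helper name `list_parquet_tables` and the `output` directory are assumptions made for illustration; the actual output location depends on your configuration.

```python
from pathlib import Path


def list_parquet_tables(output_dir: str) -> list[str]:
    """Return the names (without extension) of the Parquet files in output_dir.

    Hypothetical helper for illustration; it is not part of GraphRAG.
    """
    directory = Path(output_dir)
    if not directory.is_dir():
        # Nothing to list if the pipeline has not written any output yet.
        return []
    return sorted(path.stem for path in directory.glob("*.parquet"))


if __name__ == "__main__":
    # "output" is an assumed location; use the path from your own config.
    for table_name in list_parquet_tables("output"):
        print(table_name)
```

Each listed table can then be loaded for inspection with a Parquet reader of your choice, for example `pandas.read_parquet`.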

Getting Started

Requirements

See the requirements section in Get Started for details on setting up a development environment.

To configure GraphRAG, see the configuration documentation. Once you have a config file, you can run the pipeline using either the CLI or the Python API.

Usage

CLI

uv run poe index --root <data_root> # default config mode
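If you want to trigger the command above from a script, one option is to wrap it with `subprocess`. This is only a sketch: the helper names `build_index_command` and `run_index` are hypothetical, and it assumes `uv` is on your `PATH` and that the script runs from the repository root.

```python
import subprocess


def build_index_command(data_root: str) -> list[str]:
    """Assemble the CLI invocation shown above as an argument list."""
    return ["uv", "run", "poe", "index", "--root", data_root]


def run_index(data_root: str) -> int:
    """Run the indexing CLI for data_root and return its exit code.

    Hypothetical wrapper; assumes `uv` is installed and on PATH.
    """
    completed = subprocess.run(build_index_command(data_root), check=False)
    return completed.returncode
```

Passing the arguments as a list avoids shell quoting problems when the data root contains spaces.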

Python API

See the indexing API Python file for the recommended way to call the pipeline directly from Python code.
