# GraphRAG Development

# Requirements

| Name                | Installation                                  | Purpose                                                                          |
| ------------------- | --------------------------------------------- | -------------------------------------------------------------------------------- |
| Python 3.10 or 3.11 | [Download](https://www.python.org/downloads/) | The library is Python-based.                                                     |
| uv                  | [Instructions](https://docs.astral.sh/uv/)    | uv is used for package management and virtualenv management in Python codebases. |

# Getting Started

## Install Dependencies

```shell
# install python dependencies
uv sync
```

## Execute the indexing engine

```shell
uv run poe index <...args>
```

## Execute prompt tuning

```shell
uv run poe prompt_tune <...args>
```

## Execute Queries

```shell
uv run poe query <...args>
```

## Repository Structure

An overview of the repository's top-level folder structure is provided below, detailing the overall design and purpose.
We leverage a factory design pattern where possible, enabling a variety of implementations for each core component of graphrag.

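
As a rough illustration of the factory design pattern mentioned above, a registry-based factory might look like the sketch below. Every name in it (`ComponentFactory`, `DictCache`, `cache_factory`) is hypothetical and chosen only for illustration; this is not graphrag's actual API, so consult the relevant `factory.py` module for the real interface.

```python
from typing import Any, Callable


class ComponentFactory:
    """Hypothetical registry-based factory; NOT graphrag's actual API."""

    def __init__(self) -> None:
        # Maps an implementation name to a callable that builds it.
        self._creators: dict[str, Callable[..., Any]] = {}

    def register(self, name: str, creator: Callable[..., Any]) -> None:
        """Register (or override) an implementation under a name."""
        self._creators[name] = creator

    def create(self, name: str, **kwargs: Any) -> Any:
        """Build the implementation registered under `name`."""
        if name not in self._creators:
            raise ValueError(f"Unknown implementation: {name!r}")
        return self._creators[name](**kwargs)


class DictCache:
    """Toy in-memory cache used only to exercise the factory."""

    def __init__(self, prefix: str = "") -> None:
        self.prefix = prefix
        self._data: dict[str, Any] = {}

    def set(self, key: str, value: Any) -> None:
        self._data[self.prefix + key] = value

    def get(self, key: str) -> Any:
        return self._data.get(self.prefix + key)


# A custom implementation is registered once, then created by name.
cache_factory = ComponentFactory()
cache_factory.register("dict", DictCache)

cache = cache_factory.create("dict", prefix="demo:")
cache.set("answer", 42)
print(cache.get("answer"))  # prints 42
```

The point of this design is that callers select an implementation by name (for example, from configuration) and never import the concrete class directly. The top-level folder structure is as follows:
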
```shell
graphrag
├── api             # library API definitions
├── cache           # cache module supporting several options
│   └─ factory.py   #   └─ main entrypoint to create a cache
├── callbacks       # a collection of commonly used callback functions
├── cli             # library CLI
│   └─ main.py      #   └─ primary CLI entrypoint
├── config          # configuration management
├── index           # indexing engine
│   └─ run/run.py   #   └─ main entrypoint to build an index
├── logger          # logger module supporting several options
│   └─ factory.py   #   └─ main entrypoint to create a logger
├── model           # data model definitions associated with the knowledge graph
├── prompt_tune     # prompt tuning module
├── prompts         # a collection of all the system prompts used by graphrag
├── query           # query engine
├── storage         # storage module supporting several options
│   └─ factory.py   #   └─ main entrypoint to create/load a storage endpoint
├── utils           # helper functions used throughout the library
└── vector_stores   # vector store module containing a few options
    └─ factory.py   #   └─ main entrypoint to create a vector store
```

Where appropriate, the factories expose a registration method so that users can provide their own custom implementations.

## Versioning

We use [semversioner](https://github.com/raulgomis/semversioner) to automate and enforce semantic versioning in the release process. Our CI/CD pipeline checks that every PR includes a JSON file generated by semversioner. When submitting a PR, please run:

```shell
uv run semversioner add-change -t patch -d "."
```

Here `-t` sets the change type (`major`, `minor`, or `patch`) and `-d` sets a short description of the change.

# Azurite

Some unit and smoke tests use Azurite to emulate Azure resources. Start it by running:

```sh
./scripts/start-azurite.sh
```

or, if Azurite is already installed globally, by running `azurite` in the terminal. See the [Azurite documentation](https://learn.microsoft.com/en-us/azure/storage/common/storage-use-azurite) for more information about how to install and use Azurite.

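
Before running the Azurite-backed tests, it can help to confirm that the emulator is actually listening. The following is a minimal sketch, assuming Azurite's default blob endpoint of `127.0.0.1:10000`; adjust the host and port if your setup differs. The helper `is_port_open` is a hypothetical name and is not part of graphrag or Azurite.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    # Assumption: Azurite's default blob service port is 10000.
    if is_port_open("127.0.0.1", 10000):
        print("Azurite blob endpoint is reachable")
    else:
        print("Azurite does not appear to be running on port 10000")
```

This only checks that something accepts TCP connections on the port; it does not verify that the listener is Azurite.
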
# Lifecycle Scripts

Our Python package uses uv to manage dependencies and [poethepoet](https://pypi.org/project/poethepoet/) to manage custom build scripts.

Available scripts are:

- `uv run poe index` - Run the Indexing CLI
- `uv run poe query` - Run the Query CLI
- `uv build` - Build a wheel file and other distributable artifacts.
- `uv run poe test` - Execute all tests.
- `uv run poe test_unit` - Execute unit tests.
- `uv run poe test_integration` - Execute integration tests.
- `uv run poe test_smoke` - Execute smoke tests.
- `uv run poe check` - Perform a suite of static checks across the package, including:
  - formatting
  - documentation formatting
  - linting
  - security patterns
  - type-checking
- `uv run poe fix` - Apply any available auto-fixes to the package. Usually this is just formatting fixes.
- `uv run poe fix_unsafe` - Apply any available auto-fixes to the package, including those that may be unsafe.
- `uv run poe format` - Explicitly run the formatter across the package.

## Troubleshooting

### "RuntimeError: llvm-config failed executing, please point LLVM_CONFIG to the path for llvm-config" when running uv sync

Make sure `llvm-9` and `llvm-9-dev` are installed:

`sudo apt-get install llvm-9 llvm-9-dev`

Then add the following to your `.bashrc`:

`export LLVM_CONFIG=/usr/bin/llvm-config-9`

### "numba/\_pymodule.h:6:10: fatal error: Python.h: No such file or directory" when running uv sync

Make sure you have `python3.10-dev` installed or, more generally, the `python-dev` package for your Python version:

`sudo apt-get install python3.10-dev`
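
To check whether the `Python.h` header is visible to the interpreter you are using, a small diagnostic along the following lines may help. It is a sketch that relies only on the standard library's `sysconfig` module; the function names are hypothetical, and the include directory reported for a virtual environment may differ from the system location.

```python
import sysconfig
from pathlib import Path


def python_header_path() -> Path:
    """Return the location where this interpreter expects Python.h."""
    return Path(sysconfig.get_paths()["include"]) / "Python.h"


def has_python_headers() -> bool:
    """Return True if Python.h exists at the expected location."""
    return python_header_path().is_file()


if __name__ == "__main__":
    status = "found" if has_python_headers() else "missing"
    print(f"{python_header_path()}: {status}")
```

If the header is reported missing, installing the development package as described above is the likely fix.
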