mirror of
https://github.com/microsoft/graphrag.git
synced 2026-02-02 17:21:32 +08:00
GraphRAG Storage
Basic
import asyncio

from graphrag_storage import StorageConfig, StorageType, create_storage


async def run():
    # Create a file-based storage provider that uses "output" as its base directory.
    storage = create_storage(
        StorageConfig(
            type=StorageType.File,
            base_dir="output",
        )
    )

    await storage.set("my_key", "value")
    print(await storage.get("my_key"))


if __name__ == "__main__":
    asyncio.run(run())
Custom Storage
import asyncio
from typing import Any

from graphrag_storage import Storage, StorageConfig, create_storage, register_storage


class MyStorage(Storage):
    def __init__(self, some_setting: str, optional_setting: str = "default setting", **kwargs: Any):
        # Validate settings and initialize
        ...

    # Implement the rest of the Storage interface
    ...


# Make the custom provider available to create_storage under the key "MyStorage".
register_storage("MyStorage", MyStorage)


async def run():
    storage = create_storage(
        StorageConfig(
            type="MyStorage",
            some_setting="My Setting",
        )
    )
    # Or use the factory directly to instantiate with a dict instead of using
    # StorageConfig + create_storage
    # from graphrag_storage.storage_factory import storage_factory
    # storage = storage_factory.create(strategy="MyStorage", init_args={"some_setting": "My Setting"})

    await storage.set("my_key", "value")
    print(await storage.get("my_key"))


if __name__ == "__main__":
    asyncio.run(run())
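To make the "implement the rest of the interface" step a little more concrete, here is a minimal standalone sketch of a dictionary-backed store that exposes the async `set`/`get` pair used in the examples above. It deliberately does not subclass `Storage`, since the full abstract interface is not listed in this README. The class name `DictBackedStore`, its `prefix` setting, and the choice to return `None` for a missing key are all assumptions made for illustration.

```python
import asyncio
from typing import Any


class DictBackedStore:
    """Hypothetical in-process store; NOT the graphrag_storage Storage base class."""

    def __init__(self, prefix: str = "", **kwargs: Any) -> None:
        # Validate settings up front so misconfiguration fails early.
        if not isinstance(prefix, str):
            raise TypeError("prefix must be a string")
        self._prefix = prefix
        self._items: dict[str, Any] = {}

    async def set(self, key: str, value: Any) -> None:
        # Store the value under the prefixed key.
        self._items[self._prefix + key] = value

    async def get(self, key: str) -> Any:
        # Returning None for a missing key is an assumption of this sketch,
        # not confirmed behavior of the real library.
        return self._items.get(self._prefix + key)


async def demo() -> tuple[Any, Any]:
    store = DictBackedStore(prefix="demo/")
    await store.set("my_key", "value")
    return await store.get("my_key"), await store.get("missing")


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

A real provider would subclass `Storage` and implement every method the base class requires; this sketch only mirrors the two calls shown in this README.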
Details
By default, create_storage comes with the following storage providers registered, which correspond to the entries in the StorageType enum:
- FileStorage
- AzureBlobStorage
- AzureCosmosStorage
- MemoryStorage
Preregistration happens lazily: for example, FileStorage is only imported and registered when you request a file storage via create_storage(StorageConfig(type=StorageType.File, ...)). There is no need to manually import and register built-in storage providers when using create_storage.
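The lazy preregistration described above can be sketched with a toy factory. This is an illustration of the general pattern only, not the graphrag_storage implementation: `LazyFactory`, `add_builtin`, and the `"ordered"` key are hypothetical names, and a standard-library class stands in for a storage provider.

```python
from __future__ import annotations

from typing import Any, Callable

Provider = Callable[..., Any]


class LazyFactory:
    """Toy registry; hypothetical, not the graphrag_storage implementation."""

    def __init__(self) -> None:
        self._providers: dict[str, Provider] = {}
        # Built-in keys map to loader functions that import on demand.
        self._builtin_loaders: dict[str, Callable[[], Provider]] = {}

    def add_builtin(self, key: str, loader: Callable[[], Provider]) -> None:
        self._builtin_loaders[key] = loader

    def register(self, key: str, provider: Provider) -> None:
        self._providers[key] = provider

    def __contains__(self, key: str) -> bool:
        return key in self._providers

    def create(self, key: str, init_args: dict[str, Any] | None = None) -> Any:
        if key not in self._providers:
            loader = self._builtin_loaders.get(key)
            if loader is None:
                raise ValueError(f"No provider registered for {key!r}")
            # First request for a built-in: import and register it now.
            self.register(key, loader())
        return self._providers[key](**(init_args or {}))


def load_ordered_dict() -> Provider:
    # The import is deferred until the provider is first requested.
    from collections import OrderedDict

    return OrderedDict


factory = LazyFactory()
factory.add_builtin("ordered", load_ordered_dict)

registered_before = "ordered" in factory  # nothing registered yet
instance = factory.create("ordered")      # triggers import + registration
registered_after = "ordered" in factory   # now registered
```

The benefit of this pattern is that optional dependencies of a provider are only imported if that provider is actually used.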
If you want a clean factory with no preregistered storage providers, import storage_factory directly and bypass create_storage. The downside is that storage_factory.create takes a plain dict of init args instead of the strongly typed StorageConfig used with create_storage.
from graphrag_storage.storage_factory import storage_factory
from graphrag_storage.file_storage import FileStorage
# storage_factory has no preregistered providers, so you must register any
# providers you plan to use.
# You may also register a custom implementation; see the example above.
storage_factory.register("my_storage_key", FileStorage)
storage = storage_factory.create(strategy="my_storage_key", init_args={"base_dir": "...", "other_settings": "..."})
...
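Because storage_factory.create takes a plain dict, a misspelled key in init_args is not caught by a type checker. One possible mitigation, sketched below under stated assumptions, is to validate the dict against a small dataclass before handing it to the factory. `FileSettings`, its `encoding` field, and `settings_from_dict` are hypothetical names introduced for this example and are not part of graphrag_storage.

```python
from dataclasses import dataclass, fields
from typing import Any


@dataclass
class FileSettings:
    """Hypothetical typed settings; stands in for a typed config object."""

    base_dir: str
    encoding: str = "utf-8"  # illustrative field, not a documented setting


def settings_from_dict(init_args: dict[str, Any]) -> FileSettings:
    # Reject keys that the typed settings object does not know about.
    known = {f.name for f in fields(FileSettings)}
    unknown = set(init_args) - known
    if unknown:
        raise ValueError(f"Unknown settings: {sorted(unknown)}")
    return FileSettings(**init_args)


ok = settings_from_dict({"base_dir": "output"})

try:
    settings_from_dict({"base_dri": "output"})  # typo in the key
    typo_error = None
except ValueError as exc:
    typo_error = str(exc)
```

The validated settings can then be converted back with `dataclasses.asdict` and passed as init_args, so typos fail before the provider is constructed.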