mirror of
https://github.com/microsoft/graphrag.git
synced 2026-01-29 23:31:51 +08:00
* Initial plan
* Refactor VectorStoreFactory to use registration functionality like StorageFactory
  Co-authored-by: jgbradley1 <654554+jgbradley1@users.noreply.github.com>
* Fix linting issues in VectorStoreFactory refactoring
  Co-authored-by: jgbradley1 <654554+jgbradley1@users.noreply.github.com>
* Remove backward compatibility support from VectorStoreFactory and StorageFactory
  Co-authored-by: jgbradley1 <654554+jgbradley1@users.noreply.github.com>
* Run ruff check --fix and ruff format, add semversioner file
  Co-authored-by: jgbradley1 <654554+jgbradley1@users.noreply.github.com>
* ruff formatting fixes
* Fix pytest errors in storage factory tests by updating PipelineStorage interface implementation
  Co-authored-by: jgbradley1 <654554+jgbradley1@users.noreply.github.com>
* ruff formatting fixes
* update storage factory design
* Refactor CacheFactory to use registration functionality like StorageFactory
  Co-authored-by: jgbradley1 <654554+jgbradley1@users.noreply.github.com>
* revert copilot changes
* fix copilot changes
* update comments
* Fix failing pytest compatibility for factory tests
  Co-authored-by: jgbradley1 <654554+jgbradley1@users.noreply.github.com>
* update class instantiation issue
* ruff fixes
* fix pytest
* add default value
* ruff formatting changes
* ruff fixes
* revert minor changes
* cleanup cache factory
* Update CacheFactory tests to match consistent factory pattern
  Co-authored-by: jgbradley1 <654554+jgbradley1@users.noreply.github.com>
* update pytest thresholds
* adjust threshold levels
* Add custom vector store implementation notebook
  Create comprehensive notebook demonstrating how to implement and register custom vector stores with GraphRAG as a plug-and-play framework.
  Includes:
  - Complete implementation of SimpleInMemoryVectorStore
  - Registration with VectorStoreFactory
  - Testing and validation examples
  - Configuration examples for GraphRAG settings
  - Advanced features and best practices
  - Production considerations checklist
  The notebook provides a complete walkthrough for developers to understand and implement their own vector store backends.
  Co-authored-by: jgbradley1 <654554+jgbradley1@users.noreply.github.com>
* remove sample notebook for now
* update tests
* fix cache pytests
* add pandas-stub to dev dependencies
* disable warning check for well known key
* skip tests when running on ubuntu
* add documentation for custom vector store implementations
* ignore ruff findings in notebooks
* fix merge breakages
* speedup CLI import statements
* remove unnecessary import statements in init file
* Add str type option on storage/cache type
* Fix store name
* Add LoggerFactory
* Fix up logging setup across CLI/API
* Add LoggerFactory test
* Fix err message
* Semver
* Remove enums from factory methods
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: jgbradley1 <654554+jgbradley1@users.noreply.github.com>
Co-authored-by: Josh Bradley <joshbradley@microsoft.com>
Co-authored-by: Nathan Evans <github@talkswithnumbers.com>
75 lines
2.3 KiB
Python
# Copyright (c) 2024 Microsoft Corporation.
# Licensed under the MIT License

import asyncio
import os
import unittest

from graphrag.cache.json_pipeline_cache import JsonPipelineCache
from graphrag.storage.file_pipeline_storage import (
    FilePipelineStorage,
)

TEMP_DIR = "./.tmp"


def create_cache():
    # File-backed storage rooted at .tmp under the current working directory,
    # which is the same location TEMP_DIR refers to.
    storage = FilePipelineStorage(base_dir=os.path.join(os.getcwd(), ".tmp"))
    return JsonPipelineCache(storage)


class TestFilePipelineCache(unittest.IsolatedAsyncioTestCase):
    def setUp(self):
        self.cache = create_cache()

    def tearDown(self):
        # tearDown is synchronous, so run the clear() coroutine to completion here.
        asyncio.run(self.cache.clear())

    async def test_cache_clear(self):
        # Create a cache directory
        if not os.path.exists(TEMP_DIR):
            os.mkdir(TEMP_DIR)
        with open(f"{TEMP_DIR}/test1", "w") as f:
            f.write("This is test1 file.")
        with open(f"{TEMP_DIR}/test2", "w") as f:
            f.write("This is test2 file.")

        # Clear the cache; both files written above should be removed
        await self.cache.clear()

        # Check if the cache directory is empty
        files = os.listdir(TEMP_DIR)
        assert len(files) == 0

    async def test_child_cache(self):
        await self.cache.set("test1", "test1")
        assert os.path.exists(f"{TEMP_DIR}/test1")

        # Creating a child cache is expected to create a matching subdirectory
        child = self.cache.child("test")
        assert os.path.exists(f"{TEMP_DIR}/test")

        await child.set("test2", "test2")
        assert os.path.exists(f"{TEMP_DIR}/test/test2")

        # Deleting a key from the parent removes its file
        await self.cache.set("test1", "test1")
        await self.cache.delete("test1")
        assert not os.path.exists(f"{TEMP_DIR}/test1")

    async def test_cache_has(self):
        test1 = "this is a test file"
        await self.cache.set("test1", test1)

        assert await self.cache.has("test1")
        assert not await self.cache.has("NON_EXISTENT")
        assert await self.cache.get("NON_EXISTENT") is None

    async def test_get_set(self):
        # The backslash-heavy values check that escaping survives a set/get round trip
        test1 = "this is a test file"
        test2 = "\\n test"
        test3 = "\\\\\\"
        await self.cache.set("test1", test1)
        await self.cache.set("test2", test2)
        await self.cache.set("test3", test3)
        assert await self.cache.get("test1") == test1
        assert await self.cache.get("test2") == test2
        assert await self.cache.get("test3") == test3
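The tests above exercise six cache operations: set, get, has, delete, clear, and child. As a rough illustration of that contract, here is a minimal in-memory sketch that can be run without graphrag or a filesystem. The class name `InMemoryJsonCache`, the `{"result": ...}` JSON envelope, and the prefix-based child namespacing are all assumptions made for this example; they are not taken from graphrag and may differ from how `JsonPipelineCache` behaves.

```python
import asyncio
import json
from typing import Any, Dict, Optional


class InMemoryJsonCache:
    """Hypothetical stand-in for a JSON-backed cache; not part of graphrag."""

    def __init__(
        self, store: Optional[Dict[str, str]] = None, prefix: str = ""
    ) -> None:
        # Parent and children share one dict; the prefix plays the role
        # that a subdirectory plays in a file-backed cache.
        self._store: Dict[str, str] = {} if store is None else store
        self._prefix = prefix

    def _key(self, key: str) -> str:
        return f"{self._prefix}{key}"

    async def set(self, key: str, value: Any) -> None:
        # Serialize to JSON text (assumed envelope) before storing.
        self._store[self._key(key)] = json.dumps({"result": value})

    async def get(self, key: str) -> Optional[Any]:
        raw = self._store.get(self._key(key))
        return None if raw is None else json.loads(raw)["result"]

    async def has(self, key: str) -> bool:
        return self._key(key) in self._store

    async def delete(self, key: str) -> None:
        self._store.pop(self._key(key), None)

    async def clear(self) -> None:
        # Remove every entry under this cache's prefix, children included.
        for k in [k for k in self._store if k.startswith(self._prefix)]:
            del self._store[k]

    def child(self, name: str) -> "InMemoryJsonCache":
        return InMemoryJsonCache(self._store, f"{self._prefix}{name}/")


async def demo() -> Dict[str, Any]:
    cache = InMemoryJsonCache()
    # Same backslash-heavy values as test_get_set.
    await cache.set("test2", "\\n test")
    await cache.set("test3", "\\\\\\")
    child = cache.child("test")
    await child.set("test2", "child value")
    return {
        "test2": await cache.get("test2"),
        "test3": await cache.get("test3"),
        "child_test2": await child.get("test2"),
        "missing": await cache.get("NON_EXISTENT"),
        "has_missing": await cache.has("NON_EXISTENT"),
    }


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the parent and child keep separate values under the same key name, and a missing key yields `None` from `get` and `False` from `has`, which mirrors what the assertions in `test_cache_has` and `test_child_cache` check against the real cache.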