mirror of
https://github.com/microsoft/graphrag.git
synced 2026-01-14 09:07:20 +08:00
Some checks failed
gh-pages / build (push) Has been cancelled
Python CI / python-ci (ubuntu-latest, 3.10) (push) Has been cancelled
Python CI / python-ci (ubuntu-latest, 3.11) (push) Has been cancelled
Python CI / python-ci (windows-latest, 3.10) (push) Has been cancelled
Python CI / python-ci (windows-latest, 3.11) (push) Has been cancelled
Python Integration Tests / python-ci (ubuntu-latest, 3.10) (push) Has been cancelled
Python Integration Tests / python-ci (windows-latest, 3.10) (push) Has been cancelled
Python Notebook Tests / python-ci (ubuntu-latest, 3.10) (push) Has been cancelled
Python Notebook Tests / python-ci (windows-latest, 3.10) (push) Has been cancelled
Python Publish (pypi) / Upload release to PyPI (push) Has been cancelled
Python Smoke Tests / python-ci (ubuntu-latest, 3.10) (push) Has been cancelled
Python Smoke Tests / python-ci (windows-latest, 3.10) (push) Has been cancelled
Spellcheck / spellcheck (push) Has been cancelled
334 lines
13 KiB
Markdown
334 lines
13 KiB
Markdown
# Changelog
|
|
Note: version releases in the 0.x.y range may introduce breaking changes.
|
|
|
|
## 2.2.1
|
|
|
|
- patch: Fix Community Report prompt tuning response
|
|
- patch: Fix graph creation missing edge weights.
|
|
- patch: Update as workflows
|
|
|
|
## 2.2.0
|
|
|
|
- minor: Support OpenAI reasoning models.
|
|
- patch: Add option to snapshot raw extracted graph tables.
|
|
- patch: Added batching logic to the prompt tuning autoselection embeddings workflow
|
|
- patch: Align config classes and docs better.
|
|
- patch: Align embeddings table loading with configured fields.
|
|
- patch: Brings parity with our latest NLP extraction approaches.
|
|
- patch: Fix fnllm to 0.2.3
|
|
- patch: Fixes to basic search.
|
|
- patch: Update llm args for consistency.
|
|
- patch: add vector store integration tests
|
|
|
|
## 2.1.0
|
|
|
|
- minor: Add support for JSON input files.
|
|
- minor: Updated the prompt tunning client to support csv-metadata injection and updated output file types to match the new naming convention.
|
|
- patch: Add check for custom model types while config loading
|
|
- patch: Adds general-purpose pipeline run state object.
|
|
|
|
## 2.0.0
|
|
|
|
- major: Add children to communities to avoid re-compute.
|
|
- major: Reorganize and rename workflows and their outputs.
|
|
- major: Rework API to accept callbacks.
|
|
- minor: Add LMM Manager and Factory, to support provider registration
|
|
- minor: Add NLP graph extraction.
|
|
- minor: Add pipeline_start and pipeline_end callbacks.
|
|
- minor: Move embeddings snapshots to the workflow runner.
|
|
- minor: Remove config inheritance, hydration, and automatic env var overlays.
|
|
- minor: Rework the update output storage structure.
|
|
- patch: Add caching to NLP extractor.
|
|
- patch: Add vector store id reference to embeddings config.
|
|
- patch: Export NLP community reports prompt.
|
|
- patch: Fix DRIFT search on Azure AI Search.
|
|
- patch: Fix StopAsyncIteration catch.
|
|
- patch: Fix missing embeddings workflow in FastGraphRAG.
|
|
- patch: Fix proper use of n_depth for drift search
|
|
- patch: Fix report generation recursion.
|
|
- patch: Fix summarization over large datasets for inc indexing. Fix relationship summarization
|
|
- patch: Optimize data iteration by removing some iterrows from code
|
|
- patch: Patch json mode for community reports
|
|
- patch: Properly increment text unit IDs during updates.
|
|
- patch: Refactor config defaults from constants to type-safe, hierarchical dataclass.
|
|
- patch: Require explicit azure auth settings when using AOI.
|
|
- patch: Separates graph pruning for differential usage.
|
|
- patch: Tuck flow functions under their workflow modules.
|
|
- patch: Update fnllm. Remove unused libs.
|
|
- patch: Use ModelProvider for query module
|
|
- patch: Use shared schema for final outputs.
|
|
- patch: add dynamic retry logic.
|
|
- patch: add option to prepend metadata into chunks
|
|
- patch: cleanup query code duplication.
|
|
- patch: implemented multi-index querying for api layer
|
|
- patch: multi index query cli support
|
|
- patch: remove unused columns and change property document_attribute_columns to metadata
|
|
- patch: update multi-index query to support new workflows
|
|
|
|
## 1.2.0
|
|
|
|
- minor: Add Drift Reduce response and streaming endpoint
|
|
- minor: add cosmosdb vector store
|
|
- patch: Fix example notebooks
|
|
- patch: Set default rate limits.
|
|
- patch: unit tests for text_splitting
|
|
|
|
## 1.2.0
|
|
|
|
- patch: Basic Rag minor fix
|
|
|
|
## 1.1.1
|
|
|
|
- patch: Fix a bug on creating community hierarchy for dynamic search
|
|
- patch: Increase LOCAL_SEARCH_COMMUNITY_PROP to 15%
|
|
|
|
## 1.1.0
|
|
|
|
- minor: Make gleanings independent of encoding
|
|
- minor: Remove DataShaper (first steps).
|
|
- minor: Remove old pipeline runner.
|
|
- minor: new search implemented as a new option for the api
|
|
- patch: Fix gleanings loop check
|
|
- patch: Implement cosmosdb storage option for cache and output
|
|
- patch: Move extractor code to co-locate with operations.
|
|
- patch: Remove config input models.
|
|
- patch: Ruff update
|
|
- patch: Simplify and streamline internal config.
|
|
- patch: Simplify callbacks model.
|
|
- patch: Streamline flows.
|
|
- patch: fix instantiation of storage classes.
|
|
|
|
## 1.0.1
|
|
|
|
- patch: Fix encoding model config parsing
|
|
- patch: Fix exception on error callbacks
|
|
- patch: Manage llm instances inside a cached singleton. Check for empty dfs after entity/relationship extraction
|
|
- patch: Respect encoding_model option
|
|
|
|
## 1.0.0
|
|
|
|
- patch: Add Parent id to communities data model
|
|
- patch: Add migration notebook.
|
|
- patch: Create separate community workflow, collapse subflows.
|
|
- patch: Dependency Updates
|
|
- patch: cleanup and refactor factory classes.
|
|
|
|
## 0.9.0
|
|
|
|
- minor: Refactor graph creation.
|
|
- patch: Dependency updates
|
|
- patch: Fix Global Search with dynamic Community selection bug
|
|
- patch: Fix question gen.
|
|
- patch: Optimize Final Community Reports calculation and stabilize cache
|
|
- patch: miscellaneous code cleanup and minor changes for better alignment of style across the codebase.
|
|
- patch: replace llm package with fnllm
|
|
- patch: replaced md5 hash with sha256
|
|
- patch: replaced md5 hash with sha512
|
|
- patch: update API and add a demonstration notebook
|
|
|
|
## 0.5.0
|
|
|
|
- minor: Data model changes.
|
|
- patch: Add Parquet as part of the default emitters when not pressent
|
|
- patch: Centralized prompts and export all for easier injection.
|
|
- patch: Cleanup of artifact outputs/schemas.
|
|
- patch: Config and docs updates.
|
|
- patch: Implement dynamic community selection to global search
|
|
- patch: fix autocompletion of existing files/directory paths.
|
|
- patch: move import statements out of init files
|
|
|
|
## 0.4.1
|
|
|
|
- patch: Add update cli entrypoint for incremental indexing
|
|
- patch: Allow some CI/CD jobs to skip PRs dedicated to doc updates only.
|
|
- patch: Fix a file paths issue in the viz guide.
|
|
- patch: Fix optional covariates update in incremental indexing
|
|
- patch: Raise error on empty deltas for inc indexing
|
|
- patch: add visualization guide to doc site
|
|
- patch: fix streaming output error
|
|
|
|
## 0.4.0
|
|
|
|
- minor: Add Incremental Indexing
|
|
- minor: Added DRIFT graph reasoning query module
|
|
- minor: embeddings moved to a different workflow
|
|
- patch: Add DRIFT search cli and example notebook
|
|
- patch: Add config for incremental updates
|
|
- patch: Add embeddings to subflow.
|
|
- patch: Add naive community merge using time period
|
|
- patch: Add relationship merge
|
|
- patch: Add runtime-only storage option.
|
|
- patch: Add text units update
|
|
- patch: Allow empty workflow returns to avoid disk writing.
|
|
- patch: Apply pandas optimizations to create final entities
|
|
- patch: Calculate new inputs and deleted inputs on update
|
|
- patch: Collapse covariates flow.
|
|
- patch: Collapse create-base-entity-graph.
|
|
- patch: Collapse create-final-community-reports.
|
|
- patch: Collapse create-final-documents.
|
|
- patch: Collapse create-final-entities.
|
|
- patch: Collapse create-final-nodes.
|
|
- patch: Collapse create_base_documents.
|
|
- patch: Collapse create_base_text_units.
|
|
- patch: Collapse create_final_relationships.
|
|
- patch: Collapse entity extraction.
|
|
- patch: Collapse entity summarize.
|
|
- patch: Collapse intermediate workflow outputs.
|
|
- patch: Dependency updates
|
|
- patch: Extract DataShaper-less flows.
|
|
- patch: Fix Community ID loading for DRIFT search over existing indexes
|
|
- patch: Fix embeddings faulty assignments
|
|
- patch: Fix init defaults for vector store and drift img in docs
|
|
- patch: Fix nested json parsing
|
|
- patch: Fix some edge cases on Drift Search over small input sets
|
|
- patch: Fix var name for embedding
|
|
- patch: Merge existing and new entities, updating values accordingly
|
|
- patch: Merge text_embed into create-final-relationships subflow.
|
|
- patch: Move embedding verbs to operations.
|
|
- patch: Moving verbs around.
|
|
- patch: Optimize Create Base Documents subflow
|
|
- patch: Optimize text unit relationship count
|
|
- patch: Perf optimizations in map_query_to_entities()
|
|
- patch: Remove aggregate_df from final coomunities and final text units
|
|
- patch: Remove duplicated relationships and nodes
|
|
- patch: Remove unused column from final entities
|
|
- patch: Reorganized api,reporter,callback code into separate components. Defined debug profiles.
|
|
- patch: Small cleanup in community context history building
|
|
- patch: Transient entity graph and snapshotting.
|
|
- patch: Update Incremental Indexing to new embeddings workflow
|
|
- patch: Use mkdocs for documentation
|
|
- patch: add backwards compatibility patch to vector store.
|
|
- patch: add-autogenerated-cli-docs
|
|
- patch: fix docs image path
|
|
- patch: refactor use of vector stores and update support for managed identity
|
|
- patch: remove redundant error-handling code from global-search
|
|
- patch: reorganize cli layer
|
|
|
|
## 0.3.6
|
|
|
|
- patch: Collapse create_final_relationships.
|
|
- patch: Dependency update and cleanup
|
|
|
|
## 0.3.5
|
|
|
|
- patch: Add compound verbs with tests infra.
|
|
- patch: Collapse create_final_communities.
|
|
- patch: Collapse create_final_text_units.
|
|
- patch: Covariate verb collapse.
|
|
- patch: Fix duplicates in community context builder
|
|
- patch: Fix prompt tune output path
|
|
- patch: Fix seed hardcoded init
|
|
- patch: Fix seeded random gen on clustering
|
|
- patch: Improve logging.
|
|
- patch: Set default values for cli parameters.
|
|
- patch: Use static output directories.
|
|
|
|
## 0.3.4
|
|
|
|
- patch: Deep copy txt units on local search to avoid race conditions
|
|
- patch: Fix summarization including empty descriptions
|
|
|
|
## 0.3.3
|
|
|
|
- patch: Add entrypoints for incremental indexing
|
|
- patch: Clean up and organize run index code
|
|
- patch: Consistent config loading. Resolves #99 and Resolves #1049
|
|
- patch: Fix circular dependency when running prompt tune api directly
|
|
- patch: Fix default settings for embedding
|
|
- patch: Fix img for auto tune
|
|
- patch: Fix img width
|
|
- patch: Fixed a bug in prompt tuning process
|
|
- patch: Refactor text unit build at local search
|
|
- patch: Update Prompt Tuning docs
|
|
- patch: Update create_pipeline_config.py
|
|
- patch: Update prompt tune command in docs
|
|
- patch: add querying from azure blob storage
|
|
- patch: fix setting base_dir to full paths when not using file system.
|
|
- patch: fix strategy config in entity_extraction
|
|
|
|
## 0.3.2
|
|
|
|
- patch: Add context data to query API responses.
|
|
- patch: Add missing config parameter documentation for prompt tuning
|
|
- patch: Add neo4j community notebook
|
|
- patch: Ensure entity types to be str when running prompt tuning
|
|
- patch: Fix weight casting during graph extraction
|
|
- patch: Patch "past" dependency issues
|
|
- patch: Update developer guide.
|
|
- patch: Update query type hints.
|
|
- patch: change-lancedb-placement
|
|
|
|
## 0.3.1
|
|
|
|
- patch: Add preflight check to check LLM connectivity.
|
|
- patch: Add streaming support for local/global search to query cli
|
|
- patch: Add support for both float and int on schema validation for community report generation
|
|
- patch: Avoid running index on gh-pages publishing
|
|
- patch: Implement Index API
|
|
- patch: Improves filtering for data dir inferring
|
|
- patch: Update to nltk 3.9.1
|
|
|
|
## 0.3.0
|
|
|
|
- minor: Implement auto templating API.
|
|
- minor: Implement query engine API.
|
|
- patch: Fix file dumps using json for non ASCII chars
|
|
- patch: Stabilize smoke tests for query context building
|
|
- patch: fix query embedding
|
|
- patch: fix sort_context & max_tokens params in verb
|
|
|
|
## 0.2.2
|
|
|
|
- patch: Add a check if there is no community record added in local search context
|
|
- patch: Add sepparate workflow for Python Tests
|
|
- patch: Docs updates
|
|
- patch: Run smoke tests on 4o
|
|
|
|
## 0.2.1
|
|
|
|
- patch: Added default columns for vector store at create_pipeline_config. No change for other cases.
|
|
- patch: Change json parsing error in the map step of global search to warning
|
|
- patch: Fix Local Search breaking when loading Embeddings input. Defaulting overwrite to True as in the rest of the vector store config
|
|
- patch: Fix json parsing when LLM returns faulty responses
|
|
- patch: Fix missing community reports and refactor community context builder
|
|
- patch: Fixed a bug that erased the vector database, added a new parameter to specify the config file path, and updated the documentation accordingly.
|
|
- patch: Try parsing json before even repairing
|
|
- patch: Update Prompt Tuning meta prompts with finer examples
|
|
- patch: Update default entity extraction and gleaning prompts to reduce hallucinations
|
|
- patch: add encoding-model to entity/claim extraction config
|
|
- patch: add encoding-model to text chunking config
|
|
- patch: add user prompt to history-tracking llm
|
|
- patch: update config reader to allow for zero gleans
|
|
- patch: update config-reader to allow for empty chunk-by arrays
|
|
- patch: update history-tracking LLm to use 'assistant' instead of 'system' in output history.
|
|
- patch: use history argument in hash key computation; add history input to cache data
|
|
|
|
## 0.2.0
|
|
|
|
- minor: Add content-based KNN for selecting prompt tune few shot examples
|
|
- minor: Add dynamic community report rating to the prompt tuning engine
|
|
- patch: Add Minute-based Rate Limiting and fix rpm, tpm settings
|
|
- patch: Add N parameter support
|
|
- patch: Add cli flag to overlay default values onto a provided config.
|
|
- patch: Add exception handling on file load
|
|
- patch: Add language support to prompt tuning
|
|
- patch: Add llm params to local and global search
|
|
- patch: Fix broken prompt tuning link on docs
|
|
- patch: Fix delta none on query calls
|
|
- patch: Fix docsite base url
|
|
- patch: Fix encoding model parameter on prompt tune
|
|
- patch: Fix for --limit exceeding the dataframe length
|
|
- patch: Fix for Ruff 0.5.2
|
|
- patch: Fixed an issue where base OpenAI embeddings can't work with Azure OpenAI LLM
|
|
- patch: Modify defaults for CHUNK_SIZE, CHUNK_OVERLAP and GLEANINGS to reduce time and LLM calls
|
|
- patch: fix community_report doesn't work in settings.yaml
|
|
- patch: fix llm response content is None in query
|
|
- patch: fix the organization parameter is ineffective during queries
|
|
- patch: remove duplicate file read
|
|
- patch: support non-open ai model config to prompt tune
|
|
- patch: use binary io processing for all file io operations
|
|
|
|
## 0.1.0
|
|
|
|
- minor: Initial Release
|