diff --git a/data/operation_dulce/dataset.zip b/data/operation_dulce/dataset.zip index c820d0bf..900d5d86 100644 Binary files a/data/operation_dulce/dataset.zip and b/data/operation_dulce/dataset.zip differ diff --git a/index.html b/index.html index 35cd5e42..abdbf685 100644 --- a/index.html +++ b/index.html @@ -301,7 +301,9 @@ Figure 1: An LLM-generated knowledge graph built using GPT-4 Turbo.

GraphRAG is a structured, hierarchical approach to Retrieval Augmented Generation (RAG), as opposed to naive semantic-search approaches using plain text snippets. The GraphRAG process involves extracting a knowledge graph from raw text, building a community hierarchy, generating summaries for these communities, and then leveraging these structures when performing RAG-based tasks.
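
The four stages named above (graph extraction, community building, community summaries, retrieval) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the function names are hypothetical, sentence co-occurrence stands in for graph extraction, connected components stand in for community detection, and a joined list of names stands in for a generated summary. It is not GraphRAG's implementation or API.

```python
from collections import defaultdict
from itertools import combinations


def extract_graph(sentences, entities):
    """Link two entities whenever they are mentioned in the same sentence."""
    graph = defaultdict(set)
    for sentence in sentences:
        present = sorted(e for e in entities if e in sentence)
        for name in present:
            graph[name]  # touching the key registers entities with no links
        for a, b in combinations(present, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def find_communities(graph):
    """Group entities into connected components (stand-in for community detection)."""
    seen, communities = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(graph[node] - members)
        seen |= members
        communities.append(sorted(members))
    return communities


def summarize(community):
    """Stand-in for a generated community summary."""
    return "Community of " + ", ".join(community)


def answer_context(question, communities):
    """Return summaries of communities that contain an entity named in the question."""
    return [summarize(c) for c in communities if any(e in question for e in c)]


# Made-up toy data, not taken from any real dataset.
sentences = ["Ada mentors Ben.", "Ben works with Cy.", "Dee manages Eli."]
entities = ["Ada", "Ben", "Cy", "Dee", "Eli"]

graph = extract_graph(sentences, entities)
communities = find_communities(graph)
context = answer_context("Who works with Cy?", communities)
```

The point of the sketch is the shape of the data flow: a question is answered from community-level summaries rather than from raw text snippets.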

To learn more about GraphRAG and how it can be used to enhance your LLM's ability to reason about your private data, please visit the Microsoft Research Blog Post.

-

Get Started 🚀

+

Solution Accelerator 🚀

+

To quickstart the GraphRAG system we recommend trying the Solution Accelerator package. This provides a user-friendly end-to-end experience with Azure resources.

+

Get Started with GraphRAG 🚀

To start using GraphRAG, check out the Get Started guide. For a deeper dive into the main sub-systems, please visit the docpages for the Indexer and Query packages.

GraphRAG vs Baseline RAG 🔍

@@ -346,6 +348,8 @@ We strongly recommend to fine-tune your prompts following the | GitHub + | + Solution Accelerator \ No newline at end of file diff --git a/posts/config/custom/index.html b/posts/config/custom/index.html index 4a67000e..dc58974b 100644 --- a/posts/config/custom/index.html +++ b/posts/config/custom/index.html @@ -485,6 +485,8 @@ a { | GitHub + | + Solution Accelerator \ No newline at end of file diff --git a/posts/config/env_vars/index.html b/posts/config/env_vars/index.html index 0a5221aa..fc5f5d6d 100644 --- a/posts/config/env_vars/index.html +++ b/posts/config/env_vars/index.html @@ -1247,6 +1247,8 @@ a { | GitHub + | + Solution Accelerator \ No newline at end of file diff --git a/posts/config/init/index.html b/posts/config/init/index.html index 131c4d89..930cc414 100644 --- a/posts/config/init/index.html +++ b/posts/config/init/index.html @@ -339,6 +339,8 @@ a { | GitHub + | + Solution Accelerator \ No newline at end of file diff --git a/posts/config/json_yaml/index.html b/posts/config/json_yaml/index.html index 4df671b9..089188c4 100644 --- a/posts/config/json_yaml/index.html +++ b/posts/config/json_yaml/index.html @@ -492,6 +492,8 @@ API_KEY=some_api_key | GitHub + | + Solution Accelerator \ No newline at end of file diff --git a/posts/config/overview/index.html b/posts/config/overview/index.html index 2c9a08f8..77c3a370 100644 --- a/posts/config/overview/index.html +++ b/posts/config/overview/index.html @@ -316,6 +316,8 @@ a { | GitHub + | + Solution Accelerator \ No newline at end of file diff --git a/posts/config/template/index.html b/posts/config/template/index.html index 8ede6aef..9e861f23 100644 --- a/posts/config/template/index.html +++ b/posts/config/template/index.html @@ -475,6 +475,8 @@ the --root parameter on your Indexing Pipeline execution.

| GitHub + | + Solution Accelerator \ No newline at end of file diff --git a/posts/developing/index.html b/posts/developing/index.html index 203d9b55..e643c723 100644 --- a/posts/developing/index.html +++ b/posts/developing/index.html @@ -404,6 +404,8 @@ to reduce concurrency. Please refer to the Configur | GitHub + | + Solution Accelerator \ No newline at end of file diff --git a/posts/get_started/index.html b/posts/get_started/index.html index 6c7bae81..98c2fbba 100644 --- a/posts/get_started/index.html +++ b/posts/get_started/index.html @@ -295,7 +295,9 @@ a {

👉 Use the GraphRAG Accelerator solution
👉 Install from pypi.
👉 Use it from source

-

Top-Level Packages

+

Quickstart

+

To get started with the GraphRAG system we recommend trying the Solution Accelerator package. This provides a user-friendly end-to-end experience with Azure resources.

+

Top-Level Modules

Indexing Pipeline Overview
Query Engine Overview

Overview

@@ -304,9 +306,9 @@ It shows how to use the system to index some text, and then use the indexed data

Install GraphRAG

-
pip install graphrag
+
pip install graphrag
-
@@ -315,18 +317,18 @@ It shows how to use the system to index some text, and then use the indexed data

First let's get a sample dataset ready:

-
mkdir -p ./ragtest/input
+
mkdir -p ./ragtest/input
-

Now let's get a copy of A Christmas Carol by Charles Dickens from a trusted source:

-
curl https://www.gutenberg.org/cache/epub/24022/pg24022.txt > ./ragtest/input/book.txt
+
curl https://www.gutenberg.org/cache/epub/24022/pg24022.txt > ./ragtest/input/book.txt
-
@@ -337,9 +339,9 @@ It shows how to use the system to index some text, and then use the indexed data Since we have already configured a directory named .ragtest` in the previous step, we can run the following command:

-
python -m graphrag.index --init --root ./ragtest
+
python -m graphrag.index --init --root ./ragtest
-
@@ -356,12 +358,12 @@ Since we have already configured a directory named .ragtest` in the previous ste

In addition, Azure OpenAI users should set the following variables in the settings.yaml file. To find the appropriate sections, search for the llm: configuration. You should see two sections, one for the chat endpoint and one for the embeddings endpoint. Here is an example of how to configure the chat endpoint:

-
type: azure_openai_chat # Or azure_openai_embedding for embeddings
+  
type: azure_openai_chat # Or azure_openai_embedding for embeddings
 api_base: https://<instance>.openai.azure.com
 api_version: 2024-02-15-preview # You can customize this for other versions
 deployment_name: <azure_model_deployment_name>
-
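
A small check like the following can catch placeholders that were left unfilled before the pipeline runs. It is a minimal sketch: the helper name `missing_azure_settings` is hypothetical, it works on a plain dict rather than reading settings.yaml, and it assumes that the four keys shown above are the ones to verify.

```python
# Keys taken from the Azure OpenAI example above.
EXPECTED_AZURE_KEYS = ("type", "api_base", "api_version", "deployment_name")


def missing_azure_settings(llm_settings):
    """Return expected keys that are absent, empty, or still hold a <placeholder>."""
    missing = []
    for key in EXPECTED_AZURE_KEYS:
        value = str(llm_settings.get(key, "")).strip()
        if not value or "<" in value:
            missing.append(key)
    return missing


# Example values; "my-deployment" is an illustrative name, not a real deployment.
llm_settings = {
    "type": "azure_openai_chat",
    "api_base": "https://<instance>.openai.azure.com",  # placeholder left in
    "api_version": "2024-02-15-preview",
    "deployment_name": "my-deployment",
}

problems = missing_azure_settings(llm_settings)
```

Here `problems` lists `api_base`, because the `<instance>` placeholder was never replaced.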
@@ -374,9 +376,9 @@ Since we have already configured a directory named .ragtest` in the previous ste

Finally we'll run the pipeline!

-
python -m graphrag.index --root ./ragtest
+
python -m graphrag.index --root ./ragtest
-
@@ -389,24 +391,24 @@ Once the pipeline is complete, you should see a new folder called ./ragtes

Here is an example using Global search to ask a high-level question:

-
python -m graphrag.query \
+  
python -m graphrag.query \
 --root ./ragtest \
 --method global \
 "What are the top themes in this story?"
-

Here is an example using Local search to ask a more specific question about a particular character:

-
python -m graphrag.query \
+  
python -m graphrag.query \
 --root ./ragtest \
 --method local \
 "Who is Scrooge, and what are his main relationships?"
-
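
The two query commands above differ only in the `--method` value and the question, so they can be wrapped in a small helper when scripting several queries. This is a sketch under stated assumptions: `build_query_command` and `run_query` are hypothetical names, and only the module name and the `--root` and `--method` flags shown above are used.

```python
import subprocess
import sys


def build_query_command(question, method="global", root="./ragtest"):
    """Build the argument list for the query command shown above."""
    if method not in ("global", "local"):
        # Only the two methods shown in this guide are accepted by this sketch.
        raise ValueError(f"unsupported method: {method!r}")
    return [
        sys.executable, "-m", "graphrag.query",
        "--root", root,
        "--method", method,
        question,
    ]


def run_query(question, method="global", root="./ragtest"):
    """Run the query in a subprocess and return whatever it prints to stdout."""
    completed = subprocess.run(
        build_query_command(question, method=method, root=root),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout


command = build_query_command(
    "Who is Scrooge, and what are his main relationships?", method="local"
)
```

Passing the arguments as a list avoids shell quoting problems with questions that contain quotes or commas. Calling `run_query` requires graphrag to be installed and the index under `./ragtest` to have been built.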
@@ -428,6 +430,8 @@ Once the pipeline is complete, you should see a new folder called ./ragtes | GitHub + | + Solution Accelerator \ No newline at end of file diff --git a/posts/index/0-architecture/index.html b/posts/index/0-architecture/index.html index 9ad97b71..12185b9d 100644 --- a/posts/index/0-architecture/index.html +++ b/posts/index/0-architecture/index.html @@ -336,6 +336,8 @@ This allows our indexer to be more resilient to network issues, to act idempoten | GitHub + | + Solution Accelerator \ No newline at end of file diff --git a/posts/index/1-default_dataflow/index.html b/posts/index/1-default_dataflow/index.html index f88b7a69..7437a4b0 100644 --- a/posts/index/1-default_dataflow/index.html +++ b/posts/index/1-default_dataflow/index.html @@ -377,6 +377,8 @@ Entities and Relationships are extracted at once in our entity_extract | GitHub + | + Solution Accelerator \ No newline at end of file diff --git a/posts/index/2-cli/index.html b/posts/index/2-cli/index.html index 2bce1683..bb0abafb 100644 --- a/posts/index/2-cli/index.html +++ b/posts/index/2-cli/index.html @@ -328,6 +328,8 @@ a { | GitHub + | + Solution Accelerator \ No newline at end of file diff --git a/posts/index/overview/index.html b/posts/index/overview/index.html index 1cbda3d3..6ad236ca 100644 --- a/posts/index/overview/index.html +++ b/posts/index/overview/index.html @@ -381,6 +381,8 @@ pipeline_result = outputs | GitHub + | + Solution Accelerator \ No newline at end of file diff --git a/posts/prompt_tuning/auto_prompt_tuning/index.html b/posts/prompt_tuning/auto_prompt_tuning/index.html index 8d8f46ea..5790cb01 100644 --- a/posts/prompt_tuning/auto_prompt_tuning/index.html +++ b/posts/prompt_tuning/auto_prompt_tuning/index.html @@ -386,6 +386,8 @@ After that, it uses one of the following selection methods to pick a sample to w | GitHub + | + Solution Accelerator \ No newline at end of file diff --git a/posts/prompt_tuning/manual_prompt_tuning/index.html 
b/posts/prompt_tuning/manual_prompt_tuning/index.html index b6365d0d..20ff9833 100644 --- a/posts/prompt_tuning/manual_prompt_tuning/index.html +++ b/posts/prompt_tuning/manual_prompt_tuning/index.html @@ -346,6 +346,8 @@ The default value is

| GitHub + | + Solution Accelerator \ No newline at end of file diff --git a/posts/prompt_tuning/overview/index.html b/posts/prompt_tuning/overview/index.html index 59f70193..f6842c9f 100644 --- a/posts/prompt_tuning/overview/index.html +++ b/posts/prompt_tuning/overview/index.html @@ -293,7 +293,7 @@ a {

Default Prompts

The default prompts are the simplest way to get started with the GraphRAG system. They are designed to work out-of-the-box with minimal configuration. You can find more detail about these prompts in the following links:

    -
  • [Entity/Relationship Extraction] (http://github.com/microsoft/graphrag/blob/main/graphrag/index/graph/extractors/graph/prompts.py)
  • +
  • Entity/Relationship Extraction
  • Entity/Relationship Description Summarization
  • Claim Extraction
  • Community Reports
  • @@ -319,6 +319,8 @@ a { | GitHub + | + Solution Accelerator \ No newline at end of file diff --git a/posts/query/0-global_search/index.html b/posts/query/0-global_search/index.html index 65c31971..dcfcb921 100644 --- a/posts/query/0-global_search/index.html +++ b/posts/query/0-global_search/index.html @@ -332,6 +332,8 @@ a { | GitHub + | + Solution Accelerator \ No newline at end of file diff --git a/posts/query/1-local_search/index.html b/posts/query/1-local_search/index.html index 56f9f18c..39fa44e9 100644 --- a/posts/query/1-local_search/index.html +++ b/posts/query/1-local_search/index.html @@ -324,6 +324,8 @@ a { | GitHub + | + Solution Accelerator \ No newline at end of file diff --git a/posts/query/2-question_generation/index.html b/posts/query/2-question_generation/index.html index a0bb7cce..219e64ff 100644 --- a/posts/query/2-question_generation/index.html +++ b/posts/query/2-question_generation/index.html @@ -322,6 +322,8 @@ a { | GitHub + | + Solution Accelerator \ No newline at end of file diff --git a/posts/query/3-cli/index.html b/posts/query/3-cli/index.html index 877078f7..e382fae7 100644 --- a/posts/query/3-cli/index.html +++ b/posts/query/3-cli/index.html @@ -350,6 +350,8 @@ a { | GitHub + | + Solution Accelerator \ No newline at end of file diff --git a/posts/query/notebooks/global_search_nb/index.html b/posts/query/notebooks/global_search_nb/index.html index 975255bb..65d68c25 100644 --- a/posts/query/notebooks/global_search_nb/index.html +++ b/posts/query/notebooks/global_search_nb/index.html @@ -487,6 +487,8 @@ result.context_data | GitHub + | + Solution Accelerator \ No newline at end of file diff --git a/posts/query/notebooks/local_search_nb/index.html b/posts/query/notebooks/local_search_nb/index.html index 3f87ebb7..2d9fe95b 100644 --- a/posts/query/notebooks/local_search_nb/index.html +++ b/posts/query/notebooks/local_search_nb/index.html @@ -647,6 +647,8 @@ candidate_questions = | GitHub + | + Solution Accelerator \ No newline at end 
of file diff --git a/posts/query/notebooks/overview/index.html b/posts/query/notebooks/overview/index.html index ace296db..28539fb1 100644 --- a/posts/query/notebooks/overview/index.html +++ b/posts/query/notebooks/overview/index.html @@ -312,6 +312,8 @@ a { | GitHub + | + Solution Accelerator \ No newline at end of file diff --git a/posts/query/overview/index.html b/posts/query/overview/index.html index 671b156b..b121f608 100644 --- a/posts/query/overview/index.html +++ b/posts/query/overview/index.html @@ -322,6 +322,8 @@ It is responsible for the following tasks:

    | GitHub + | + Solution Accelerator \ No newline at end of file