diff --git a/data/operation_dulce/dataset.zip b/data/operation_dulce/dataset.zip
index c70bab82..50df500f 100644
Binary files a/data/operation_dulce/dataset.zip and b/data/operation_dulce/dataset.zip differ
diff --git a/index.html b/index.html
index 57f0fc47..a948d659 100644
--- a/index.html
+++ b/index.html
@@ -292,7 +292,7 @@ For a deeper dive into the main sub-systems, please visit the docpages for the <
To address this, the tech community is working to develop methods that extend and enhance RAG. Microsoft Research’s new approach, GraphRAG, uses LLMs to create a knowledge graph based on an input corpus. This graph, along with community summaries and graph machine learning outputs, are used to augment prompts at query time. GraphRAG shows substantial improvement in answering the two classes of questions described above, demonstrating intelligence or mastery that outperforms other approaches previously applied to private datasets.
The GraphRAG Process 🤖
-GraphRAG builds upon our prior research and tooling using graph machine learning. The basic steps of the GraphRAG process are as follows:
+GraphRAG builds upon our prior research and tooling using graph machine learning. The basic steps of the GraphRAG process are as follows:
Index
- Slice up an input corpus into a series of TextUnits, which act as analyzable units for the rest of the process, and provide fine-grained references into our outputs.
diff --git a/posts/config/env_vars/index.html b/posts/config/env_vars/index.html
index 38b5b8f2..6f158629 100644
--- a/posts/config/env_vars/index.html
+++ b/posts/config/env_vars/index.html
@@ -301,8 +301,8 @@ a {
GRAPHRAG_API_KEY |
-Yes |
-The API key. (Note: `OPENAI_API_KEY is also used as a fallback) |
+Yes for OpenAI. Optional for AOAI |
+The API key. (Note: `OPENAI_API_KEY` is also used as a fallback). If not defined when using AOAI, managed identity will be used. |
str |
None |
@@ -366,7 +366,7 @@ a {
GRAPHRAG_LLM_API_KEY |
Yes (uses fallback) |
-The API key. |
+The API key. If not defined when using AOAI, managed identity will be used. |
str |
None |
@@ -514,7 +514,7 @@ a {
GRAPHRAG_EMBEDDING_API_KEY |
Yes (uses fallback) |
-The API key to use for the embedding client. |
+The API key to use for the embedding client. If not defined when using AOAI, managed identity will be used. |
str |
None |
@@ -646,6 +646,8 @@ a {
+Input Settings
+These settings control the data input used by the pipeline. Any settings with a fallback will use the base LLM settings, if available.
Plaintext Input Data (GRAPHRAG_INPUT_TYPE=text)
@@ -702,7 +704,7 @@ a {
GRAPHRAG_INPUT_TIMESTAMP_FORMAT |
-The timestamp format to use when parsing timestamps in the timestamp column |
+The timestamp format to use when parsing timestamps in the timestamp column. |
str |
optional |
None |
@@ -736,6 +738,13 @@ a {
file |
+GRAPHRAG_INPUT_STORAGE_ACCOUNT_BLOB_URL |
+The Azure Storage blob endpoint to use when in blob mode and using managed identity. Will have the format https://<storage_account_name>.blob.core.windows.net |
+str |
+optional |
+None |
+
+
GRAPHRAG_INPUT_CONNECTION_STRING |
The connection string to use when reading CSV input files from Azure Blob Storage. |
str |
@@ -933,6 +942,13 @@ a {
file |
+GRAPHRAG_STORAGE_STORAGE_ACCOUNT_BLOB_URL |
+The Azure Storage blob endpoint to use when in blob mode and using managed identity. Will have the format https://<storage_account_name>.blob.core.windows.net |
+str |
+optional |
+None |
+
+
GRAPHRAG_STORAGE_CONNECTION_STRING |
The Azure Storage connection string to use when in blob mode. |
str |
@@ -976,6 +992,13 @@ a {
file |
+GRAPHRAG_CACHE_STORAGE_ACCOUNT_BLOB_URL |
+The Azure Storage blob endpoint to use when in blob mode and using managed identity. Will have the format https://<storage_account_name>.blob.core.windows.net |
+str |
+optional |
+None |
+
+
GRAPHRAG_CACHE_CONNECTION_STRING |
The Azure Storage connection string to use when in blob mode. |
str |
@@ -1019,6 +1042,13 @@ a {
file |
+GRAPHRAG_REPORTING_STORAGE_ACCOUNT_BLOB_URL |
+The Azure Storage blob endpoint to use when in blob mode and using managed identity. Will have the format https://<storage_account_name>.blob.core.windows.net |
+str |
+optional |
+None |
+
+
GRAPHRAG_REPORTING_CONNECTION_STRING |
The Azure Storage connection string to use when in blob mode. |
str |
diff --git a/posts/config/json_yaml/index.html b/posts/config/json_yaml/index.html
index f27e3779..18908c40 100644
--- a/posts/config/json_yaml/index.html
+++ b/posts/config/json_yaml/index.html
@@ -299,6 +299,7 @@ API_KEY=some_api_key
connection_string str - (blob only) The Azure Storage connection string.
container_name str - (blob only) The Azure Storage container name.
base_dir str - The base directory to read input from, relative to the root.
+storage_account_blob_url str - (blob only) The Azure Storage blob endpoint URL to use with managed identity.
llm
This is the base LLM configuration section. Other steps may override this configuration with their own LLM configuration.
@@ -357,6 +358,7 @@ API_KEY=some_api_key
connection_string str - (blob only) The Azure Storage connection string.
container_name str - (blob only) The Azure Storage container name.
base_dir str - The base directory to write cache to, relative to the root.
+storage_account_blob_url str - (blob only) The Azure Storage blob endpoint URL to use with managed identity.
storage
Fields
@@ -365,6 +367,7 @@ API_KEY=some_api_key
connection_string str - (blob only) The Azure Storage connection string.
container_name str - (blob only) The Azure Storage container name.
base_dir str - The base directory to write reports to, relative to the root.
+storage_account_blob_url str - (blob only) The Azure Storage blob endpoint URL to use with managed identity.
reporting
Fields
@@ -373,6 +376,7 @@ API_KEY=some_api_key
connection_string str - (blob only) The Azure Storage connection string.
container_name str - (blob only) The Azure Storage container name.
base_dir str - The base directory to write reports to, relative to the root.
+storage_account_blob_url str - (blob only) The Azure Storage blob endpoint URL to use with managed identity.
entity_extraction
Fields
diff --git a/posts/developing/index.html b/posts/developing/index.html
index e9c32560..e855fd9f 100644
--- a/posts/developing/index.html
+++ b/posts/developing/index.html
@@ -281,7 +281,7 @@ a {
-| Python 3.10 or 3.11 |
+Python 3.10-3.12 |
Download |
The library is Python-based. |
diff --git a/posts/get_started/index.html b/posts/get_started/index.html
index 505e9768..0947af28 100644
--- a/posts/get_started/index.html
+++ b/posts/get_started/index.html
@@ -271,7 +271,7 @@ a {
Get Started
Requirements
-Python 3.10 or 3.11
+Python 3.10-3.12
To get started with the GraphRAG system, you have a few options:
👉 Install from pypi.
👉 Use it from source
@@ -338,7 +338,7 @@ It shows how to use the system to index some text, and then use the indexed data
export GRAPHRAG_API_BASE="https://<domain>.openai.azure.com" && \
export GRAPHRAG_API_VERSION="2024-02-15-preview" && \
-export GRAPHRAG_LLM_API_TYPE = "azure_openai_chat" && \
+export GRAPHRAG_LLM_TYPE="azure_openai_chat" && \
export GRAPHRAG_LLM_DEPLOYMENT_NAME="<chat_completions_deployment_name>" && \
export GRAPHRAG_EMBEDDING_API_TYPE="azure_openai_embedding" && \
export GRAPHRAG_EMBEDDING_DEPLOYMENT_NAME="<embeddings_deployment_name>"
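
The env_vars changes in this diff say that an API key is required for OpenAI but optional for AOAI, where managed identity is used when no key is defined. The following is a minimal Python sketch of that decision, not GraphRAG's actual implementation. The helper name `resolve_llm_auth`, the returned dictionary shape, and the exact lookup order of the fallback variables are assumptions made for illustration.

```python
import os


def resolve_llm_auth(env=None):
    """Hypothetical helper: pick an auth mode from GraphRAG-style env vars.

    Assumed lookup order (not confirmed by the docs):
    GRAPHRAG_LLM_API_KEY, then GRAPHRAG_API_KEY, then OPENAI_API_KEY.
    """
    env = os.environ if env is None else env
    llm_type = env.get("GRAPHRAG_LLM_TYPE", "")

    # Use the first API key that is defined and non-empty.
    for name in ("GRAPHRAG_LLM_API_KEY", "GRAPHRAG_API_KEY", "OPENAI_API_KEY"):
        api_key = env.get(name)
        if api_key:
            return {"mode": "api_key", "api_key": api_key}

    # No key: per the docs, AOAI falls back to managed identity.
    if llm_type.startswith("azure_openai"):
        return {"mode": "managed_identity", "api_key": None}

    # No key and not AOAI: the key is required.
    raise ValueError("An API key is required when not using Azure OpenAI")
```

Calling `resolve_llm_auth({"GRAPHRAG_LLM_TYPE": "azure_openai_chat"})` with no key set selects managed identity, while the same call without an Azure LLM type raises an error.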