mirror of
https://github.com/microsoft/graphrag.git
synced 2026-02-15 23:44:47 +08:00
* New workflow to generate embeddings in a single workflow
* New workflow to generate embeddings in a single workflow
* version change
* clean tests without any embeddings references
* clean tests without any embeddings references
* remove code
* feedback implemented
* changes in logic
* feedback implemented
* store in table bug fixed
* smoke test for generate_text_embeddings workflow
* smoke test fix
* add generate_text_embeddings to the list of transient workflows
* smoke tests
* fix
* ruff formatting updates
* fix
* smoke test fixed
* smoke test fixed
* fix lancedb import
* smoke test fix
* ignore sorting
* smoke test fixed
* smoke test fixed
* check smoke test
* smoke test fixed
* change config for vector store
* format fix
* vector store changes
* revert debug profile back to empty filepath
* merge conflict solved
* merge conflict solved
* format fixed
* format fixed
* fix return dataframe
* snapshot fix
* format fix
* embeddings param implemented
* validation fixes
* fix map
* fix map
* fix properties
* config updates
* smoke test fixed
* settings change
* Update collection config and rework back-compat
* Replace . with - for embedding store
---------
Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
Co-authored-by: Josh Bradley <joshbradley@microsoft.com>
Co-authored-by: Nathan Evans <github@talkswithnumbers.com>
30 lines
744 B
YAML
claim_extraction:
  enabled: true

embeddings:
  vector_store:
    type: "azure_ai_search"
    url: ${AZURE_AI_SEARCH_URL_ENDPOINT}
    api_key: ${AZURE_AI_SEARCH_API_KEY}
    container_name: "simple_text_ci"

community_reports:
  prompt: "prompts/community_report.txt"
  max_length: 2000
  max_input_length: 8000


storage:
  type: file # or blob
  base_dir: "output/${timestamp}/artifacts"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

reporting:
  type: file # or console, blob
  base_dir: "output/${timestamp}/reports"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

snapshots:
  embeddings: True
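
# Sketch only, not part of the original fixture: how the storage section might
# look with the "blob" option mentioned in its inline comment. The key names
# come from the commented-out lines in this file. The environment variable
# name and the container name below are hypothetical placeholders.
#
# storage:
#   type: blob
#   base_dir: "output/${timestamp}/artifacts"
#   connection_string: ${EXAMPLE_BLOB_CONNECTION_STRING} # hypothetical variable
#   container_name: "example-container" # hypothetical name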