Commit Graph

39 Commits

Author SHA1 Message Date
Nathan Evans
ac8a7f5eef
Housekeeping (#2086)
Some checks failed
gh-pages / build (push) Has been cancelled
Python CI / python-ci (ubuntu-latest, 3.10) (push) Has been cancelled
Python CI / python-ci (ubuntu-latest, 3.11) (push) Has been cancelled
Python CI / python-ci (windows-latest, 3.10) (push) Has been cancelled
Python CI / python-ci (windows-latest, 3.11) (push) Has been cancelled
Python Integration Tests / python-ci (ubuntu-latest, 3.10) (push) Has been cancelled
Python Integration Tests / python-ci (windows-latest, 3.10) (push) Has been cancelled
Python Notebook Tests / python-ci (ubuntu-latest, 3.10) (push) Has been cancelled
Python Notebook Tests / python-ci (windows-latest, 3.10) (push) Has been cancelled
Python Publish (pypi) / Upload release to PyPI (push) Has been cancelled
Python Smoke Tests / python-ci (ubuntu-latest, 3.10) (push) Has been cancelled
Python Smoke Tests / python-ci (windows-latest, 3.10) (push) Has been cancelled
Spellcheck / spellcheck (push) Has been cancelled
* Add deprecation warnings for fnllm and multi-search

* Fix dangling token_encoder refs

* Fix local_search notebook

* Fix global search dynamic notebook

* Fix global search notebook

* Fix drift notebook

* Switch example notebooks to use LiteLLM config

* Properly annotate dev deps as a group

* Semver

* Remove --extra dev

* Remove llm_model variable

* Ignore ruff ASYNC240

* Add note about expected broken notebook in docs

* Fix custom vector store notebook

* Push tokenizer throughout
2025-10-07 16:21:24 -07:00
Nathan Evans
2bd3922d8d
Litellm auth fix (#2083)
* Fix scope for Azure auth with LiteLLM

* Change internal language on max_attempts to max_retries

* Rework model config connectivity validation

* Semver

* Swtich smoke tests to LiteLLM

* Take out temporary retry_strategy = none since it is not fnllm compatible

* Bump smoke test timeout

* Bump smoke timeout further

* Tune smoke params

* Update smoke test bounds

* Remove covariates from min-csv smoke

* Smoke: adjust communities, remove drift

* Remove secrets where they aren't necessary

* Clean out old env var references
2025-10-06 10:54:21 -07:00
Copilot
7c28c70d5c
Switch from Poetry to uv for package management (#2008)
Some checks are pending
gh-pages / build (push) Waiting to run
Python CI / python-ci (ubuntu-latest, 3.10) (push) Waiting to run
Python CI / python-ci (ubuntu-latest, 3.11) (push) Waiting to run
Python CI / python-ci (windows-latest, 3.10) (push) Waiting to run
Python CI / python-ci (windows-latest, 3.11) (push) Waiting to run
Python Integration Tests / python-ci (ubuntu-latest, 3.10) (push) Waiting to run
Python Integration Tests / python-ci (windows-latest, 3.10) (push) Waiting to run
Python Notebook Tests / python-ci (ubuntu-latest, 3.10) (push) Waiting to run
Python Notebook Tests / python-ci (windows-latest, 3.10) (push) Waiting to run
Python Publish (pypi) / Upload release to PyPI (push) Waiting to run
Python Smoke Tests / python-ci (ubuntu-latest, 3.10) (push) Waiting to run
Python Smoke Tests / python-ci (windows-latest, 3.10) (push) Waiting to run
Spellcheck / spellcheck (push) Waiting to run
* Initial plan

* Switch from Poetry to uv for package management

Co-authored-by: jgbradley1 <654554+jgbradley1@users.noreply.github.com>

* Clean up build artifacts and update gitignore

Co-authored-by: jgbradley1 <654554+jgbradley1@users.noreply.github.com>

* remove build artifacts

* remove hardcoded version string

* fix calls to pip in cicd

* Update gh-pages.yml workflow to use uv instead of Poetry

Co-authored-by: jgbradley1 <654554+jgbradley1@users.noreply.github.com>

* ruff formatting fixes

* update cicd workflow with latest uv action

* fix command to retrieve package version

* update development instructions

* remove Poetry references

* Replace deprecated azuright action with npm-based Azurite installation

Co-authored-by: jgbradley1 <654554+jgbradley1@users.noreply.github.com>

* skip api version check for azurite

* add semversioner file

* update more changes from switching to UV

* Migrate unified-search-app from Poetry to uv package management

Co-authored-by: jgbradley1 <654554+jgbradley1@users.noreply.github.com>

* minor typo update

* minor Dockerfile update

* update cicd thresholds

* update pytest thresholds

* ruff fixes

* ruff fixes

* remove legacy npm settings that no longer apply

* Update Unified Search App Readme

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: jgbradley1 <654554+jgbradley1@users.noreply.github.com>
Co-authored-by: Josh Bradley <joshbradley@microsoft.com>
Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2025-08-13 18:57:25 -06:00
Nathan Evans
813b4de99f
Fix API key reference for gh-pages (#1821) 2025-03-18 11:10:11 -07:00
Nathan Evans
e40476153d
Speed up smoke tests (#1736)
* Move verb tests to regular CI

* Clean up env vars

* Update smoke runtime expectations

* Rework artifact assertions

* Fix plural in name

* remove redundant artifact len check

* Remove redundant artifact len check

* Adjust graph output expectations

* Update community expectations

* Include all workflow output

* Adjust text unit expectations

* Adjust assertions per dataset

* Fix test config param name

* Update nan allowed for optional model fields

---------

Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2025-02-25 13:24:35 -08:00
Nathan Evans
c02ab0984a
Streamline workflows (#1674)
* Remove create_final_nodes

* Rename final entity output to "entities"

* Remove duplicate code from graph extraction

* Rename create_final_relationships output to "relationships"

* Rename create_final_communities output to "communities"

* Combine compute_communities and create_final_communities

* Rename create_final_covariates output to "covariates"

* Rename create_final_community_reports output to "community_reports"

* Rename create_final_text_units output to "text_units"

* Rename create_final_documents output to "documents"

* Remove transient snapshots config

* Move create_final_entities to finalize_entities operation

* Move create_final_relationships flow to finalize_relationships operation

* Reuse some community report functions

* Collapse most of graph and text unit-based report generation

* Unify schemas files

* Move community reports extractor

* Move NLP report prompt to prompts folder

* Fix a few pandas warnings

* Rename embeddings config to embed_text

* Rename claim_extraction config to extract_claims

* Remove nltk from standard graph extraction

* Fix verb tests

* Fix extract graph config naming

* Fix moved file reference

* Create v1-to-v2 migration notebook

* Semver

* Fix smoke test artifact count

* Raise tpm/rpm on smoke tests

* Update drift settings for smoke tests

* Reuse project directory var in api notebook

* Format

* Format
2025-02-07 11:11:03 -08:00
KennyZhang1
8368b12532
Add Cosmos DB storage/cache option (#1431)
* added cosmosdb constructor and database methods

* added rest of abstract method headers

* added cosmos db container methods

* implemented has and delete methods

* finished implementing abstract class methods

* integrated class into storage factory

* integrated cosmosdb class into cache factory

* added support for new config file fields

* replaced primary key cosmosdb initialization with connection strings

* modified cosmosdb setter to require json

* Fix non-default emitters

* Format

* Ruff

* ruff

* first successful run of cosmosdb indexing

* removed extraneous container_name setting

* require base_dir to be typed as str

* reverted merged changed from closed branch

* removed nested try statement

* readded initial non-parquet emitter fix

* added basic support for parquet emitter using internal conversions

* merged with main and resolved conflicts

* fixed more merge conflicts

* added cosmosdb functionality to query pipeline

* tested query for cosmosdb

* collapsed cosmosdb schema to use minimal containers and databases

* simplified create_database and create_container functions

* ruff fixes and semversioner

* spellcheck and ci fixes

* updated pyproject toml and lock file

* apply fixes after merge from main

* add temporary comments

* refactor cache factory

* refactored storage factory

* minor formatting

* update dictionary

* fix spellcheck typo

* fix default value

* fix pydantic model defaults

* update pydantic models

* fix init_content

* cleanup how factory passes parameters to file storage

* remove unnecessary output file type

* update pydantic model

* cleanup code

* implemented clear method

* fix merge from main

* add test stub for cosmosdb

* regenerate lock file

* modified set method to collapse parquet rows

* modified get method to collapse parquet rows

* updated has and delete methods and docstrings to adhere to new schema

* added prefix helper function

* replaced delimiter for prefixed id

* verified empty tests are passing

* fix merges from main

* add find test

* update cicd step name

* tested querying for new schema

* resolved errors from merge conflicts

* refactored set method to handle cache in new schema

* refactored get method to handle cache in new schema

* force unique ids to be written to cosmos for nodes

* found bug with has and delete methods

* modified has and delete to work with cache in new schema

* fix the merge from main

* minor typo fixes

* update lock file

* spellcheck fix

* fix init function signature

* minor formatting updates

* remove https protocol

* change localhost to 127.0.0.1 address

* update pytest to use bacj engine

* verified cache tests

* improved speed of has function

* resolved pytest error with find function

* added test for child method

* make container_name variable private as _container_name

* minor variable name fix

* cleanup cosmos pytest and make the cosmosdb storage class operations more efficient

* update cicd to use different cosmosdb emulator

* test with http protocol

* added pytest for clear()

* add longer timeout for cosmosdb emulator startup

* revert http connection back to https

* add comments to cicd code for future dev usage

* set to container and database clients to none upon deletion

* ruff changes

* add comments to cicd code

* removed unneeded None statements and ruff fixes

* more ruff fixes

* Update test_run.py

* remove unnecessary call to delete container

* ruff format updates

* Reverted test_run.py

* fix ruff formatter errors

* cleanup variable names to be more consistent

* remove extra semversioner file

* revert pydantic model changes

* revert pydantic model change

* revert pydantic model change

* re-enable inline formatting rule

* update documentation in dev guide

---------

Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
Co-authored-by: Josh Bradley <joshbradley@microsoft.com>
2024-12-19 13:43:21 -06:00
Josh Bradley
0394b55086
Update CI/CD - skip running unit tests on documentation-only PRs (#1371) 2024-11-06 14:19:21 -05:00
Josh Bradley
3df6f8c65b
Allow ci/cd to skip draft PRs (#1314) 2024-10-23 12:46:00 -04:00
Andres Morales
fc9895f793
Replace current docs by mkdocs (#1263)
* Replace docs by mkdocs-material

* Fix markdown

* Fix verions in gh-pages workflow

* remove whitespaces

* add semver

* Add build docs check on python-ci

* Fix command in index cli

* Spellcheck

* Spellcheck

* remove docsite paths

* clear outputs from notebook

* remove dependabot npm for docsite

* remove more docsite left overs

* execute notebooks

* Update notebooks

* update poetry lock

* Remove notebook build from ci

* Revert dep update

* Navigation tabs

* Fix stylesheet

* add kwds to dictionary

* Turn on notebook execution

* Update gitignore

* Add MSR Blog posts

* spellcheck

* Accessibility Changes

---------

Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-10-11 13:39:03 -06:00
Alonso Guevara
0d348d6070
Remove unused cols from final entities (#1226)
* Remove unused cols from final entities

* Move verbs test to integ

* Move verbs test to integ

* Move to smoke tests
2024-09-27 17:10:52 -06:00
dependabot[bot]
b61c4ec737
Bump JamesIves/github-pages-deploy-action from 4.6.3 to 4.6.4 (#1104)
Bumps [JamesIves/github-pages-deploy-action](https://github.com/jamesives/github-pages-deploy-action) from 4.6.3 to 4.6.4.
- [Release notes](https://github.com/jamesives/github-pages-deploy-action/releases)
- [Commits](https://github.com/jamesives/github-pages-deploy-action/compare/v4.6.3...v4.6.4)

---
updated-dependencies:
- dependency-name: JamesIves/github-pages-deploy-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-09-19 18:07:44 -06:00
Alonso Guevara
044516f538
Clean and organize run index code (#1090)
* Create entypoint for cli and api (#1067)

* Add cli and api entrypoints for update index

* Semver

* Update docs

* Run tests on feature branch main

* Better /main handling in tests

* Clean and organize run index code

* Ruff fix

* Pyright fix

* Format fixes

* Pyright fix

* Format

* Fix integ tests

* Fix ruff

* Reorganize and clean up
2024-09-05 08:15:10 -06:00
Nathan Evans
f5b4d2fea5
Ci streamline (#988)
* Remove excess vars from gh-pages build

* Delete redundant javascript ci

* Pull apart testing CI

* Clean up integration tests build

* Move storage tests to integration CI

* Take py 3.10 out of smoke tests matrix

* Use minimum supported python version for most tests

* Re-run main CI on any test change

* Add Josh and Kenny to author list

* Update auto-resolve perms
2024-08-21 15:16:15 -06:00
Nathan Evans
98cabba38b
Notebook tests (#978)
* Fix notebook test runs

* Delete old issue template

* Add notebook CI action

* Print temp directories

* Print more env

* Move printing up

* Use runner_temp

* Try using current directory

* Try TMP env

* Re-write TMP

* Wrong yml

* Fix echo

* Only export if windows

* More logging

* Move export

* Reformat env write

* Fix braces

* Switch to in-memory execution

* Downgrade action perms

* Unused import
2024-08-20 17:19:37 -06:00
Derek Worthen
6b4de3d841
Index API (#953)
* Initial Index API

- Implement main API entry point: build_index
- Rely on GraphRagConfig instead of PipelineConfig
    - This unifies the API signature with the
    promt_tune and query API entry points
- Derive cache settings, config, and resuming from
    the config and other arguments to
    simplify/reduce arguments to build_index
- Add preflight config file validations
- Add semver change

* fix smoke tests

* fix smoke tests

* Use asyncio

* Add e2e artifacts in GH actions

* Remove unnecessary E2E test, and add skip_validations flag to cli

* Nicer imports

* Reorganize API functions.

* Add license headers and module docstrings

* Fix ignored ruff rule

---------

Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-08-20 15:42:20 -06:00
Alonso Guevara
e4daf358b9
Fix gh-pages publishing (#976)
* Remove indexer run from gh-pages, and use a local zip to avoid running

* Semver
2024-08-19 16:30:55 -06:00
Nathan Evans
4040f02508
Update general_issue.yml (#956)
Copy checklist from bug/feature to general
2024-08-16 13:26:24 -07:00
Nathan Evans
bd5be7bb1a
Update issues-autoresolve.yml (#955)
Add write permissions for actions so it can update the cache
2024-08-16 13:17:23 -07:00
Alonso Guevara
d68e323193
Disable fail fast on tests (#911) 2024-08-13 12:20:14 -06:00
Alonso Guevara
c451aa0093
Update smoke tests (#861)
* Run smoke tests on 4o

* Shorten dulce for smoke tests

* Update secrets for consistency
2024-08-08 13:07:44 -06:00
Dayenne Souza
1e10bd342e
Re-enable smoke tests (#848)
* add smoke tests again

* add smoke tests separated action

* add patch version

* disable blob test

* blob conn again

* add file as cache type

* remove cache type enterely

* increase timeout

* remove comment

---------

Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-08-07 12:23:46 -06:00
Nathan Evans
4c181a5390
Update issues-autoresolve.yml (#780)
Switch the logic to only look for awaiting_response label
2024-07-30 11:35:55 -06:00
dependabot[bot]
da5ed4baf4
Bump actions/stale from 5 to 9 (#759)
Bumps [actions/stale](https://github.com/actions/stale) from 5 to 9.
- [Release notes](https://github.com/actions/stale/releases)
- [Changelog](https://github.com/actions/stale/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/stale/compare/v5...v9)

---
updated-dependencies:
- dependency-name: actions/stale
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-29 14:34:13 -06:00
Chris Trevino
9d99f323ea
Add encoding model to entity/claim extraction config sections (#740)
* Add encoding-model configuration to entity & claim extraction

* add change note

* pr updates

* test fix

* disable GH-based smoke tests
2024-07-26 15:05:08 -07:00
Nathan Evans
971e7d91f5
Update issue templates for more explicit guidance (#738) 2024-07-26 14:52:14 -04:00
Alonso Guevara
61b5eea347
Fix version numbering on publication (#701) 2024-07-24 22:02:52 -06:00
Alonso Guevara
ac6f240e29
Add autoresolve and update publish workflows (#698) 2024-07-24 19:54:11 -06:00
Alonso Guevara
2a95e6771c
Update issue templates (#675)
* Update issue templates

* Quick fix

* Quick fix

* Template fix

* Remove required
2024-07-24 14:01:44 -06:00
Josh Bradley
2ddee65c29
Read/write files as binary utf-8 (#639) 2024-07-24 13:28:22 -04:00
dependabot[bot]
5f34d2d153
Bump JamesIves/github-pages-deploy-action from 4.6.1 to 4.6.3 (#425)
Bumps [JamesIves/github-pages-deploy-action](https://github.com/jamesives/github-pages-deploy-action) from 4.6.1 to 4.6.3.
- [Release notes](https://github.com/jamesives/github-pages-deploy-action/releases)
- [Commits](https://github.com/jamesives/github-pages-deploy-action/compare/v4.6.1...v4.6.3)

---
updated-dependencies:
- dependency-name: JamesIves/github-pages-deploy-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-09 11:55:17 -06:00
Alonso Guevara
dd113422d4
Add issue templates (#448) 2024-07-08 17:34:18 -06:00
Alonso Guevara
a81efe58c7
Fix docsite base url (#319)
* Fix docsite base url

* Semversioner
2024-07-01 23:40:54 +00:00
Alonso Guevara
c556fbb795
Fix versioning (#318) 2024-07-01 16:55:22 -06:00
Alonso Guevara
7f88f1eb58
Fix publishing (#317) 2024-07-01 16:41:11 -06:00
Alonso Guevara
cc492327e6
Fix publishing (#316) 2024-07-01 16:35:42 -06:00
Alonso Guevara
1aaa8c49fb
Fix publishing (#315) 2024-07-01 16:28:32 -06:00
Alonso Guevara
7879bfb795
Fix publishing (#314)
* Fix publishing

* Cleanup
2024-07-01 16:20:35 -06:00
Alonso Guevara
81b81cf60b Initial Release 2024-07-01 15:25:30 -06:00