Adding optional dataset_name arg to index() method #138

Merged
QuentinJGMace merged 3 commits into illuin-tech:main from gabrielspmoreira:index_dataset_name
Mar 25, 2026
Conversation

Contributor

@gabrielspmoreira gabrielspmoreira commented Mar 17, 2026

This PR adds an optional dataset_name argument to the index() method of retrieval pipelines.
The dataset_name is useful for implementing features such as embedding caching. For example, when evaluating rerankers or agentic retrieval pipelines (like this AgenticRetrievalPipeline (doc)), a pipeline can reuse cached embeddings instead of re-embedding the whole corpus in index() for every evaluation.
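
As a rough illustration of the caching idea, here is a minimal sketch of a pipeline whose index() uses dataset_name as a cache key. Everything in it is an assumption made for illustration: the class name CachingRetrievalPipeline, the embed_fn callable, the corpus type, and the in-memory dictionary are hypothetical and do not reflect this repository's actual pipeline interface.

```python
from typing import Callable, Dict, List, Optional

Embedding = List[float]


class CachingRetrievalPipeline:
    """Hypothetical pipeline for illustration; not the repository's actual class."""

    def __init__(self, embed_fn: Callable[[List[str]], List[Embedding]]) -> None:
        self.embed_fn = embed_fn  # assumed: maps a list of texts to embeddings
        self._cache: Dict[str, List[Embedding]] = {}
        self.corpus_embeddings: List[Embedding] = []

    def index(self, corpus: List[str], dataset_name: Optional[str] = None) -> None:
        # Without a dataset_name there is no stable cache key, so always embed.
        if dataset_name is None:
            self.corpus_embeddings = self.embed_fn(corpus)
            return
        # Embed only the first time a given dataset_name is seen.
        if dataset_name not in self._cache:
            self._cache[dataset_name] = self.embed_fn(corpus)
        self.corpus_embeddings = self._cache[dataset_name]


calls: List[int] = []  # records how many times the embedder actually ran


def fake_embed(texts: List[str]) -> List[Embedding]:
    # Stand-in embedder: one-dimensional "embedding" equal to the text length.
    calls.append(len(texts))
    return [[float(len(t))] for t in texts]


pipeline = CachingRetrievalPipeline(fake_embed)
corpus = ["doc one", "doc two"]
pipeline.index(corpus, dataset_name="demo")
pipeline.index(corpus, dataset_name="demo")  # second call is served from the cache
```

A real cache would likely persist embeddings to disk and include the embedding model's identity in the key, so that switching models does not return stale vectors.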

@gabrielspmoreira gabrielspmoreira marked this pull request as draft March 17, 2026 23:50
@gabrielspmoreira gabrielspmoreira changed the title Draft: Adding optional dataset_name arg to index() method Adding optional dataset_name arg to index() method Mar 17, 2026
@gabrielspmoreira gabrielspmoreira marked this pull request as ready for review March 19, 2026 22:52
Collaborator

@QuentinJGMace QuentinJGMace left a comment

LGTM

@QuentinJGMace QuentinJGMace merged commit a70f23a into illuin-tech:main Mar 25, 2026
5 checks passed