fix: Knowledge Filter Does Not Restrict Document Retrieval Scope#1245

Closed
Wallgau wants to merge 6 commits into release-0.4.0 from fix-filter

Conversation

@Wallgau
Collaborator

@Wallgau Wallgau commented Mar 24, 2026

Issue: #1130 (see there for the previous behavior)

Now:
Screenshot 2026-03-24 at 5 02 48 PM

Summary
Applies chat knowledge filters to the Langflow OpenSearch component's raw_search path so its behavior matches search_documents (filter clauses, optional limit / score_threshold, top-level knn scoping). The merge logic lives entirely inside the component (and in the exported flow JSON). It imports nothing from OpenRAG src (e.g. no utils.opensearch_filter_merge), so the Langflow image does not need COPY src or PYTHONPATH for this feature.
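
For readers unfamiliar with the shape of such a merge, here is a minimal sketch of scoping an OpenSearch search body with filter clauses and optional limits. It is not the PR's code: the function names `merge_clauses_sketch` and `apply_limits_sketch` are hypothetical, and wrapping a top-level knn query in a bool query is an assumption about what "top-level knn scoping" means. Only the `size` and `min_score` request fields and the bool/filter query structure are standard OpenSearch.

```python
import copy
from typing import Any, Optional


def merge_clauses_sketch(
    body: dict[str, Any], clauses: list[dict[str, Any]]
) -> dict[str, Any]:
    """Return a copy of `body` whose query is restricted by `clauses`."""
    if not clauses:
        return body
    merged = copy.deepcopy(body)
    query = merged.get("query") or {"match_all": {}}
    if set(query) == {"bool"}:
        # Existing bool query: append to its filter list.
        existing = query["bool"].get("filter", [])
        if isinstance(existing, dict):
            existing = [existing]
        query["bool"]["filter"] = list(existing) + list(clauses)
    else:
        # Assumption: any other top-level query (including a bare knn)
        # is wrapped in a bool query so the filter scopes it.
        query = {"bool": {"must": [query], "filter": list(clauses)}}
    merged["query"] = query
    return merged


def apply_limits_sketch(
    body: dict[str, Any],
    limit: Optional[int] = None,
    score_threshold: Optional[float] = None,
) -> dict[str, Any]:
    """Copy `body`, setting `size` / `min_score` only when values are given."""
    out = dict(body)
    if limit is not None:
        out["size"] = limit
    if score_threshold:  # None or 0 means "no threshold" in this sketch
        out["min_score"] = score_threshold
    return out
```

The input body is left unmodified, which keeps the helper safe to call on a request that is reused elsewhere.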

Motivation
Custom Langflow components run in the Langflow process with imports resolved like normal Python. OpenRAG’s utils package is not part of the Langflow dependency set. Importing it implies putting OpenRAG src on the image and bloating the Langflow Dockerfile. Review feedback: keep helper logic in the component source until a proper upstream Langflow story exists.

What changed
flows/components/opensearch_multimodal.py: Inlined helpers (coerce_filter_clauses_from_filter_obj, merge_filter_clauses_into_search_body, apply_chat_filter_limits_to_body, apply_chat_filter_expression_to_search_body, etc.). raw_search now merges filter_expression into the request body and reports validation errors for bad JSON. _coerce_filter_clauses delegates to the shared module-level helper, which also handles the connector_types → connector_type rename and empty terms.
flows/components/opensearch_filter_merge_standalone.py: Stdlib-only copy of the same logic for unit tests and as the reference to keep in sync with the inline block.
tests/unit/test_opensearch_filter_merge.py: TDD coverage for coerce / merge / limits / end-to-end behavior (37 tests).
flows/ingestion_flow.json, openrag_agent.json, openrag_url_mcp.json, openrag_nudges.json: Embedded component source synced from opensearch_multimodal.py.
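
As an illustration of the coerce step named above, the following sketch turns a JSON filter expression into OpenSearch `terms` clauses. The function name `coerce_clauses_sketch`, the assumed filter shape (a JSON object mapping field names to lists of values), and the choice to skip empty lists and non-list values are all assumptions; the PR only states that connector_types is mapped to connector_type, that empty terms are handled, and that bad JSON yields a validation error.

```python
import json
from typing import Any

# The rename mentioned in the PR description; other keys pass through as-is.
FIELD_ALIASES = {"connector_types": "connector_type"}


def coerce_clauses_sketch(filter_expression: str) -> list[dict[str, Any]]:
    """Parse a JSON filter object into a list of `terms` clauses."""
    if not filter_expression or not filter_expression.strip():
        return []
    try:
        obj = json.loads(filter_expression)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Invalid filter JSON: {exc}") from exc
    if not isinstance(obj, dict):
        raise ValueError("Filter expression must be a JSON object")

    clauses: list[dict[str, Any]] = []
    for key, values in obj.items():
        # Assumption: only non-empty lists become clauses; scalars such as
        # a limit or score threshold are handled elsewhere.
        if not isinstance(values, list) or not values:
            continue
        field = FIELD_ALIASES.get(key, key)
        clauses.append({"terms": {field: values}})
    return clauses
```

Raising a `ValueError` on malformed input lets the caller surface a clear message instead of silently running an unscoped search.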

Olfa Maslah and others added 6 commits March 20, 2026 10:57
Scope provider credentials and MCP global vars to the selected embedding provider, prune stale provider headers, and retry URL ingestion once after targeted stale-state reconciliation errors.

Made-with: Cursor
Include all configured provider credentials in Langflow global vars and add tests for single-provider and multi-provider ingestion scenarios to prevent regressions when documents are embedded with different models.

Made-with: Cursor
This reverts commit 06ea496, reversing
changes made to 9a19e64.
@github-actions github-actions Bot added backend 🔷 Issues related to backend services (OpenSearch, Langflow, APIs) docker tests bug 🔴 Something isn't working. labels Mar 24, 2026
@Wallgau Wallgau changed the title fix: filter scope chat fix: Knowledge Filter Does Not Restrict Document Retrieval Scope Mar 24, 2026
@github-actions github-actions Bot added bug 🔴 Something isn't working. and removed bug 🔴 Something isn't working. labels Mar 24, 2026

Used by the Langflow OpenSearch ``raw_search`` path so document scope matches
Collaborator

@Wallgau May I know why we need this file if the changes are already merged into the component file?

Collaborator Author

It comes from reverting changes from the last PR and is used for the unit tests, but we can remove it since you mentioned having those changes in another PR.

@@ -0,0 +1,150 @@
from types import SimpleNamespace
from unittest.mock import AsyncMock, MagicMock, patch
Collaborator

Let's remove these unit tests.
I am not sure they are required; if possible, let's add a new integration test in a follow-up PR.

)

@staticmethod
def _should_reconcile_url_ingestion_error(error_text: str, selected_provider: str) -> bool:
Collaborator

Could you explain the use of these functions?

Collaborator Author

_reconcile_url_ingestion_runtime_state(selected_provider)

When should_reconcile… is true, this runs once before one retry of the same URL ingest request. It tries to reset and realign Langflow with OpenRAG settings:

reset_langflow_flow("url_ingest") (if flows_service supports it) — reload/reset the URL ingest flow so stale graph state is cleared.
change_langflow_model_value(...) — push the current selected_provider, embedding_model, and force_embedding_update=True into the url_ingest flow so globals match the backend config (e.g. stop calling Ollama when the user chose OpenAI).
Both steps are best-effort (exceptions are logged, not fatal); then the code refreshes the flow id and re-POSTs the run. If the retry still fails, _raise_retry_provider_error turns the body into a clearer, provider-oriented error where applicable.
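
The control flow described above (reconcile once, best-effort, then a single retry) can be sketched generically. This is an illustration only: `run_with_one_reconcile` and its parameters are hypothetical names, and the callables stand in for the real ingest request, the should-reconcile check, and the reset/realign steps, none of which are reproduced here.

```python
import asyncio
import logging
from typing import Awaitable, Callable

logger = logging.getLogger(__name__)


async def run_with_one_reconcile(
    run: Callable[[], Awaitable[str]],
    should_reconcile: Callable[[Exception], bool],
    reconcile_steps: list[Callable[[], Awaitable[None]]],
) -> str:
    """Run once; on a reconcilable error, run the steps and retry once."""
    try:
        return await run()
    except Exception as first_error:
        if not should_reconcile(first_error):
            raise  # unrelated failure: no reconcile, no retry

    for step in reconcile_steps:
        try:
            await step()
        except Exception:
            # Best-effort: a failed step is logged, not fatal.
            logger.exception("reconcile step failed; continuing")

    # Single retry; a second failure propagates to the caller.
    return await run()


if __name__ == "__main__":
    async def always_ok() -> str:
        return "ok"

    print(asyncio.run(run_with_one_reconcile(always_ok, lambda e: False, [])))
```

Capping the retry at one attempt keeps a persistent misconfiguration from looping, which matches the "runs once before one retry" behavior described in the comment.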


# Remove stale provider headers not present in the desired set.
# Keep non-provider headers (JWT/OWNER/etc.) untouched.
pruned_args: List[str] = []
Collaborator

Can't we just check whether the keys are present, and not pass them if the values are None or empty?

@edwinjosechittilappilly
Collaborator

@Wallgau May I know if the changes in the Langflow MCP service and headers are required for the filter fix?

@Wallgau
Collaborator Author

Wallgau commented Mar 25, 2026

fixed in another PR

@Wallgau Wallgau closed this Mar 25, 2026

Labels

backend 🔷 Issues related to backend services (OpenSearch, Langflow, APIs) bug 🔴 Something isn't working. docker tests

2 participants