[BUG] RagTool not working as documented #3918

@wdobbels

Description

I've encountered three separate issues with RagTool. I'm grouping them together because they're related, but let me know if you'd rather have them split up. I'm using crewai 1.4.1 (with extras ["google-genai", "tools"]) and the following setup:

from crewai_tools import RagTool

rag_config = {
    "embedding_model": {
        "provider": "google-generativeai",
        "config": {
            "model": "textembedding-gecko",
            "project_id": "my-project-id",
            "location": "my-location",
        },
    },
}
rag_tool = RagTool(config=rag_config)

1. Still need to install qdrant_client even when using chromadb

The above code fails when qdrant_client is not installed, even though it is not used.
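One way to avoid this kind of failure is to import each backend's client only when that backend is selected. The following is a minimal sketch of that pattern, not the actual crewai_tools code: the function name `load_provider_module` and the provider-to-module mapping are assumptions made for illustration.

```python
import importlib

# Hypothetical mapping from provider name to the module that backs it.
# The real crewai_tools layout may differ.
_PROVIDER_MODULES = {
    "chromadb": "chromadb",
    "qdrant": "qdrant_client",
}


def load_provider_module(provider, modules=_PROVIDER_MODULES):
    """Import the backing module for `provider` only when it is requested."""
    try:
        module_name = modules[provider]
    except KeyError:
        raise ValueError(f"Unknown vector store provider: {provider!r}") from None
    try:
        # The import happens here, not at package import time, so an
        # unused backend never has to be installed.
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"Provider {provider!r} requires the optional package "
            f"{module_name!r}; install it to use this provider."
        ) from exc
```

With this structure, selecting "chromadb" never touches qdrant_client, and a missing optional package produces an error that names the package to install.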

2. The documentation on adding new files is completely wrong

This page suggests adding a file as follows:

rag_tool.add(data_type="file", path="path/to/your/document.pdf")

However, after some experimentation it turns out that this call silently does nothing, because the positional "args" are empty. The correct way to add this file is:

from crewai_tools.rag.data_types import DataType
rag_tool.add("path/to/your/document.pdf", data_type=DataType.PDF_FILE)

So in other words:

  • The path must be passed as a positional argument, not as a keyword argument
  • The data_type is an enum, not a string
  • The type must be given explicitly as DataType.PDF_FILE, because the enum has no generic "file" option
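The three points above could be enforced with a small guard that fails loudly instead of silently adding nothing. This is only a sketch: `checked_add` is a hypothetical helper, and the `DataType` enum below is a one-member stand-in (with an assumed value) for `crewai_tools.rag.data_types.DataType`, so the example runs without crewai installed.

```python
from enum import Enum


class DataType(str, Enum):
    # Stand-in for crewai_tools.rag.data_types.DataType.
    # Only one member is shown and its value is an assumption.
    PDF_FILE = "pdf_file"


def checked_add(add_fn, *args, data_type=None, **kwargs):
    """Call `add_fn` only if the source is positional and data_type is valid."""
    if not args:
        # A keyword-only call such as add(data_type=..., path=...) would
        # otherwise be accepted and add nothing.
        raise TypeError(
            "add() needs the source as a positional argument, "
            "for example add('doc.pdf', data_type=DataType.PDF_FILE)"
        )
    if data_type is not None and not isinstance(data_type, DataType):
        try:
            # Accept a plain string only if it matches an enum value.
            data_type = DataType(data_type)
        except ValueError:
            valid = ", ".join(member.name for member in DataType)
            raise ValueError(
                f"Unknown data_type {data_type!r}; expected one of: {valid}"
            ) from None
    return add_fn(*args, data_type=data_type, **kwargs)
```

In real use, the stand-in enum would be replaced by the imported one and `add_fn` would be `rag_tool.add`.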

3. The similarity_threshold and limit kwargs should be optional, but aren't

The tool provides defaults for these values (similarity_threshold = 0.6, limit = 5). However, when setting up the pydantic class to validate the tool input, these defaults are completely ignored and the LLM is required to pass the arguments.

As such, the first LLM call will always fail with a ValidationError, unless you explicitly instruct the LLM to pass the parameters.
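For illustration, here is a minimal sketch of how an input schema could be derived so that parameters with defaults stay optional. It uses pydantic's `create_model`, where a field given as `(type, default)` is optional and `(type, ...)` is required. The function `query_knowledge` and the helper `schema_from_signature` are hypothetical stand-ins, not the actual crewai code; only the default values 0.6 and 5 come from the description above.

```python
import inspect

from pydantic import ValidationError, create_model


def query_knowledge(query: str, similarity_threshold: float = 0.6, limit: int = 5) -> str:
    # Hypothetical stand-in for the tool's run method.
    return f"{query}|{similarity_threshold}|{limit}"


def schema_from_signature(fn, name="ToolInput"):
    """Build a pydantic model whose fields mirror the parameters of `fn`."""
    fields = {}
    for param in inspect.signature(fn).parameters.values():
        # Ellipsis marks a field as required; any other value is its default.
        default = ... if param.default is inspect.Parameter.empty else param.default
        fields[param.name] = (param.annotation, default)
    return create_model(name, **fields)


ToolInput = schema_from_signature(query_knowledge)

# Only `query` has to be supplied; the other fields fall back to defaults.
parsed = ToolInput(query="what is RAG?")

try:
    ToolInput()  # `query` is missing, so validation fails
except ValidationError:
    missing_query_rejected = True
else:
    missing_query_rejected = False
```

With a schema built this way, an LLM call that passes only `query` validates on the first attempt.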

Steps to Reproduce

Use the RagTool as documented here

Expected behavior

See description

Screenshots/Code snippets

See description

Operating System

macOS Sonoma

Python Version

3.10

crewAI Version

1.4.1

crewAI Tools Version

1.4.1 (extras tools)

Virtual Environment

Poetry

Possible Solution

  1. Add qdrant_client as an explicit dependency, or make sure it is only imported when qdrant is selected
  2. Update the documentation to show how "rag_tool.add" should actually be called
  3. Change how the pydantic input validation class is created so that it picks up the tool's defaults

Additional context

See description

Labels

bug (Something isn't working)
