Skip to content

feat: add Qdrant Edge storage backend for memory system#5076

Merged
greysonlalonde merged 11 commits into
mainfrom
gl/feat/qdrant-edge-storage-backend
Mar 25, 2026
Merged

feat: add Qdrant Edge storage backend for memory system#5076
greysonlalonde merged 11 commits into
mainfrom
gl/feat/qdrant-edge-storage-backend

Conversation

@greysonlalonde
Copy link
Copy Markdown
Contributor

@greysonlalonde greysonlalonde commented Mar 25, 2026

Summary

  • Add QdrantEdgeStorage as a pluggable StorageBackend implementation backed by Qdrant Edge (qdrant-edge-py)
  • Uses a write-local/sync-central pattern for safe multi-process access — each worker writes to a PID-keyed local shard, reads merge both shards, and local records flush to central on close
  • Support storage="qdrant-edge" string in Memory for easy opt-in
  • Add qdrant-edge-py as an optional dependency under [qdrant-edge]
  • 19 tests covering CRUD, filtering, dual-shard search, sync lifecycle, orphaned shard cleanup, and Memory integration

Note

Medium Risk
Introduces a new persistence backend and changes Memory.close() behavior to invoke backend close(), which can affect durability and shutdown semantics across processes.

Overview
Adds an optional QdrantEdgeStorage backend (via qdrant-edge-py) for unified memory, using a write-local/sync-central shard layout to support multi-process writes and merged reads, with orphaned worker shard recovery.

Memory now accepts storage="qdrant-edge" and calls storage.close() on shutdown to flush pending local data; the PR also introduces a new qdrant-edge optional dependency and comprehensive tests covering CRUD, filtering, lifecycle flush, dual-shard reads, and orphan cleanup.

Written by Cursor Bugbot for commit fe4a2cc. This will update automatically on new commits. Configure here.

Copy link
Copy Markdown
Contributor

@iris-clawd iris-clawd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: Qdrant Edge Storage Backend

Well-structured PR with good test coverage (19 tests covering CRUD, filtering, dual-shard behavior, sync lifecycle, orphaned shard cleanup, and Memory integration). The write-local/sync-central pattern is a solid approach for multi-process safety.

Observations & Suggestions

1. Cross-process lock not implemented (medium concern)
The module docstring and class docstring both state that local records flush to central "under a cross-process lock," but no locking mechanism exists in flush_to_central() or close(). If two worker processes flush simultaneously, upserts to the central shard could race. Consider adding a file lock (e.g., fcntl.flock or filelock library) around the central shard write path, or update the documentation to clarify the actual concurrency guarantees.

2. datetime.utcnow() is deprecated (minor)
datetime.utcnow() is deprecated as of Python 3.12. Used in _payload_to_record (fallback) and touch_records. Consider replacing with datetime.now(datetime.timezone.utc).

3. Inefficient count() when local shard has data
When _local_has_data is True, count() falls back to len(self.list_records(limit=50_000)), which scrolls and deserializes all records. Could instead use CountRequest on both shards and deduplicate by record_id, or at minimum scroll with with_payload=False.

4. Async methods are synchronous wrappers
asave, asearch, and adelete call their sync counterparts directly, which will block the event loop. Consider wrapping with asyncio.to_thread() — though this may be consistent with other storage backends in the codebase.

5. PID recycling in orphan cleanup
_cleanup_orphaned_shards uses os.kill(pid, 0) to detect dead processes, but PIDs can be recycled by the OS. A long-lived shard from a previous run could match a completely different process. The risk is low given typical PID spaces but worth noting for robustness — a secondary check (e.g., a timestamp or boot-id marker in the shard) could help.

6. Point ID collision space
_uuid_to_point_id maps UUIDs to int % (2**63 - 1). Collision probability is negligible for typical workloads but exists in theory. The fallback path for non-UUID strings (int.from_bytes(uuid_str.encode()[:8]...)) has a much higher collision risk since it only uses the first 8 bytes.

What looks good

  • Clean separation of local/central shards with proper deduplication in reads
  • _build_scope_ancestors for efficient scope prefix matching via payload indexes
  • hasattr(self._storage, "close") check in Memory.close() is non-breaking
  • Proper atexit registration with idempotent close
  • Comprehensive test suite with realistic multi-shard scenarios
  • Optional dependency — zero impact on existing users

Approving — the core design is sound and well-tested. The lock gap (#1) is the main item to address before this sees heavy multi-process use.

Comment thread lib/crewai/tests/memory/test_qdrant_edge_storage.py Fixed
Comment thread lib/crewai/src/crewai/memory/storage/qdrant_edge_storage.py Fixed
Comment thread lib/crewai/src/crewai/memory/storage/qdrant_edge_storage.py Outdated
Comment thread lib/crewai/src/crewai/memory/storage/qdrant_edge_storage.py
Comment thread lib/crewai/src/crewai/memory/storage/qdrant_edge_storage.py
Copy link
Copy Markdown
Contributor

@iris-clawd iris-clawd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-approving. New commit addresses all review feedback: removed misleading cross-process lock docs, replaced deprecated datetime.utcnow() with datetime.now(UTC), improved count() to use scroll+dedup instead of full deserialization, and wrapped async methods with asyncio.to_thread. Clean fixes.

Comment thread lib/crewai/src/crewai/memory/storage/qdrant_edge_storage.py Fixed
Copy link
Copy Markdown
Contributor

@iris-clawd iris-clawd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-approving after 3 new fixes: delete now intersects record_ids with complex filters, orphaned shard recovery infers vector dimensions to prevent data loss, and cleaned up unused import in tests. Good hardening.

Comment thread lib/crewai/src/crewai/memory/storage/qdrant_edge_storage.py Outdated
Comment thread lib/crewai/src/crewai/memory/storage/qdrant_edge_storage.py Outdated
Comment thread lib/crewai/tests/memory/test_qdrant_edge_storage.py Dismissed
Copy link
Copy Markdown
Contributor

@iris-clawd iris-clawd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-approving. New commits: lazy-import in tests for CI without qdrant-edge, prefer local shard over central when deduplicating search results, and try/finally to guarantee shard close on error. Good robustness improvements.

Copy link
Copy Markdown
Contributor

@iris-clawd iris-clawd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-approving. Latest commit fixes Python 3.10 compat — uses timezone.utc instead of UTC (added in 3.11).

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Comment thread lib/crewai/src/crewai/memory/storage/qdrant_edge_storage.py Outdated
Copy link
Copy Markdown
Contributor

@iris-clawd iris-clawd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-approving. Latest: adds byteorder arg to int.from_bytes() for Python 3.10 compat (keyword-only default added in 3.11).

@greysonlalonde greysonlalonde merged commit 1cc251b into main Mar 25, 2026
48 checks passed
@greysonlalonde greysonlalonde deleted the gl/feat/qdrant-edge-storage-backend branch March 25, 2026 15:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants