build: add .dockerignore and move docling from runtime to dev deps#9469
Walkthrough
Adds a new .dockerignore with 12 ignore patterns and updates pyproject.toml to move the docling package from optional/runtime dependencies to dev-only dependencies.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Actionable comments posted: 1
🧹 Nitpick comments (3)
pyproject.toml (2)
127-129: Align docling/docling_core version expectations (avoid accidental incompat).
You now ship docling_core>=2.36.1 at runtime while docling>=2.36.1 is dev-only. If tests rely on features that require a newer docling_core than production gets, CI may pass while users hit mismatches.
Two lightweight options:
- Pin compatible minor ranges explicitly (if upstream follows them), e.g.:
    - "docling_core>=2.36.1",
    + "docling_core>=2.36.1,<3.0.0",
- Or add a comment next to both lines documenting the known compatible major/minor window to help future bumps.
Also applies to: 182-183
112-118: Duplicate packages in main and dev groups (scrapegraph-py, pydantic-ai, ruff).
This predates the PR but is worth cleaning up to avoid resolver surprises and drift:
- In main deps: scrapegraph-py>=1.12.0, pydantic-ai>=0.0.19, ruff>=0.9.7.
- In dev: scrapegraph-py>=1.10.2, pydantic-ai>=0.0.19, ruff>=0.9.7,<0.10.
Prefer keeping tools like ruff/dev-only libraries in the dev group only. Also ensure version ranges don’t diverge between groups. Example clean-up:
    - "ruff>=0.9.7",
    + # ruff is dev-only below
    ...
    - "scrapegraph-py>=1.10.2",
    - "pydantic-ai>=0.0.19",
    + # duplicates removed; rely on main constraints
If any of these are truly needed at runtime, remove them from dev to prevent conflicting constraints.
Also applies to: 174-176
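The duplicate check described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the PR: the helper names (`normalize`, `split_requirement`, `find_duplicates`) are hypothetical, the requirement lists are copied from the bullets above rather than read from pyproject.toml, and the parsing is a simplified regex instead of a full PEP 508 parser.

```python
import re


def normalize(name: str) -> str:
    """Normalize a project name as PEP 503 does (case- and separator-insensitive)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def split_requirement(req: str) -> tuple[str, str]:
    """Split e.g. 'ruff>=0.9.7,<0.10' into ('ruff', '>=0.9.7,<0.10')."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*(.*)", req)
    if match is None:
        raise ValueError(f"unrecognized requirement: {req!r}")
    return normalize(match.group(1)), match.group(2).strip()


def find_duplicates(main: list[str], dev: list[str]) -> dict[str, tuple[str, str]]:
    """Map each package listed in both groups to its (main spec, dev spec)."""
    main_specs = dict(split_requirement(r) for r in main)
    dev_specs = dict(split_requirement(r) for r in dev)
    return {n: (main_specs[n], dev_specs[n]) for n in main_specs if n in dev_specs}


# Requirement strings taken from the review comment above.
main_deps = ["scrapegraph-py>=1.12.0", "pydantic-ai>=0.0.19", "ruff>=0.9.7", "docling_core>=2.36.1"]
dev_deps = ["scrapegraph-py>=1.10.2", "pydantic-ai>=0.0.19", "ruff>=0.9.7,<0.10", "docling>=2.36.1"]

for name, (main_spec, dev_spec) in find_duplicates(main_deps, dev_deps).items():
    status = "same" if main_spec == dev_spec else "DIVERGES"
    print(f"{name}: main {main_spec} | dev {dev_spec} [{status}]")
```

A check like this could run in CI to flag ranges that drift apart between groups; for real projects a proper requirement parser would be a safer choice than the regex used here.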
.dockerignore (1)
1-12: Great start. Add dist and broader globs to shrink build context and avoid leaking env files.
Given the frontend build output lives under src/frontend/dist/ (per our project conventions), add it to the ignore list. Also consider:
- Glob node_modules everywhere, not just the top-level frontend folder.
- Ignore .env variants like .env.local, .env.production, etc.
- Ignore common caches: .mypy_cache, .ruff_cache, .tox, htmlcov, coverage.xml, .ipynb_checkpoints.
- Optional: ignore Vite cache src/frontend/.vite/.
Apply something like:
     src/frontend/node_modules
     src/frontend/build
    +src/frontend/dist
    @@
     **/.DS_Store
     **/__pycache__
     **/*.pyc
     **/.pytest_cache
    +**/.mypy_cache
    +**/.ruff_cache
    +**/.tox
    +htmlcov
    +coverage.xml
     **/.venv
    -**/.env
    +**/.env
    +**/.env.*
    +**/.secrets*
    +**/node_modules
    +**/venv
    +**/env
    +**/.direnv
    +src/frontend/.vite
    +**/.ipynb_checkpoints
This reduces Docker context size and prevents accidental inclusion of secrets.
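One way to keep such a list from regressing is a small check that the expected patterns are literally present in the file. The sketch below is an assumption-laden illustration: `active_patterns` and `missing_patterns` are hypothetical helpers, `current` is only the excerpt of the .dockerignore visible in the diff above (not the full 12-line file), and the comparison is plain string equality, so it does not evaluate Docker's glob semantics.

```python
def active_patterns(dockerignore_text: str) -> set[str]:
    """Return the non-blank, non-comment lines of a .dockerignore file."""
    patterns = set()
    for line in dockerignore_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.add(line)
    return patterns


def missing_patterns(dockerignore_text: str, required: list[str]) -> list[str]:
    """List the required patterns that do not appear verbatim in the file."""
    present = active_patterns(dockerignore_text)
    return [pattern for pattern in required if pattern not in present]


# Excerpt of the patterns shown as unchanged context in the suggestion above.
current = """\
src/frontend/node_modules
src/frontend/build
**/.DS_Store
**/__pycache__
**/*.pyc
**/.pytest_cache
**/.venv
**/.env
"""

required = ["src/frontend/dist", "**/.env.*", "**/node_modules"]
print(missing_patterns(current, required))
```

Because the check is literal, an equivalent but differently spelled pattern would still be reported as missing; that trade-off keeps the helper simple.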
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
uv.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
.dockerignore (1 hunks)
pyproject.toml (1 hunks)
🧰 Additional context used
🧠 Learnings (9)
📚 Learning: 2025-07-18T18:27:12.609Z
Learnt from: CR
PR: langflow-ai/langflow#0
File: .cursor/rules/frontend_development.mdc:0-0
Timestamp: 2025-07-18T18:27:12.609Z
Learning: Applies to src/frontend/src/**/*.{ts,tsx,js,jsx} : All frontend TypeScript and JavaScript code should be located under src/frontend/src/ and organized into components, pages, icons, stores, types, utils, hooks, services, and assets directories as per the specified directory layout.
Applied to files:
.dockerignore
📚 Learning: 2025-07-21T14:16:14.125Z
Learnt from: CR
PR: langflow-ai/langflow#0
File: .cursor/rules/testing.mdc:0-0
Timestamp: 2025-07-21T14:16:14.125Z
Learning: Applies to src/frontend/**/*.@(test|spec).{ts,tsx,js,jsx} : Frontend test files should be located in 'src/frontend/' and use '.test.{ts,tsx,js,jsx}' or '.spec.{ts,tsx,js,jsx}' extensions.
Applied to files:
.dockerignore
📚 Learning: 2025-07-18T18:27:12.609Z
Learnt from: CR
PR: langflow-ai/langflow#0
File: .cursor/rules/frontend_development.mdc:0-0
Timestamp: 2025-07-18T18:27:12.609Z
Learning: Frontend builds should output static files to src/frontend/dist/ for production deployment.
Applied to files:
.dockerignore
📚 Learning: 2025-07-18T18:27:12.609Z
Learnt from: CR
PR: langflow-ai/langflow#0
File: .cursor/rules/frontend_development.mdc:0-0
Timestamp: 2025-07-18T18:27:12.609Z
Learning: Applies to src/frontend/@(package*.json|tsconfig.json|tailwind.config.*|vite.config.*) : Frontend configuration files such as package.json, tsconfig.json, and Tailwind/Vite configs must be maintained and updated as needed.
Applied to files:
.dockerignore
📚 Learning: 2025-07-18T18:27:12.609Z
Learnt from: CR
PR: langflow-ai/langflow#0
File: .cursor/rules/frontend_development.mdc:0-0
Timestamp: 2025-07-18T18:27:12.609Z
Learning: Applies to src/frontend/src/**/__tests__/**/*.{ts,tsx,js,jsx} : All frontend code should be tested using appropriate component and integration tests.
Applied to files:
.dockerignore
📚 Learning: 2025-07-21T14:16:14.125Z
Learnt from: CR
PR: langflow-ai/langflow#0
File: .cursor/rules/testing.mdc:0-0
Timestamp: 2025-07-21T14:16:14.125Z
Learning: Applies to src/frontend/**/*.@(test|spec).{ts,tsx,js,jsx} : Test error handling and edge cases in frontend test files.
Applied to files:
.dockerignore
📚 Learning: 2025-07-21T14:16:14.125Z
Learnt from: CR
PR: langflow-ai/langflow#0
File: .cursor/rules/testing.mdc:0-0
Timestamp: 2025-07-21T14:16:14.125Z
Learning: Applies to src/frontend/**/*.@(test|spec).{ts,tsx,js,jsx} : Validate input/output behavior and test component initialization and configuration in frontend test files.
Applied to files:
.dockerignore
📚 Learning: 2025-07-21T14:16:14.125Z
Learnt from: CR
PR: langflow-ai/langflow#0
File: .cursor/rules/testing.mdc:0-0
Timestamp: 2025-07-21T14:16:14.125Z
Learning: Applies to src/frontend/**/*.@(test|spec).{ts,tsx,js,jsx} : Test both sync and async code paths in frontend test files.
Applied to files:
.dockerignore
📚 Learning: 2025-07-21T14:16:14.125Z
Learnt from: CR
PR: langflow-ai/langflow#0
File: .cursor/rules/testing.mdc:0-0
Timestamp: 2025-07-21T14:16:14.125Z
Learning: Applies to src/frontend/**/*.@(test|spec).{ts,tsx,js,jsx} : Mock external dependencies appropriately in frontend test files to isolate unit tests from external services.
Applied to files:
.dockerignore
    "docling>=2.36.1"
]
💡 Verification agent
🧩 Analysis chain
Moving docling to dev-only: verify no runtime imports and no remaining extras references.
Looks fine, but this is a behavioral change. Please confirm:
- No production code imports docling at runtime; only tests/dev tools should.
- Docs, Dockerfiles, Makefiles, and CI don’t instruct users to install langflow[docling] anymore (since the extra group is gone).
If runtime imports exist, consider reintroducing an optional extra docling and guarding imports with graceful fallbacks. I can draft that if needed.
Run the following checks from repo root:
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "1) Search for runtime imports of docling/docling_core:"
rg -nC2 -g '!**/tests/**' -g '!**/docs/**' -P '\b(import|from)\s+docling\b|\bfrom\s+docling\b' src || true
echo "2) Search for any extras usage or install docs mentioning [docling]:"
rg -nC2 -P '\[docling\]' || true
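# Note: the backticks inside this double-quoted string are command substitutions, so the shell tries to run them.
echo "3) Search for any explicit `pip install docling` or `uv add docling` in scripts/CI:"
rg -nC2 -P '(pip|uv)\s+(install|add).*\bdocling\b' || true
Length of output: 3399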
The initial CI search was skipped due to backticks in the echo. Let’s re-run step 3 properly against Dockerfiles, Makefiles, GitHub workflows, and scripts for any lingering docling install references, and also inspect the lazy-import fallback in the docling integration:
🏁 Script executed:
#!/usr/bin/env bash
set -euo pipefail
echo "3) Search for docling install references in Dockerfiles, Makefiles, CI workflows, and scripts/:"
rg -nC2 -g 'Dockerfile*' -g 'Makefile*' -g '.github/workflows/**/*.yml' -g 'scripts/**/*' -P '(pip|uv)\s+(install|add).*\bdocling\b' || true
echo
echo "4) Inspect lazy‐import/__getattr__ and try/except in the docling component:"
rg -nC5 '__getattr__' -P src/backend/base/langflow/components/docling/__init__.py || true
rg -nC5 'from docling' -P src/backend/base/langflow/components/docling/__init__.py || true
Length of output: 1794
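The import search above is regex-based, so it can also match comments or strings. A syntax-aware variant can be sketched with Python's standard `ast` module. This is an illustration only: `find_package_imports` is a hypothetical helper, and the `sample` source (including `some_module` and `Thing`) is made up for the example rather than taken from the repository.

```python
import ast


def find_package_imports(source: str, packages: tuple[str, ...]) -> list[tuple[int, str]]:
    """Return (line number, module) for every absolute import of the given top-level packages."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules = [node.module]
        else:
            continue
        for module in modules:
            # Compare only the top-level package, so docling_core is distinct from docling.
            if module.split(".")[0] in packages:
                hits.append((node.lineno, module))
    return sorted(hits)


# Hypothetical source text; it is parsed, never executed.
sample = '''\
import os
from docling.some_module import Thing
import docling_core.types as dl_types

def lazy():
    import docling
'''

for lineno, module in find_package_imports(sample, ("docling", "docling_core")):
    print(f"line {lineno}: {module}")
```

Unlike the `\bdocling\b` pattern, this reports `docling_core` imports as well, and it finds lazy imports nested inside functions; telling those apart from module-level imports would need an extra check on the enclosing node.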
Update installation references for Docling now that the extra group is removed
The dev-dependency move is fine (Docling is only imported lazily within its integration module), but we still have stale install instructions in both code and docs. Please update the following:
• In src/backend/base/langflow/components/docling/__init__.py (lines 111–115), the fallback message still reads:
    “Install with uv pip install 'langflow[docling]' …”
  Change this to install Docling directly, e.g.:
    msg = (
        "Docling is an optional dependency of Langflow. "
        "Install with `uv pip install docling` or refer to the documentation."
    )
• In docs/docs/Integrations/Docling/integrations-docling.mdx (around line 21), replace
    Install the Docling extra in Langflow OSS with `uv pip install 'langflow[docling]'`.
  with
    Install Docling separately with `uv pip install docling`.
• Verified no other pip install docling or langflow[docling] references remain in Dockerfiles, Makefiles, CI workflows, or scripts/ directories.
• Runtime imports of Docling occur only within the lazy-import logic of the docling integration module, as intended.
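For context, a guarded optional import with an actionable error message can look roughly like the sketch below. It is not the code in the Langflow docling module: `require_optional` is a hypothetical helper, and the package name in the usage example is deliberately fake so the failure path is exercised.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, install_hint: str) -> ModuleType:
    """Import an optional dependency, or raise ImportError with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        msg = (
            f"{module_name} is an optional dependency. "
            f"Install with `{install_hint}` or refer to the documentation."
        )
        raise ImportError(msg) from exc


# Usage: the module name below is intentionally nonexistent.
try:
    require_optional("not_a_real_package_xyz", "uv pip install not-a-real-package-xyz")
except ImportError as exc:
    print(exc)
```

Keeping the install hint as a parameter means the message only has to change in one place when an extra such as `langflow[docling]` is removed.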
Codecov Report
✅ All modified and coverable lines are covered by tests.
❌ Your project status has failed because the head coverage (3.80%) is below the target coverage (10.00%). You can increase the head coverage or adjust the target coverage.
Additional details and impacted files
@@ Coverage Diff @@
## main #9469 +/- ##
==========================================
- Coverage 33.93% 33.92% -0.01%
==========================================
Files 1195 1195
Lines 55950 55950
Branches 5331 5331
==========================================
- Hits 18984 18979 -5
- Misses 36896 36901 +5
Partials 70 70
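The percentages in the table can be reproduced from the hit, miss, and partial counts. The sketch below assumes coverage is computed as hits divided by the sum of hits, misses, and partials, which matches the Lines total of 55950 in the table; `coverage_pct` is a hypothetical helper name.

```python
def coverage_pct(hits: int, misses: int, partials: int) -> float:
    """Coverage as a percentage of all tracked lines (assumed formula)."""
    total = hits + misses + partials
    return 100 * hits / total


base = coverage_pct(hits=18984, misses=36896, partials=70)  # main
head = coverage_pct(hits=18979, misses=36901, partials=70)  # PR #9469
print(f"{base:.2f}% -> {head:.2f}% ({head - base:+.2f}%)")
```

Five lines moving from hit to miss out of 55950 accounts for the rounded -0.01% change.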
…9469)
* feat: add .dockerignore to exclude build artifacts and environment files
* feat: move docling to main dependencies and remove docling extra