feat: Port Arxiv to the Extension Framework #13047

Merged
erichare merged 28 commits into feat/extension-production-install from feat/extension-pilot-arxiv
May 11, 2026

@erichare erichare commented May 9, 2026

This pull request introduces the second pilot port of a component to the new Extension Bundle system by extracting the arXiv search component into its own standalone bundle, lfx-arxiv. This follows the documented porting process and ensures that the arXiv component is now shipped, installed, and managed as an independent extension, just like the previously ported DuckDuckGo component. The changes include all necessary workspace wiring, packaging, and documentation updates to support this migration.

Porting of arXiv component to standalone Extension Bundle:

  • Added the new lfx-arxiv bundle:
    • Created src/bundles/arxiv/ with the full bundle layout, including README.md, pyproject.toml, and the extension manifest (extension.json).
    • Moved the ArXivComponent implementation from the monorepo (src/lfx/src/lfx/components/arxiv/arxiv.py) to the new bundle location (src/bundles/arxiv/src/lfx_arxiv/components/arxiv/arxiv.py) without modifying its logic.

Workspace and dependency wiring:

  • Updated the root pyproject.toml to:
    • Add lfx-arxiv as a regular dependency so it is installed with Langflow.
    • Register lfx-arxiv under [tool.uv.sources] for workspace management.
    • Add src/bundles/arxiv to the workspace members list.

Documentation and process improvements:

  • Added a comprehensive porting guide at src/bundles/PORTING.md to document the step-by-step process for extracting components into bundles, including best practices, verification steps, and automation tips.

These changes ensure the arXiv component is now distributed, loaded, and maintained as a standalone extension, making future component ports easier and more consistent.

Validates ``src/bundles/PORTING.md`` end-to-end by following its recipe
against a clean candidate (``ArXivComponent``: no third-party runtime
deps, no langflow-base extra, no deactivated duplicate).  Touchpoints
exercised:

  * Bundle skeleton at ``src/bundles/arxiv/`` mirroring duckduckgo.
  * In-tree provider directory removed from ``src/lfx/src/lfx/components/``
    along with its three references in ``components/__init__.py``.
  * Workspace wiring: dep, ``[tool.uv.sources]``, ``[tool.uv.workspace]``
    members, lockfile.
  * Migration table: bare-name + two import-path forms + legacy_slot
    entry.
  * Component index regenerated via ``LFX_DEV=1`` (forces dynamic
    discovery; without it the script reproduces stale entries).
  * Integration test ``test_pilot_arxiv_upgrade.py`` mirroring the
    duckduckgo pilot suite; 5 tests pass against the workspace install.

PORTING.md updates fall out of running the recipe live:
  * Validate path is ``src/bundles/<bundle>/src/lfx_<bundle>``, not the
    bundle root (the manifest lives next to ``__init__.py``).
  * Index regen needs ``LFX_DEV=1`` to skip the prebuilt-index fast path.
  * Drop the migration-table JSON from the ruff invocation (ruff treats
    it as Python and complains about the top-level expression).

``scripts/migrate/port_bundle.py`` is the mechanical helper referenced
from § Automation: stdlib-only, dry-run by default, refuses on invalid
input, and intentionally leaves migration-table edits + integration-test
authoring to a human (release version + bare-name uniqueness require
judgement).  Three guard rails verified: invalid bundle name,
existing-target-bundle, missing in-tree provider.
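
The three guard rails can be pictured with a short sketch. This is not the code in ``scripts/migrate/port_bundle.py``: the function name ``validate_candidate``, the naming rule, and the error strings are assumptions made for illustration, while the directory paths follow the ones named in this description.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical naming rule; the real script's rule may differ.
BUNDLE_NAME_RE = re.compile(r"[a-z][a-z0-9_]*")


def validate_candidate(repo_root: Path, bundle: str) -> list[str]:
    """Collect reasons to refuse the port; an empty list means it may proceed."""
    if not BUNDLE_NAME_RE.fullmatch(bundle):
        # Stop early: never build filesystem paths from an invalid name.
        return [f"invalid bundle name: {bundle!r}"]

    errors = []
    target = repo_root / "src" / "bundles" / bundle
    if target.exists():
        errors.append(f"target bundle already exists: {target}")

    provider = repo_root / "src" / "lfx" / "src" / "lfx" / "components" / bundle
    if not provider.is_dir():
        errors.append(f"missing in-tree provider: {provider}")
    return errors


# Exercise the guard rails against a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src" / "lfx" / "src" / "lfx" / "components" / "arxiv").mkdir(parents=True)
    ok = validate_candidate(root, "arxiv")           # provider present, no target yet
    bad_name = validate_candidate(root, "Bad-Name")  # refused on the name alone
    no_provider = validate_candidate(root, "wikipedia")  # no provider directory here
    (root / "src" / "bundles" / "arxiv").mkdir(parents=True)
    existing = validate_candidate(root, "arxiv")     # target bundle now exists
```

Returning a list of reasons rather than raising on the first problem lets a dry run report every objection at once.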

coderabbitai Bot commented May 9, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

@github-actions github-actions Bot added the enhancement New feature or request label May 9, 2026

github-actions Bot commented May 9, 2026

Frontend Unit Test Coverage Report

Coverage Summary

Lines: 38%
Statements: 38.08% (46110/121083)
Branches: 67.45% (6307/9350)
Functions: 37.69% (1057/2804)

Unit Test Results

Tests: 4241
Skipped: 0 💤
Failures: 0 ❌
Errors: 0 🔥
Time: 9m 35s ⏱️


codecov Bot commented May 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 55.03%. Comparing base (56c14fb) to head (051e1e0).

Additional details and impacted files

@@                          Coverage Diff                          @@
##           feat/extension-production-install   #13047      +/-   ##
=====================================================================
+ Coverage                              54.98%   55.03%   +0.04%     
=====================================================================
  Files                                   2121     2121              
  Lines                                 195991   195281     -710     
  Branches                               30886    28051    -2835     
=====================================================================
- Hits                                  107773   107471     -302     
+ Misses                                 86971    86562     -409     
- Partials                                1247     1248       +1     
Flag Coverage Δ
backend 57.64% <ø> (+1.57%) ⬆️
frontend 54.99% <ø> (-0.29%) ⬇️
lfx 52.72% <ø> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown.
See 177 files with indirect coverage changes.

…l' into feat/extension-pilot-arxiv

# Conflicts:
#	pyproject.toml
#	scripts/migrate/port_bundle.py
@erichare erichare requested a review from ogabrielluiz May 10, 2026 22:54
@ogabrielluiz ogabrielluiz left a comment

thanks for the thorough turnaround on this. dogfooded the script against a couple of candidates to verify:

  • dry-run on wikipedia (2 files, 2 classes: WikidataComponent, WikipediaComponent) produces: correct layout actions, 8 migration entries (4 per class), and a structurally complete test_pilot_wikipedia_upgrade.py scaffold. --display-name "Wikipedia" + --migration-release 1.10.0 flow through everything.
  • templates vs checked-in arxiv: rendering PACKAGE_INIT_TEMPLATE / COMPONENTS_INIT_TEMPLATE / EXTENSION_JSON_TEMPLATE against arxiv's discovered class set produces files that are structurally identical to what's in this PR. only differences are cosmetic (single-vs-double quotes in __all__, an extra docstring line in the package __init__.py, "(s)" plural marker in the description). a third port would land on a working bundle straight out of the script.
  • marker-based pyproject patching: # langflow-extensions:bundle-{deps,sources,members}-{start,end} pairs are in place; _insert_before_marker fails loudly if a marker goes missing. survives reordering and version-pin bumps as intended.
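
A minimal sketch of the marker-based patching idea, assuming markers sit on their own comment lines. This is not the script's `_insert_before_marker`; the function below, its signature, and the sample text are illustrative assumptions, and only the marker string follows the pattern quoted above.

```python
def insert_before_marker(text: str, marker: str, new_line: str) -> str:
    """Insert new_line directly above the line holding marker, matching its indent."""
    lines = text.splitlines(keepends=True)
    for index, line in enumerate(lines):
        if line.strip() == marker:
            indent = line[: len(line) - len(line.lstrip())]
            lines.insert(index, f"{indent}{new_line}\n")
            return "".join(lines)
    # Fail loudly: a silently skipped insert would leave the bundle half-wired.
    raise ValueError(f"marker not found: {marker}")


# Illustrative dependency list; the surrounding entries are assumptions.
SAMPLE = '''dependencies = [
    # langflow-extensions:bundle-deps-start
    "lfx-duckduckgo",
    # langflow-extensions:bundle-deps-end
]
'''

patched = insert_before_marker(
    SAMPLE, "# langflow-extensions:bundle-deps-end", '"lfx-arxiv",'
)
```

Anchoring on the end marker rather than on a line number or a neighbouring entry is what lets the insert survive reordering and version-pin changes inside the marked region.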

on my original validator-false-positive comment (PORTING.md line 297) — that one was a misread on my end. validate._has_build_method already accepts both def build AND Output(method="...") shapes (validate.py:378-397); the original doc footnote made it sound like a false positive when the validator was actually doing the right thing. your rewrite to describe the contract instead of papering over a warning is the better fix.
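
To make the two accepted shapes concrete, here is a rough sketch of such a check using the standard `ast` module. It is not the code in validate.py; the function names and the exact matching rules are assumptions for illustration.

```python
import ast


def _defines_build(cls: ast.ClassDef) -> bool:
    """Shape 1: the class body contains a method named build."""
    return any(
        isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)) and item.name == "build"
        for item in cls.body
    )


def _declares_output_method(cls: ast.ClassDef) -> bool:
    """Shape 2: the class contains a call like Output(..., method="some_name")."""
    for node in ast.walk(cls):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name) and node.func.id == "Output":
            for keyword in node.keywords:
                if (
                    keyword.arg == "method"
                    and isinstance(keyword.value, ast.Constant)
                    and isinstance(keyword.value.value, str)
                ):
                    return True
    return False


def has_build_method(source: str, class_name: str) -> bool:
    """Return True if the named class satisfies either shape."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            return _defines_build(node) or _declares_output_method(node)
    return False


# Three tiny sources; ast.parse only needs valid syntax, so Output need not be importable.
WITH_BUILD = "class A:\n    def build(self):\n        return 1\n"
WITH_OUTPUT = 'class B:\n    outputs = [Output(name="x", method="run")]\n    def run(self):\n        return 1\n'
NEITHER = "class C:\n    def helper(self):\n        return 1\n"
```

Working on the syntax tree means the check never imports or executes the component, so it can run before the bundle's dependencies are installed.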

one leftover suggestion, not a blocker for this PR but worth filing: there's still no automated test that runs port_bundle.py --apply against a fixture and asserts the output matches an expected layout. the third bundle author becomes the first to find any drift between templates and the doc. a single tmp_path-based test that exercises _validate_candidate + _layout_bundle + _render_reexports + _render_migration_entries against a minimal fixture would close that loop. happy to send that as a follow-up if it'd help.
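
A rough shape for that test, assuming the expected layout is the one described for arxiv in this PR. `fake_layout_bundle` is a hypothetical stand-in for the script's layout step (whose real signature is not shown here), so only the snapshot-and-compare part is meant to carry over.

```python
import tempfile
from pathlib import Path

# Expected relative paths, based on the arxiv layout described in this PR.
EXPECTED_LAYOUT = {
    "README.md",
    "pyproject.toml",
    "src/lfx_arxiv/__init__.py",
    "src/lfx_arxiv/extension.json",
    "src/lfx_arxiv/components/arxiv/arxiv.py",
}


def fake_layout_bundle(bundle_root: Path, bundle: str) -> None:
    """Hypothetical stand-in for the real layout step; writes empty files."""
    package = bundle_root / "src" / f"lfx_{bundle}"
    targets = [
        bundle_root / "README.md",
        bundle_root / "pyproject.toml",
        package / "__init__.py",
        package / "extension.json",
        package / "components" / bundle / f"{bundle}.py",
    ]
    for target in targets:
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("", encoding="utf-8")


def snapshot_tree(root: Path) -> set[str]:
    """Return every file under root as a POSIX-style relative path."""
    return {path.relative_to(root).as_posix() for path in root.rglob("*") if path.is_file()}


def test_layout_matches_expected(tmp_path: Path) -> None:
    """pytest-style test; tmp_path is pytest's per-test temporary directory fixture."""
    fake_layout_bundle(tmp_path, "arxiv")
    assert snapshot_tree(tmp_path) == EXPECTED_LAYOUT


# The same check can be run without pytest:
with tempfile.TemporaryDirectory() as tmp:
    test_layout_matches_expected(Path(tmp))
```

Comparing whole sets of paths, rather than asserting on individual files, makes both missing and unexpected files show up as a failure.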

CI is showing Run Frontend Tests / Playwright Tests - Shard 37/70 failing — looks unrelated to this change but worth a re-run / glance.

LGTM otherwise. nice work.

@erichare erichare merged commit 0e51b68 into feat/extension-production-install May 11, 2026
110 of 113 checks passed
@erichare erichare deleted the feat/extension-pilot-arxiv branch May 11, 2026 01:54