feat(redirect_links): SQLite-backed URL shortener for token-heavy links #870

Merged
senamakel merged 2 commits into tinyhumansai:main from senamakel:feat/redirect-links on Apr 24, 2026
Member

@senamakel senamakel commented Apr 24, 2026

Summary

  • New src/openhuman/redirect_links/ domain that encodes long tracking URLs (e.g. trip.com/forward/...?bizData=eyJld...) to openhuman://link/<id> placeholders and expands them back before messages reach the user — saves tokens on every prompt that carries one of these URLs.
  • Global SQLite store at workspace_dir/redirect_links/links.db; content-addressed 8-char hex ids (SHA-256 prefix, widened on hash-prefix collision) so re-shortening the same URL is idempotent.
  • RPC surface: openhuman.redirect_links_{shorten,expand,list,remove,rewrite_inbound,rewrite_outbound}. Wired into the controller registry and namespace_description per the controller migration checklist.

Follow-ups (not in this PR): wire rewrite_inbound into the agent prompt-assembly path and rewrite_outbound into response delivery so the token savings actually land.
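
To make the id scheme in the summary concrete, here is a minimal sketch of "8-char prefix, widened on collision". It is not the PR's `store.rs`: it assumes an in-memory map in place of SQLite, and it takes the hex digest as a parameter because the standard library has no SHA-256. The `Store` alias and this `shorten` signature are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for the SQLite table: id -> original URL.
pub type Store = HashMap<String, String>;

/// Pick the shortest prefix of `digest_hex` (8 chars, then 10, 12, ...) that is
/// either unused or already mapped to `url`, and record the mapping.
///
/// `digest_hex` is assumed to be the lowercase hex SHA-256 of `url`, computed
/// by the caller. Returns `None` if every prefix belongs to another URL.
pub fn shorten(store: &mut Store, url: &str, digest_hex: &str) -> Option<String> {
    let mut len = 8;
    // `get(..len)` yields `None` once `len` exceeds the digest length.
    while let Some(id) = digest_hex.get(..len) {
        match store.get(id).map(|existing| existing.as_str() == url) {
            // Same URL already stored under this id: idempotent hit.
            Some(true) => return Some(id.to_string()),
            // Prefix collision with a different URL: widen by two hex chars.
            Some(false) => len += 2,
            // Free slot: claim it.
            None => {
                store.insert(id.to_string(), url.to_string());
                return Some(id.to_string());
            }
        }
    }
    None
}
```

Because the id is derived from the URL's own digest, calling `shorten` twice with the same URL walks the same prefixes and returns the same id, which is the idempotence the summary describes.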

Test plan

  • cargo check --manifest-path Cargo.toml — clean (pre-existing unrelated warning only)
  • cargo test --lib openhuman::redirect_links — 15 new tests pass
    • shorten dedup + determinism, expand-bumps-hit-count, unknown id returns None
    • id_from_short scheme parsing + hex-only guard
    • list orders newest-first and respects limit; remove reports affected
    • rewrite_inbound: shortens long URLs, preserves surrounding text, trims trailing sentence punctuation, leaves short URLs untouched, handles multiple URLs
    • rewrite_outbound: full round-trip restores original text, leaves unknown ids unchanged
  • cargo test --lib core::all — 34 existing registry tests still pass (no duplicate methods, every handler has a declared schema, etc.)
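
As an illustration of the "scheme parsing + hex-only guard" item in the test plan, the following sketch parses a placeholder into its id. It assumes `SHORT_URL_PREFIX` has the value `openhuman://link/` (inferred from the placeholder format in the summary), and it also accepts a bare hex id with lowercase normalization, which is the behaviour the follow-up commit later in this thread describes. The body is a guess at the shape, not the merged code.

```rust
/// Assumed value; the PR names the constant but does not show it.
const SHORT_URL_PREFIX: &str = "openhuman://link/";

/// Extract a link id from either `openhuman://link/<id>` or a bare hex id.
/// Returns `None` for empty or non-hex input.
pub fn id_from_short(input: &str) -> Option<String> {
    let trimmed = input.trim();
    // Drop the scheme prefix when present; otherwise treat the input as a bare id.
    let candidate = trimmed.strip_prefix(SHORT_URL_PREFIX).unwrap_or(trimmed);
    // Hex-only guard: reject empty ids and anything outside [0-9a-fA-F].
    if !candidate.is_empty() && candidate.chars().all(|c| c.is_ascii_hexdigit()) {
        Some(candidate.to_ascii_lowercase())
    } else {
        None
    }
}
```

With this shape, `id_from_short("openhuman://link/DEADbeef")` and `id_from_short(" deadbeef ")` both yield `deadbeef`, while a regular `https://` URL is rejected by the hex guard.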

Summary by CodeRabbit

  • New Features
    • Added redirect-links: create, expand, list, and remove shortened links.
    • Automatic inbound/outbound rewriting of URLs in text (convert between full URLs and openhuman://link/ placeholders).
  • Documentation
    • CLI/RPC descriptions updated to document redirect-link placeholder behavior and rewrite helpers.

New domain at `src/openhuman/redirect_links/` that encodes long tracking
URLs to `openhuman://link/<id>` placeholders and expands them back out
before messages reach the user. Keeps the full URL in a global SQLite
store (`workspace_dir/redirect_links/links.db`), uses content-addressed
8-char hex ids (SHA-256 prefix, lengthened on hash-prefix collision) so
re-shortening the same URL is idempotent.

RPC surface (`openhuman.redirect_links_*`): `shorten`, `expand`, `list`,
`remove`, `rewrite_inbound` (regex-based, min-length guard, trims
trailing sentence punctuation), `rewrite_outbound` (unknown ids pass
through unchanged). Wired into the controller registry and namespace
description.

Follows the controller migration checklist: `schemas.rs` declares the
registry, `mod.rs` re-exports `all_*_controller_schemas` and
`all_*_registered_controllers`, and `ops.rs` is re-exported as `rpc`.
15 unit tests cover store dedup/collision, expand bumps hit count,
inbound rewrite preserves surrounding text, trims trailing punctuation,
leaves short URLs alone, handles multiple URLs, outbound round-trip,
and unknown-id passthrough.
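
The inbound rewrite described above (min-length guard, trailing-punctuation trimming, surrounding text preserved) could look roughly like the sketch below. The PR states that the real implementation is regex-based; this version swaps in a plain string scan so that it needs only the standard library. The `min_len` parameter, the `shorten` callback, and the punctuation set are all assumptions made for illustration.

```rust
/// Assumed placeholder prefix, taken from the format quoted in the PR text.
const SHORT_URL_PREFIX: &str = "openhuman://link/";

/// Byte offset of the first `http://` or `https://` in `s`, if any.
fn find_url_start(s: &str) -> Option<usize> {
    match (s.find("http://"), s.find("https://")) {
        (Some(a), Some(b)) => Some(a.min(b)),
        (a, b) => a.or(b),
    }
}

/// Replace every URL of at least `min_len` bytes with a placeholder built from
/// the id that `shorten` returns. Shorter URLs and all other text are copied
/// through unchanged.
pub fn rewrite_inbound(
    text: &str,
    min_len: usize,
    mut shorten: impl FnMut(&str) -> String,
) -> String {
    let mut out = String::with_capacity(text.len());
    let mut rest = text;
    while let Some(start) = find_url_start(rest) {
        out.push_str(&rest[..start]);
        let tail = &rest[start..];
        // A URL candidate runs up to the next whitespace character.
        let end = tail.find(char::is_whitespace).unwrap_or(tail.len());
        // Trailing sentence punctuation is not treated as part of the URL.
        let url = tail[..end]
            .trim_end_matches(|c: char| matches!(c, '.' | ',' | ';' | ':' | '!' | '?' | ')'));
        if url.len() >= min_len {
            out.push_str(SHORT_URL_PREFIX);
            out.push_str(&shorten(url));
        } else {
            out.push_str(url);
        }
        // Trimmed punctuation stays in `rest` and is copied on the next pass.
        rest = &tail[url.len()..];
    }
    out.push_str(rest);
    out
}
```

Passing the shortener as a closure keeps the text scan independent of the store, so the same function can be exercised in tests with a fixed id.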
@senamakel senamakel requested a review from a team April 24, 2026 06:22
@coderabbitai
Contributor

coderabbitai Bot commented Apr 24, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4328293d-9074-42dd-a390-7a983e218636

📥 Commits

Reviewing files that changed from the base of the PR and between 8f7520a and edbade9.

📒 Files selected for processing (3)
  • src/core/all.rs
  • src/openhuman/redirect_links/ops.rs
  • src/openhuman/redirect_links/store.rs
✅ Files skipped from review due to trivial changes (1)
  • src/openhuman/redirect_links/store.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/core/all.rs

📝 Walkthrough

Adds a new openhuman redirect_links feature: SQLite-backed URL shortening/expansion, text rewrite helpers (inbound/outbound), RPC endpoints and controller schemas, and registers the namespace in the core controller registry. Tests updated to cover behavior.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Registry & Exports: src/core/all.rs, src/openhuman/mod.rs | Registers redirect_links namespace in global controller registry and exports the redirect_links module from the crate. |
| Module Entry: src/openhuman/redirect_links/mod.rs | New module that re-exports RPC namespace, core ops (shorten/expand/rewrite), schema aggregation, and public types. |
| Operations / RPC: src/openhuman/redirect_links/ops.rs | Implements URL shortening/expansion, inbound/outbound text rewriting with regex and punctuation trimming, and RPC handler functions for shorten/expand/list/remove/rewrite endpoints. |
| Controller Schemas: src/openhuman/redirect_links/schemas.rs | Adds controller schemas and registered controller entries for each RPC, JSON parameter parsing and bounds checks, and handlers that load config and invoke ops. |
| Storage Layer: src/openhuman/redirect_links/store.rs | SQLite-backed storage: schema init, deterministic SHA-256 id generation with collision handling, deduplication, hit-count/last-used updates, list/remove/peek semantics. |
| Types: src/openhuman/redirect_links/types.rs | Adds RedirectLink, RewriteReplacement, and RewriteResult structs (Serde-enabled) used across store/ops/schemas. |

Sequence Diagram(s)

sequenceDiagram
    participant Client as RPC Client
    participant Handler as Schema Handler
    participant Ops as Redirect Ops
    participant Store as SQLite Store

    rect rgba(100,150,200,0.5)
    Note over Client,Store: Shorten Flow
    Client->>Handler: rl_shorten(url)
    Handler->>Handler: load config
    Handler->>Ops: shorten_url(url)
    Ops->>Ops: regex match & trim punctuation
    Ops->>Store: shorten(url)
    Store->>Store: sha256 id gen, insert/dedup
    Store-->>Ops: RedirectLink
    Ops-->>Handler: RedirectLink
    Handler-->>Client: RpcOutcome<RedirectLink>
    end

    rect rgba(150,200,100,0.5)
    Note over Client,Store: Expand / Rewrite Outbound
    Client->>Handler: rl_rewrite_outbound(text)
    Handler->>Ops: rewrite_outbound(text)
    Ops->>Ops: find openhuman://link/<id> tokens
    Ops->>Store: expand(id) per match
    Store->>Store: lookup, increment hit_count, update last_used_at
    Store-->>Ops: URL or None
    Ops-->>Handler: RewriteResult{text, replacements}
    Handler-->>Client: RpcOutcome<RewriteResult>
    end
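
The outbound flow in the diagram above (find `openhuman://link/<id>` tokens, look each one up, leave unknown ids untouched) can be sketched as follows. This is an illustration under assumptions: a `HashMap` stands in for the SQLite store, the hit-count and last-used updates are omitted, and the function returns only the rewritten text rather than the PR's `RewriteResult`.

```rust
use std::collections::HashMap;

/// Assumed placeholder prefix, taken from the format quoted in the PR text.
const SHORT_URL_PREFIX: &str = "openhuman://link/";

/// Expand every known placeholder in `text` back to its original URL.
/// Placeholders whose id is not in `links` are copied through unchanged.
pub fn rewrite_outbound(text: &str, links: &HashMap<String, String>) -> String {
    let mut out = String::with_capacity(text.len());
    let mut rest = text;
    while let Some(start) = rest.find(SHORT_URL_PREFIX) {
        let id_start = start + SHORT_URL_PREFIX.len();
        // The id is the run of hex digits that follows the prefix.
        let id_len = rest[id_start..]
            .find(|c: char| !c.is_ascii_hexdigit())
            .unwrap_or(rest.len() - id_start);
        let id_end = id_start + id_len;
        match links.get(&rest[id_start..id_end]) {
            Some(url) => {
                out.push_str(&rest[..start]);
                out.push_str(url);
            }
            // Unknown (or empty) id: keep the placeholder exactly as written.
            None => out.push_str(&rest[..id_end]),
        }
        rest = &rest[id_end..];
    }
    out.push_str(rest);
    out
}
```

Stopping the id at the first non-hex character means sentence punctuation directly after a placeholder survives the round trip.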

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~55 minutes

Poem

🐰 In the burrow I bounce and I think,
I snip long links to openhuman://link/<id> in a blink,
SHA-256 seeds the tiny key,
SQLite guards each memory,
Hop! Rewrites done — now off to wink. ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 70.97% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat(redirect_links): SQLite-backed URL shortener for token-heavy links' accurately and clearly describes the main change: a new SQLite-backed URL shortening feature for the redirect_links domain that compresses long token-heavy URLs into openhuman://link/ placeholders. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (1)
src/core/all.rs (1)

271-273: Add a direct test assertion for the new namespace description.

Line 271 adds new behavior in namespace_description, but namespace_description_known_namespaces doesn’t currently assert redirect_links.

Suggested test addition
 #[test]
 fn namespace_description_known_namespaces() {
     assert!(namespace_description("memory").is_some());
     assert!(namespace_description("memory_tree").is_some());
+    assert!(namespace_description("redirect_links").is_some());
     assert!(namespace_description("billing").is_some());
     assert!(namespace_description("config").is_some());
As per coding guidelines: "Ship unit tests and coverage for behavior you are adding or changing before building additional features on top".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/all.rs` around lines 271 - 273, The test suite is missing a direct
assertion for the new "redirect_links" entry added to namespace_description;
update the test named namespace_description_known_namespaces to include an
assertion that the returned known namespaces list (from namespace_description or
the function under test) contains "redirect_links" and that its description
string matches the new value, ensuring the test checks both presence and exact
description text for "redirect_links".
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/openhuman/redirect_links/ops.rs`:
- Around line 127-189: The RPC handlers (rl_shorten, rl_expand, rl_list,
rl_remove, rl_rewrite_inbound, rl_rewrite_outbound) currently emit raw URLs/IDs
in logs; change logging to never include full URLs or querystrings and use a
stable grep-friendly prefix like "[rpc][redirect_links]". Implement or call a
helper (e.g., redact_url) that strips query and sensitive fragments (or returns
only host + path or a fixed "<redacted_url>") and use that instead of link.url /
link.short_url / raw ids in the format! calls inside RpcOutcome::single_log and
RpcOutcome::new; update all log messages to the consistent prefix and redacted
values (e.g., "[rpc][redirect_links] shortened: {redacted_original} ->
{redacted_short}"). Ensure rl_list and rewrite logs also use redacted summaries
(count or redacted samples) rather than full URLs.

In `@src/openhuman/redirect_links/store.rs`:
- Around line 18-27: The function id_from_short currently only handles strings
with the SHORT_URL_PREFIX; update it to also accept a bare hex id by, after
trimming, first checking if trimmed starts with SHORT_URL_PREFIX and validating
the suffix as hex (existing logic), and if not, validating that the entire
trimmed string is a non-empty ASCII hex string and returning it; reference the
id_from_short function and SHORT_URL_PREFIX constant when making the change.
- Around line 41-67: The shorten flow has a check-then-insert race: find_by_url
and get_by_id are non-atomic so concurrent calls can both try to insert the same
id and one will hit a PRIMARY KEY error; replace the INSERT with an idempotent
insert using "INSERT ... ON CONFLICT(id) DO NOTHING" (or the SQLite equivalent)
inside the loop where conn.execute is called, then after the insert run a
post-insert lookup (e.g., call find_by_url(conn, url) and/or get_by_id(conn,
&id)) to determine whether the insert succeeded or an existing row is present
and return that row; if the existing row has a different url (hash-prefix
collision) continue the len += 2 loop as before. Ensure you remove or avoid
relying on the separate pre-insert find_by_url/read checks and handle all
outcomes (inserted, found-by-url, found-by-id-with-different-url) via these
atomic insert + post-check steps in the shorten implementation.

---

Nitpick comments:
In `@src/core/all.rs`:
- Around line 271-273: The test suite is missing a direct assertion for the new
"redirect_links" entry added to namespace_description; update the test named
namespace_description_known_namespaces to include an assertion that the returned
known namespaces list (from namespace_description or the function under test)
contains "redirect_links" and that its description string matches the new value,
ensuring the test checks both presence and exact description text for
"redirect_links".
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 09132052-d4cf-4fb6-bbea-a219b0e80fb5

📥 Commits

Reviewing files that changed from the base of the PR and between 478c92e and 8f7520a.

📒 Files selected for processing (7)
  • src/core/all.rs
  • src/openhuman/mod.rs
  • src/openhuman/redirect_links/mod.rs
  • src/openhuman/redirect_links/ops.rs
  • src/openhuman/redirect_links/schemas.rs
  • src/openhuman/redirect_links/store.rs
  • src/openhuman/redirect_links/types.rs

Comment thread src/openhuman/redirect_links/ops.rs
Comment thread src/openhuman/redirect_links/store.rs Outdated
Comment thread src/openhuman/redirect_links/store.rs Outdated
- store.rs: replace check-then-insert with atomic `ON CONFLICT DO NOTHING`
  + post-verify so concurrent `shorten()` callers for the same URL don't
  race into a PRIMARY KEY constraint error. Add a threaded regression test.
- store.rs: implement the bare-id path of `id_from_short` promised by its
  docstring (accepts both `openhuman://link/<id>` and bare hex; normalizes
  to lowercase). Add a test.
- ops.rs: drop raw URLs from RPC logs (full query strings can carry
  tracking identifiers / PII) and switch every log line to the stable
  grep-friendly prefix `[redirect_links][rpc][<fn>]`.
- core/all.rs: extend `namespace_description_known_namespaces` to assert
  the new `redirect_links` entry.
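
The store.rs race fix in the first bullet can be pictured with a std-only analogue. In the sketch below a `Mutex<HashMap>` stands in for the SQLite table, `entry(..).or_insert_with(..)` plays the role of `INSERT ... ON CONFLICT(id) DO NOTHING`, and the comparison afterwards is the post-verify step. The names `insert_or_verify` and `race` are hypothetical, and the real regression test runs against SQLite, which this does not reproduce.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Insert `url` under `id` unless a row already exists, then report whether the
/// row now stored under `id` belongs to `url`. `false` means a different URL
/// owns the id, so the caller should widen the id and retry.
pub fn insert_or_verify(store: &Mutex<HashMap<String, String>>, id: &str, url: &str) -> bool {
    let mut rows = store.lock().unwrap();
    // Insert-if-absent: an existing row is never overwritten and never errors.
    let row = rows.entry(id.to_string()).or_insert_with(|| url.to_string());
    // Post-verify: same URL means success, whoever inserted it first.
    row.as_str() == url
}

/// Run `callers` threads that all try to store the same (id, url) pair.
/// Returns (number of callers that succeeded, number of rows stored).
pub fn race(callers: usize, id: &'static str, url: &'static str) -> (usize, usize) {
    let store: Arc<Mutex<HashMap<String, String>>> = Arc::new(Mutex::new(HashMap::new()));
    let handles: Vec<_> = (0..callers)
        .map(|_| {
            let store = Arc::clone(&store);
            thread::spawn(move || insert_or_verify(&store, id, url))
        })
        .collect();
    let succeeded = handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .filter(|ok| *ok)
        .count();
    let rows = store.lock().unwrap().len();
    (succeeded, rows)
}
```

The point of the pattern is that losing the insert is not an error: every caller ends by reading the row back, so concurrent callers for the same URL all succeed and exactly one row exists.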
@senamakel senamakel merged commit b81d04d into tinyhumansai:main Apr 24, 2026
7 checks passed
AusAgentSmith pushed a commit to AusAgentSmith/openhuman that referenced this pull request May 23, 2026