
feat: add hippo-embed workflow + recurring embed step to daily-hippo-learn #28178

Merged

pelikhan merged 3 commits into main from copilot/deep-report-embed-hippo-memory-store · Apr 24, 2026

Conversation

Contributor

Copilot AI commented Apr 23, 2026

The Hippo memory store has ~490 memories, but fewer than 1% are vector-embedded, making semantic recall effectively non-functional. Root cause: hippo embed (and its required dependency @xenova/transformers) was never wired into any workflow.

Changes

  • New hippo-embed.md — a workflow_dispatch-only maintenance workflow for the one-time fix, running on aw-gpu-runner-T4:

    1. hippo audit --fix — prunes flagged junk entries before indexing
    2. npm install -g @xenova/transformers — installs the embedding backend as a CI step (before the agent runs)
    3. hippo embed — generates vectors for all memories
    4. hippo status — reports coverage delta
  • daily-hippo-learn.md — adds Step 2.5 after hippo sleep to run hippo embed on every daily cycle, keeping the vector index current as new memories accumulate:

    mcpscripts-hippo args: "embed"
    

    This is incremental — only unembedded memories are processed, so daily overhead is minimal.
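Putting the pieces above together, the frontmatter of the new hippo-embed.md might look roughly like this. This is a sketch only: the field names (on, runs-on, steps) follow what this PR and the quoted review diffs describe, and the exact gh-aw frontmatter schema is an assumption.

```yaml
# Sketch of .github/workflows/hippo-embed.md frontmatter (field names are
# assumptions based on this PR's description, not a verified gh-aw schema)
on:
  workflow_dispatch:        # manual trigger only; this is a one-time maintenance run
runs-on: aw-gpu-runner-T4   # GPU runner, same as daily-news
steps:
  - name: Install @xenova/transformers
    run: |
      # embedding backend, installed as a CI step before the agent runs
      npm install -g @xenova/transformers
```

The agent prompt below the frontmatter would then carry the audit → embed → status instructions.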

- New hippo-embed.md: one-time workflow_dispatch maintenance workflow
  that audits (hippo audit --fix) and embeds (hippo embed) the Hippo
  memory store to restore semantic recall quality from < 1% coverage
- Updated daily-hippo-learn.md: adds Step 2.5 to run hippo embed after
  every sleep cycle, keeping vector index current going forward
- Compiled lock files regenerated for both workflows

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/ef7faf5b-5137-499d-adbd-ce805709fd6f

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from [WIP] Embed Hippo Memory store to restore semantic recall quality to feat: add hippo-embed workflow + recurring embed step to daily-hippo-learn on Apr 23, 2026
Copilot AI requested a review from pelikhan April 23, 2026 22:47
Collaborator

@pelikhan pelikhan left a comment


@copilot install transformers as a step. Use GPU runner (same as daily-news)

- Add runs-on: aw-gpu-runner-T4 (same runner as daily-news)
- Move @xenova/transformers install into a frontmatter steps: block
  so it runs as a CI step before the agent, not via agent instruction
- Remove the install section from the agent prompt

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/eaf7ade0-e041-429c-b704-16960fa6cbcc

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented Apr 24, 2026

@copilot install transformers as a step. Use GPU runner (same as daily-news)

Done in 9445fe0. @xenova/transformers is now installed in a steps: block so it runs as a CI step before the agent. Also added runs-on: aw-gpu-runner-T4.

@github-actions
Contributor

Hey @Copilot 👋 — great work tackling the vector-embedding gap in the Hippo memory store! Fixing semantic recall by wiring up hippo embed into both a one-time maintenance workflow and the daily cycle is a solid, well-scoped change.

One thing worth addressing before this moves out of draft:

  • No test coverage — the new hippo-embed.md workflow and the daily-hippo-learn.md change don't appear to have any corresponding test files. Even a basic smoke test or a unit test validating the workflow spec structure would help CI catch regressions.

If you'd like a hand, here's a prompt to get started:

Add unit or integration tests for the hippo embed workflow changes introduced in this PR:
1. Write a test that validates the structure/schema of `.github/workflows/hippo-embed.md` (e.g., required steps are present, the workflow_dispatch trigger is set, the correct runner is specified).
2. Write a test that validates the new Step 2.5 in `.github/workflows/daily-hippo-learn.md` is present and calls `hippo embed` correctly.
Place tests in the appropriate test directory following the existing test conventions for this repo.
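As a starting point for that prompt, a smoke test could parse the frontmatter of the workflow markdown and assert the key fields. The sketch below is hypothetical: the SAMPLE string, the parse_frontmatter helper, and the checked field values are assumptions drawn from this PR's description, not the repo's actual test conventions.

```python
# Hypothetical smoke-test sketch: validate that a gh-aw workflow markdown
# file declares the expected trigger, runner, and install step in its
# frontmatter. The frontmatter schema assumed here comes from this PR's
# description, not from a verified gh-aw spec.
import re

def parse_frontmatter(text):
    """Return the raw block between the first pair of '---' fences, or ''."""
    match = re.match(r"^---\n(.*?)\n---", text, re.DOTALL)
    return match.group(1) if match else ""

# Stand-in for the contents of .github/workflows/hippo-embed.md
SAMPLE = """---
on:
  workflow_dispatch:
runs-on: aw-gpu-runner-T4
steps:
  - name: Install @xenova/transformers
    run: |
      npm install -g @xenova/transformers
---

# Hippo Embed
"""

front = parse_frontmatter(SAMPLE)
assert "workflow_dispatch" in front                      # manual trigger is set
assert "aw-gpu-runner-T4" in front                       # correct GPU runner
assert "npm install -g @xenova/transformers" in front    # install step present
print("workflow spec checks passed")
```

A real test would read the workflow file from disk and use a proper YAML parser, but even string-level checks like these would let CI catch a dropped runner or install step.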

Generated by Contribution Check

@pelikhan pelikhan marked this pull request as ready for review April 24, 2026 02:11
Copilot AI review requested due to automatic review settings April 24, 2026 02:11
@pelikhan pelikhan merged commit 7dfb739 into main Apr 24, 2026
19 checks passed
@pelikhan pelikhan deleted the copilot/deep-report-embed-hippo-memory-store branch April 24, 2026 02:11
Contributor

Copilot AI left a comment


Pull request overview

Adds a Hippo embedding maintenance workflow and introduces a recurring embedding step to keep the Hippo memory store’s vector index up to date for semantic recall.

Changes:

  • Introduces a new Hippo Embed workflow (workflow_dispatch only) that audits/prunes low-quality entries and embeds all memories.
  • Updates the daily Hippo learning workflow instructions to run hippo embed after each hippo sleep.
  • Adds the new workflow to the agent-factory status documentation table.
Show a summary per file
File Description
docs/src/content/docs/agent-factory-status.mdx Adds the Hippo Embed workflow to the published workflow/status list.
.github/workflows/hippo-embed.md New maintenance workflow definition and agent instructions for audit + embed + status reporting.
.github/workflows/hippo-embed.lock.yml Compiled/locked workflow artifact for hippo-embed.md.
.github/workflows/daily-hippo-learn.md Adds an “embed” step after the daily “sleep” consolidation cycle.

Copilot's findings


  • Files reviewed: 4/4 changed files
  • Comments generated: 2

Comment on lines +93 to +100
## Step 2.5 — Refresh embeddings

Keep the vector index current so semantic recall stays sharp. Run after every sleep
cycle to embed any memories that were added or updated since the last embed pass:

```
mcpscripts-hippo args: "embed"
```

Copilot AI Apr 24, 2026


The new mcpscripts-hippo args: "embed" step will fail in this workflow because @xenova/transformers (required by hippo embed, per the PR description) is never installed here. Add a setup steps: entry (similar to hippo-embed.md) to install the dependency, and re-run the workflow compiler so the change is reflected in daily-hippo-learn.lock.yml (the lock file is what GitHub Actions executes).

Comment on lines +36 to +42
```
steps:
  - name: Install @xenova/transformers
    run: |
      npm install -g @xenova/transformers

imports:
  - shared/hippo-memory.md
```

Copilot AI Apr 24, 2026


@xenova/transformers is installed here, but the same dependency is also needed by the recurring embed step added to daily-hippo-learn.md. To avoid duplicating setup across workflows (and to ensure future Hippo workflows don’t forget it), consider moving this install into a shared import (e.g., extend shared/hippo-memory.md or add a new shared hippo-embeddings.md import) and reusing it in both workflows.

github-actions bot mentioned this pull request Apr 24, 2026

Successfully merging this pull request may close these issues.

[deep-report] Embed Hippo Memory store to restore semantic recall quality

3 participants