
feat(sil): Automated improvements from self-improvement loop #155

Open

glassBead-tc wants to merge 2 commits into main from improvement/sil-20260308-030809

Conversation

@glassBead-tc
Member

Run summary:

  • Total iterations: 4
  • Successful: 3
  • Failed: 0

Generated by Self-Improvement Loop workflow.

@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

@augmentcode

augmentcode bot commented Mar 9, 2026

🤖 Augment PR Summary

Summary: This PR applies automated changes from the Self‑Improvement Loop, focusing on hardening filesystem persistence, reducing promise-queue memory pressure, and improving type safety for operation catalogs.

Changes:

  • Added a run artifact (improvement-results.json) capturing SIL discoveries, plans, and evaluation results.
  • Introduced JSON Schema typing helpers (src/types/json-schema.ts) and used them to replace any in hub/gateway OperationDefinition schemas/examples.
  • Hardened filesystem storage path handling by validating/sanitizing session IDs, branch IDs, partition paths, and thought filenames before building file paths.
  • Updated session loading and integrity validation to validate thought filenames and sanitize branch IDs when traversing branch directories.
  • Adjusted in-memory Observatory session store queueing to periodically reset the internal update queue to mitigate chained-promise memory growth.
  • Adjusted ThoughtHandler to use typed response content (ThoughtResponseContent) and changed guide emission to embed resources (instead of resource_link) for new-branch guidance.
  • Updated thought processing serialization logic to include periodic queue reset for long-running workloads.
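The path-hardening items above could be sketched as a small guard along these lines; the helper name and exact checks here are illustrative assumptions, not the PR's actual code:

```typescript
import { basename } from "node:path";

// Hypothetical guard: reject any session/branch ID containing path
// separators, "..", or an empty name before it is used to build a file path.
function assertSafeId(id: string): string {
  const hasSeparator = /[\\/]/.test(id); // check both separators, regardless of platform
  if (id === "" || hasSeparator || id.includes("..") || id !== basename(id)) {
    throw new Error(`Unsafe identifier: ${JSON.stringify(id)}`);
  }
  return id;
}

// assertSafeId("session-42") returns the ID unchanged,
// while assertSafeId("../etc/passwd") throws.
```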

Technical Notes: Main risk areas are cross-platform path handling (especially Windows path semantics) and correctness of the new queue-reset logic under concurrent requests.

🤖 Was this summary useful? React with 👍 or 👎


@augmentcode augmentcode bot left a comment


Review completed. 2 suggestions posted.


Comment augment review to trigger a new review at any time.

}

// Check for valid UUID format or safe alphanumeric pattern only
const validIdPattern = /^[a-zA-Z0-9-_]+$/;


validIdPattern places an unescaped - inside the character class ([a-zA-Z0-9-_]); depending on its position, a mid-class hyphen in a JS regex can form a range and unintentionally admit characters such as :, @, and [. That can undermine the path-traversal defense (e.g., drive-letter style paths on Windows); the same concern applies to validPartitionPattern.
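As a minimal illustration of the pitfall described above: whether a mid-class hyphen is literal or a range separator depends on its neighbors, so placing it first or last (or escaping it) removes the ambiguity. A sketch:

```typescript
// A hyphen between two literal atoms forms a code-point range:
const risky = /^[9-_]+$/; // range U+0039–U+005F, which includes ':', '@', '['
console.log(risky.test(":@[")); // true — unintended characters accepted

// Placing the hyphen last (or first, or escaping it) makes it literal:
const safe = /^[a-zA-Z0-9_-]+$/;
console.log(safe.test(":@["));           // false
console.log(safe.test("session-id_42")); // true
```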

Severity: high

Other Locations
  • src/persistence/filesystem-storage.ts:87


🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

actionResult: validatedInput.actionResult,
beliefs: validatedInput.beliefs,
assumptionChange: validatedInput.assumptionChange,
contextData: validatedInput.contextData,
};


Resetting this.processingQueue after await this.processingQueue can overwrite a newer queue created by a concurrent processThought() call, which would allow operations to run concurrently and reintroduce race conditions. The same “reset the queue reference later” pattern in the reasoning channel’s updateQueue looks like it can also break serialization.

Severity: high

Other Locations
  • src/observatory/channels/reasoning.ts:57



@greptile-apps
Contributor

greptile-apps bot commented Mar 9, 2026

Greptile Summary

Critical Issues Found

This PR applies three automated self-improvement iterations to the codebase with generally sound goals (security hardening, memory pressure relief, type safety), but introduces regressions that break core functionality:

1. Critical: Loops Catalog Emptied (src/resources/loops-content.ts)

The entire LOOPS_CATALOG (7135+ lines of content) has been deleted and replaced with an empty object. This was not documented in any of the 3 SIL iterations in improvement-results.json. The file header states "DO NOT EDIT MANUALLY - Generated by scripts/embed-loops.ts", confirming this was unintended. This breaks the loops resource feature used by agents to explore reasoning patterns — getCategories() returns [], and getLoop() calls will throw errors.

2. High: Queue Serialization Race Condition (src/observatory/channels/reasoning.ts)

The async .finally() reset fires asynchronously after the queue resolves, overwriting this.updateQueue with Promise.resolve() even though new operations have already been chained to the queue. This breaks the serialization guarantee and allows concurrent operations to run in parallel when they should be serialized.

3. High: Queue Reset Race Condition (src/thought-handler.ts lines 449-486)

The await-based queue reset allows concurrent callers to create a race: while one call is suspended in the reset, a second call chains to the old queue and updates processingQueue. When the first call resumes, it overwrites processingQueue = Promise.resolve(), breaking the serialization of the second call's operation.

4. Medium: Guidance Content Degraded (src/thought-handler.ts lines 913-929)

The parallel verification guide was downgraded from a resource_link (full guide fetched by agent) to an inline resource with a 2-line stub. The ResourceLinkContent type exists but was excluded from the ThoughtResponseContent union, forcing this degradation. Adding ResourceLinkContent to the union would preserve full guidance while maintaining type safety.

Positive Findings

  • Iteration 1 (path traversal security): Solid fixes with proper validation logic
  • Iteration 3 (type safety): Good JSON Schema types replacing unsafe any casts in critical files

Assessment

The SIL evaluation metrics (97-100% test pass rate) did not catch these regressions because: (a) no tests cover the empty-catalog case, (b) the 9 pre-existing test failures in iteration 2 may mask symptoms of queue race conditions.

Confidence Score: 1/5

  • Not safe to merge — contains an undocumented critical regression that empties the entire loops catalog and two queue race conditions that break concurrent request serialization.
  • This PR introduces four issues: (1) Data loss regression: all 7135+ lines of LOOPS_CATALOG content deleted without documentation, breaking the loops resource feature for all agents; the deletion is not mentioned in any SIL iteration despite appearing in improvement-results.json. (2) Serialization race in reasoning.ts: the async .finally() reset overwrites the queue reference after new operations have already chained, allowing operations to run concurrently when they should be serialized. (3) Serialization race in thought-handler.ts: the await-based reset lets a concurrent caller's queue reference be overwritten when the resetting call resumes. Both races directly undermine the serialization guarantee the async/await implementation was meant to provide. (4) Content regression: the parallel verification guide was downgraded from a full resource_link to a 2-line stub. The positive findings (path traversal security fixes, type safety improvements) are sound and well-tested, but cannot outweigh the regressions that break core functionality.
  • src/resources/loops-content.ts (critical — loops catalog must be restored), src/observatory/channels/reasoning.ts (queue serialization race must be fixed), src/thought-handler.ts (queue serialization race must be fixed; resource_link guidance should be restored)

Comments Outside Diff (3)

  1. src/resources/loops-content.ts, line 31-32 (link)

    Critical: LOOPS_CATALOG completely emptied — unintended data loss

    The entire LOOPS_CATALOG object, containing 7135+ lines of loop definitions across all categories (authoring, exploration, orchestration, etc.), has been removed and replaced with an empty object {}. This deletion was not listed as an experiment or code change in any of the 3 successful SIL iterations documented in improvement-results.json, strongly indicating this is an unintended side-effect of the automated process.

    This breaks core functionality: server-factory.ts imports and uses getCategories(), getLoopsInCategory(), and getLoop() to serve loop content to agents. With an empty catalog:

    • getCategories() returns []
    • Any call to getLoop(category, name) will throw Category not found: ...
    • The loop resource endpoint in server-factory.ts (line ~1592) will always return errors

    The file header explicitly states: DO NOT EDIT MANUALLY - Generated by scripts/embed-loops.ts. The content should be restored by re-running the embed script, or the original content from main should be preserved.

  2. src/thought-handler.ts, line 449-486 (link)

    Queue reset with await introduces concurrent-call race condition

    Because processThought is async, when it hits the reset path at line 462 and executes await this.processingQueue, it yields execution. A concurrent call arriving during that yield will see the un-reset this.processingQueue and correctly chain to it. However, when the first call resumes after the await, it unconditionally overwrites this.processingQueue = Promise.resolve() at line 466 — stomping any reference the concurrent call had already stored:

    // Call A (counter=100): await this.processingQueue → suspends
    // Call B arrives: chains opB to old queue, sets this.processingQueue = Q
    // Call A resumes: this.processingQueue = Promise.resolve()  ← overwrites Q!
    // Call A: chains opA to Promise.resolve(), opA starts immediately
    // opA and opB now run concurrently — serialization broken
    

    The original pre-PR implementation was entirely synchronous and therefore safe. If memory pressure from long chains is a concern, a safer fix would be to periodically resolve the chain only when it is already settled (e.g., checking a flag set inside the .then() completion handler), without introducing any await in the queuing hot path.

  3. src/thought-handler.ts, line 913-929 (link)

    resource_link → resource degrades guidance content quality

    The parallel verification guide was previously emitted as a resource_link, which directs the agent to fetch the full guide via its URI. It has been changed to an inline resource embedding only a two-line stub:

    text: "# Parallel Verification Guide\n\nWorkflow for exploring multiple hypotheses through branching"
    

    This means agents creating new branches now receive a minimal placeholder instead of the complete workflow guide. The ResourceLinkContent interface is already defined and exported (line 67-73) but is not included in the ThoughtResponseContent union (line 75). A cleaner fix would be to include it:

    export type ThoughtResponseContent = TextContent | ResourceContent | ResourceLinkContent;

    This preserves the full guidance while maintaining type safety for the union.


Last reviewed commit: 54fec26

Comment on lines 51 to +58
private queueUpdate<T>(operation: () => T | Promise<T>): Promise<T> {
const queued = this.updateQueue.then(() => operation());
this.updateQueue = queued.then(() => {}, () => {}); // Continue queue even on error
// Increment counter and reset queue periodically to prevent memory pressure
this.queueResetCounter++;
if (this.queueResetCounter >= this.QUEUE_RESET_INTERVAL) {
this.queueResetCounter = 0;
// Wait for current queue to complete, then reset
this.updateQueue.then(() => {}, () => {}).finally(() => {
this.updateQueue = Promise.resolve();


Async .finally() reset creates queue serialization race condition

The .finally() callback used to reset this.updateQueue fires asynchronously — after the current queue promise resolves. By that time, the code immediately after it has already chained the new operation and updated the queue:

// Time T (synchronous):
this.updateQueue.then(…).finally(() => {
  this.updateQueue = Promise.resolve();  // fires LATER at T+δ
});
const queued = this.updateQueue.then(…);  // chains to current queue NOW
this.updateQueue = queued.then(…);  // sets to Q

// Time T+δ (async, when old queue resolves):
// .finally fires and overwrites this.updateQueue = Promise.resolve()
// This DISCARDS the reference to Q!

Any operation arriving between T and T+δ is correctly chained to Q. But at T+δ, the .finally() overwrites this.updateQueue with Promise.resolve(). Subsequent operations then chain to Promise.resolve() instead of Q — running concurrently with operations still executing from Q. This breaks the serialization guarantee.

A safer approach: replace the .finally() reset with synchronous queue-chaining that only resets when already settled:

private queueUpdate<T>(operation: () => T | Promise<T>): Promise<T> {
  const queued = this.updateQueue.then(() => operation());
  this.updateQueue = queued.then(() => {}, () => {});  // Continue queue even on error
  return queued;
}

The 9 test failures noted in iteration-2's evaluation ("branch retrieval session setup issues") may be a symptom of this broken ordering.
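A settled-flag variant of the "reset only when already settled" idea mentioned above could be sketched as follows; the class and member names are illustrative assumptions, not code from the PR:

```typescript
// Sketch: serialize operations through a promise chain, and drop the chain
// only when a synchronous counter shows nothing is queued or in flight.
// No `await` appears in the queuing hot path.
class SerializedQueue {
  private queue: Promise<void> = Promise.resolve();
  private pending = 0; // operations chained but not yet settled

  enqueue<T>(operation: () => T | Promise<T>): Promise<T> {
    this.pending++;
    const result = this.queue.then(() => operation());
    // Continue the chain whether the operation resolves or rejects.
    this.queue = result.then(
      () => { this.onSettled(); },
      () => { this.onSettled(); }
    );
    return result;
  }

  private onSettled(): void {
    this.pending--;
    if (this.pending === 0) {
      // Nothing is chained to the old queue, so dropping the reference
      // cannot race an in-flight operation.
      this.queue = Promise.resolve();
    }
  }
}
```

Because the reset fires only when the pending count is zero, later callers chain onto a fresh resolved promise exactly when the queue is idle, relieving memory pressure without breaking serialization.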

