Skip to content

test: comprehensive e2e and kittest UI testing infrastructure#778

Draft
thepastaclaw wants to merge 10 commits into
dashpay:v1.0-devfrom
thepastaclaw:test/e2e-rebase
Draft

test: comprehensive e2e and kittest UI testing infrastructure#778
thepastaclaw wants to merge 10 commits into
dashpay:v1.0-devfrom
thepastaclaw:test/e2e-rebase

Conversation

@thepastaclaw
Copy link
Copy Markdown
Collaborator

Summary

Comprehensive end-to-end testing infrastructure for DET. Adds a phased e2e test suite that exercises the full user journey (wallet import → SPV sync → balance verification → platform reads → token search → identity validation → teardown) and kittest interaction tests for welcome screen behavior and navigation.

Continuation of #598, rebased onto current v1.0-dev.

E2E Test Suite (cargo test --features e2e --test e2e -- --ignored --nocapture)

  • Phase-based architecture: smoke → setup → wallet UI → wallet ops → platform reads → token search → identity validation → DPNS (skipped) → teardown
  • Feature-gated: e2e Cargo feature gates test-only visibility on internal types
  • Reusable harness: TestContext for cross-phase state, error classification with retry logic, SPV readiness gating, panic-safe cleanup
  • CI workflow: .github/workflows/e2e.yml with environment-gated mnemonic

Kittest Interaction Tests

  • Welcome screen click tests (Explore, Create Wallet, Import Wallet)
  • Navigation across all major screens
  • Resize/stability stress tests

Source Changes

  • hint_text on TextEdit widgets for accessibility/test discovery
  • #[cfg(feature = "e2e")] accessor methods on screen types (no production API changes)
  • Button label "Add Contracts" → "Fetch Contracts" to disambiguate from breadcrumb
  • Drop impl on ImportMnemonicScreen to zeroize seed words

Test Results

  • 90 tests pass, 0 failures
  • Clippy clean with --all-features --all-targets
  • E2E testnet journey runs in ~82s (when not ignored)

Skipped (pending SPV mempool support)

  • Send-to-self transaction
  • Identity creation (needs asset lock proof)
  • DPNS registration (depends on identity)

Validation

  • cargo check --all-features
  • cargo clippy --all-features --all-targets -- -D warnings
  • cargo test --all-features — 90 passed, 0 failed ✅

Co-authored-by: PastaPastaPasta pasta@dashboost.org

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 19, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f69eb950-8dfc-4a50-97e6-81089c666676

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Collaborator Author

@thepastaclaw thepastaclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

PR #778 adds a comprehensive E2E and kittest UI testing infrastructure. The code is well-structured with phase-based execution, retry logic, and emergency cleanup. However, there is one confirmed blocking bug: the E2E test clicks a 'Fetch Contracts' button that doesn't exist (the actual label is 'Add Contracts'). The remaining findings are architectural suggestions around test isolation and assertion quality.

Reviewed commit: 9b2e1c8

🔴 1 blocking | 🟡 6 suggestion(s) | 💬 1 nitpick(s)

3 additional findings

🟡 suggestion: Round-trip assertion is trivially satisfied before return payment arrives

tests/backend-e2e/send_funds.rs (lines 107-115)

Wallet A starts with 5M duffs, sends 2M to B, leaving ~3M. The wait_for_balance check at line 108 waits for A to reach send_amount (2M). Since A already has ~3M after sending, this predicate is true immediately — the test never actually verifies that B's return payment arrived. To properly verify the round trip, capture A's balance after the send and wait for it to increase.

💡 Suggested change
    // Capture A's balance after the send, then wait for it to increase
    let a_balance_after_send = {
        let wallets = app_context.wallets().read().expect("lock");
        wallets.get(&hash_a).map(|w| w.read().expect("lock").total_balance_duffs()).unwrap_or(0)
    };

    // Verify A received the return payment
    let a_balance_after_return = wait_for_balance(
        app_context,
        hash_a,
        a_balance_after_send + 1, // Any increase proves B->A arrived
        Duration::from_secs(120),
    )
    .await
    .expect("A should receive return funds from B");
🟡 suggestion: Withdrawal test only checks task result type, not actual state change

tests/backend-e2e/identity_withdraw.rs (lines 52-61)

The test verifies BackendTaskSuccessResult::WithdrewFromIdentity was returned but never checks that the identity's balance actually decreased or that the withdrawal amount matches expectations. While the task succeeding proves Platform accepted the state transition, verifying the balance delta would catch edge cases like partial withdrawals or fee miscalculations. At minimum, re-query the identity balance and assert it decreased by approximately withdraw_amount.

🟡 suggestion: Backend E2E suite commented out in CI — new test code can regress silently

.github/workflows/tests.yml (lines 83-92)

The backend E2E test execution is commented out with a TODO: "Re-enable after backend E2E tests are updated for TaskError migration". While these tests are #[ignore] and need network access, the fact that this PR adds/modifies them but they aren't exercised in CI means compilation failures or logic issues won't be caught. At minimum, consider running them with --ignored in a separate CI job (even if allowed to fail), or ensure cargo test --test backend-e2e --all-features (without --ignored) compiles successfully.

🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `tests/e2e/phases/phase_03_platform.rs`:
- [BLOCKING] lines 118-123: E2E test clicks non-existent 'Fetch Contracts' button — will panic every run
  The test searches for a button labeled `"Fetch Contracts"` (line 120), but `add_contracts_screen.rs:339` renders the button as `"Add Contracts"`. The `.expect()` on line 122 will panic with `'Fetch Contracts' button must be visible on Add Contracts screen` every time this phase runs.

In `src/ui/mod.rs`:
- [SUGGESTION] line 86: Module visibility widened from pub(crate) to pub for test access
  The `identities` module was changed from `pub(crate)` to `pub`, exposing internal UI types in the public API. The `wallets` module (line 91) correctly stays `pub(crate)` and uses `#[cfg(feature = "e2e")]` accessor methods instead. Phase 06 needs `RegisterDpnsNameSource` — this could be re-exported from `src/ui/mod.rs` (it's already referenced by `ScreenType`) and the identities module kept `pub(crate)`. The direct field access on `RegisterDpnsNameScreen` in phase_06 (lines 43-65) should use cfg-gated accessors like `AddNewIdentityScreen` does.

In `tests/backend-e2e/send_funds.rs`:
- [SUGGESTION] lines 107-115: Round-trip assertion is trivially satisfied before return payment arrives
  Wallet A starts with 5M duffs, sends 2M to B, leaving ~3M. The `wait_for_balance` check at line 108 waits for A to reach `send_amount` (2M). Since A already has ~3M after sending, this predicate is true immediately — the test never actually verifies that B's return payment arrived. To properly verify the round trip, capture A's balance after the send and wait for it to increase.

In `tests/kittest/interactions.rs`:
- [SUGGESTION] lines 16-21: Kittest tests may use production database when run without --all-features
  The kittest tests call `AppState::new()` which uses the production database unless the `testing` feature is active. Unlike the E2E tests (which have `required-features = ["e2e"]` in Cargo.toml), kittest has no `[[test]]` section enforcing `required-features = ["testing"]`. Running `cargo test --test kittest` without `--all-features` would read the real app data directory and database. CI is fine (uses `--all-features`), but developers could hit this. Consider adding a `[[test]]` section with `required-features = ["testing"]`.

In `tests/backend-e2e/identity_withdraw.rs`:
- [SUGGESTION] lines 52-61: Withdrawal test only checks task result type, not actual state change
  The test verifies `BackendTaskSuccessResult::WithdrewFromIdentity` was returned but never checks that the identity's balance actually decreased or that the withdrawal amount matches expectations. While the task succeeding proves Platform accepted the state transition, verifying the balance delta would catch edge cases like partial withdrawals or fee miscalculations. At minimum, re-query the identity balance and assert it decreased by approximately `withdraw_amount`.

In `.github/workflows/tests.yml`:
- [SUGGESTION] lines 83-92: Backend E2E suite commented out in CI — new test code can regress silently
  The backend E2E test execution is commented out with a TODO: "Re-enable after backend E2E tests are updated for TaskError migration". While these tests are `#[ignore]` and need network access, the fact that this PR adds/modifies them but they aren't exercised in CI means compilation failures or logic issues won't be caught. At minimum, consider running them with `--ignored` in a separate CI job (even if allowed to fail), or ensure `cargo test --test backend-e2e --all-features` (without `--ignored`) compiles successfully.

In `tests/e2e/phases/phase_04_tokens.rs`:
- [SUGGESTION] lines 9-62: Token search retry may accumulate text in input field
  On retry, `navigate_to_screen` resets to the root token search screen but root screens persist across navigation (stored in `main_screens` BTreeMap). The text input from the previous attempt may still contain "dash". When `type_into_text_input(harness, 0, "dash")` runs on attempt 2, it appends to the existing text, producing "dashdash". This depends on whether `refresh_on_arrival()` clears the search input on `TokenSearchScreen`. If it doesn't, the retry will search for a concatenated string.

Comment on lines +118 to +123
// Click "Fetch Contracts" submit button
harness
.query_by_label("Fetch Contracts")
.or_else(|| harness.query_by_label_contains("Fetch Contracts"))
.expect("'Fetch Contracts' button must be visible on Add Contracts screen")
.click();
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Blocking: E2E test clicks non-existent 'Fetch Contracts' button — will panic every run

The test searches for a button labeled "Fetch Contracts" (line 120), but add_contracts_screen.rs:339 renders the button as "Add Contracts". The .expect() on line 122 will panic with 'Fetch Contracts' button must be visible on Add Contracts screen every time this phase runs.

💡 Suggested change
Suggested change
// Click "Fetch Contracts" submit button
harness
.query_by_label("Fetch Contracts")
.or_else(|| harness.query_by_label_contains("Fetch Contracts"))
.expect("'Fetch Contracts' button must be visible on Add Contracts screen")
.click();
// Click "Add Contracts" submit button
harness
.query_by_label("Add Contracts")
.or_else(|| harness.query_by_label_contains("Add Contracts"))
.expect("'Add Contracts' button must be visible on Add Contracts screen")
.click();

source: ['claude-general', 'codex-general']

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `tests/e2e/phases/phase_03_platform.rs`:
- [BLOCKING] lines 118-123: E2E test clicks non-existent 'Fetch Contracts' button — will panic every run
  The test searches for a button labeled `"Fetch Contracts"` (line 120), but `add_contracts_screen.rs:339` renders the button as `"Add Contracts"`. The `.expect()` on line 122 will panic with `'Fetch Contracts' button must be visible on Add Contracts screen` every time this phase runs.

Comment thread src/ui/mod.rs
pub mod dpns;
pub mod helpers;
pub(crate) mod identities;
pub mod identities;
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: Module visibility widened from pub(crate) to pub for test access

The identities module was changed from pub(crate) to pub, exposing internal UI types in the public API. The wallets module (line 91) correctly stays pub(crate) and uses #[cfg(feature = "e2e")] accessor methods instead. Phase 06 needs RegisterDpnsNameSource — this could be re-exported from src/ui/mod.rs (it's already referenced by ScreenType) and the identities module kept pub(crate). The direct field access on RegisterDpnsNameScreen in phase_06 (lines 43-65) should use cfg-gated accessors like AddNewIdentityScreen does.

source: ['claude-general', 'claude-rust-quality']

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `src/ui/mod.rs`:
- [SUGGESTION] line 86: Module visibility widened from pub(crate) to pub for test access
  The `identities` module was changed from `pub(crate)` to `pub`, exposing internal UI types in the public API. The `wallets` module (line 91) correctly stays `pub(crate)` and uses `#[cfg(feature = "e2e")]` accessor methods instead. Phase 06 needs `RegisterDpnsNameSource` — this could be re-exported from `src/ui/mod.rs` (it's already referenced by `ScreenType`) and the identities module kept `pub(crate)`. The direct field access on `RegisterDpnsNameScreen` in phase_06 (lines 43-65) should use cfg-gated accessors like `AddNewIdentityScreen` does.

Comment on lines +16 to +21
let harness = Harness::builder().with_max_steps(200).build_eframe(|ctx| {
dash_evo_tool::app::AppState::new(ctx.egui_ctx.clone())
.expect("Failed to create AppState")
.with_animations(false)
});
(rt, harness)
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: Kittest tests may use production database when run without --all-features

The kittest tests call AppState::new() which uses the production database unless the testing feature is active. Unlike the E2E tests (which have required-features = ["e2e"] in Cargo.toml), kittest has no [[test]] section enforcing required-features = ["testing"]. Running cargo test --test kittest without --all-features would read the real app data directory and database. CI is fine (uses --all-features), but developers could hit this. Consider adding a [[test]] section with required-features = ["testing"].

source: ['codex-general', 'codex-rust-quality']

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `tests/kittest/interactions.rs`:
- [SUGGESTION] lines 16-21: Kittest tests may use production database when run without --all-features
  The kittest tests call `AppState::new()` which uses the production database unless the `testing` feature is active. Unlike the E2E tests (which have `required-features = ["e2e"]` in Cargo.toml), kittest has no `[[test]]` section enforcing `required-features = ["testing"]`. Running `cargo test --test kittest` without `--all-features` would read the real app data directory and database. CI is fine (uses `--all-features`), but developers could hit this. Consider adding a `[[test]]` section with `required-features = ["testing"]`.

Comment on lines +9 to +62
for attempt in 1..=PLATFORM_MAX_RETRIES {
// ─── 1. Navigate to token search screen (fresh each attempt) ───
navigate_to_screen(harness, RootScreenType::RootScreenTokenSearch);

let on_screen =
wait_for_label(harness, "Enter Keyword", std::time::Duration::from_secs(10));
assert!(
on_screen,
"'Enter Keyword' label must be visible on token search screen"
);

if attempt == 1 {
println!(" Navigated to token search screen");
}

type_into_text_input(harness, 0, "dash");
println!(
" Typed 'dash' in search input (attempt {}/{})",
attempt, PLATFORM_MAX_RETRIES
);

// ─── 2. Click search button and wait for results ───────────────
harness
.query_by_label("Search")
.expect("Search button must be visible on token search screen")
.click();
harness.run_steps(SETTLE_STEPS);

let completed = wait_until(
harness,
|h| {
h.query_by_label_contains("Contract ID").is_some()
|| h.query_by_label_contains("No tokens match").is_some()
|| h.query_by_label_contains("Error").is_some()
},
TOKEN_SEARCH_TIMEOUT,
POLL_STEPS,
);

if !completed {
println!(
" Token search timed out after {}s (attempt {}/{})",
TOKEN_SEARCH_TIMEOUT.as_secs(),
attempt,
PLATFORM_MAX_RETRIES
);
if attempt == PLATFORM_MAX_RETRIES {
panic!(
"Token search timed out after {} attempts ({}s each)",
PLATFORM_MAX_RETRIES,
TOKEN_SEARCH_TIMEOUT.as_secs()
);
}
continue;
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: Token search retry may accumulate text in input field

On retry, navigate_to_screen resets to the root token search screen but root screens persist across navigation (stored in main_screens BTreeMap). The text input from the previous attempt may still contain "dash". When type_into_text_input(harness, 0, "dash") runs on attempt 2, it appends to the existing text, producing "dashdash". This depends on whether refresh_on_arrival() clears the search input on TokenSearchScreen. If it doesn't, the retry will search for a concatenated string.

source: ['claude-general']

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `tests/e2e/phases/phase_04_tokens.rs`:
- [SUGGESTION] lines 9-62: Token search retry may accumulate text in input field
  On retry, `navigate_to_screen` resets to the root token search screen but root screens persist across navigation (stored in `main_screens` BTreeMap). The text input from the previous attempt may still contain "dash". When `type_into_text_input(harness, 0, "dash")` runs on attempt 2, it appends to the existing text, producing "dashdash". This depends on whether `refresh_on_arrival()` clears the search input on `TokenSearchScreen`. If it doesn't, the retry will search for a concatenated string.

Comment thread tests/frontend-e2e/helpers/harness.rs Outdated
Comment on lines +61 to +108
const ERROR_PATTERNS: &[(ErrorCategory, &[&str])] = &[
(
ErrorCategory::Network,
&[
"timeout",
"connection",
"network",
"unavailable",
"timed out",
"refused",
"unreachable",
],
),
(
ErrorCategory::Validation,
&[
"invalid",
"insufficient",
"already exists",
"not found",
"duplicate",
"too low",
"too high",
],
),
(
ErrorCategory::TransientPlatform,
&[
"consensus",
"retry",
"temporarily",
"try again",
"rate limit",
"internal error",
"transport error",
],
),
];

pub fn classify_error(error_text: &str) -> ErrorCategory {
let lower = error_text.to_lowercase();
for (category, patterns) in ERROR_PATTERNS {
if patterns.iter().any(|p| lower.contains(p)) {
return *category;
}
}
ErrorCategory::Fatal
}
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💬 Nitpick: Error classification intentionally biases toward retry — document the trade-off

The priority-ordered pattern matching (Network checked before Validation) means a validation error containing "connection" or "network" would be misclassified as retryable. The comment at lines 57-60 documents this as intentional for E2E resilience. This is a reasonable trade-off for test infrastructure, but consider adding an example to the comment (e.g., "invalid connection parameter" would classify as Network) so future maintainers understand the edge case.

source: ['claude-general', 'claude-rust-quality']

@lklimek
Copy link
Copy Markdown
Contributor

lklimek commented Mar 20, 2026

claw can we rename directory to frontend-e2e ? we already have backend-e2e, it will be consistent.

@thepastaclaw
Copy link
Copy Markdown
Collaborator Author

Done — renamed tests/e2etests/frontend-e2e and updated all references (Cargo.toml test target, CLAUDE.md, CONTRIBUTING.md, doc comments). Also rebased onto current v1.0-dev for clean history. Force-pushed.

thepastaclaw and others added 2 commits March 23, 2026 13:28
Implements a multi-phase E2E test suite for wallet import, SPV sync,
wallet UI operations, platform reads, token search, identity creation,
DPNS registration, and teardown. Also ports kittest interactions and
hardens existing test infrastructure.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Aligns the frontend E2E test directory name with the existing
tests/backend-e2e/ convention, as requested in PR review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@thepastaclaw
Copy link
Copy Markdown
Collaborator Author

thepastaclaw commented Mar 30, 2026

✅ Review complete (commit 9eda0d8)

Remove all #[cfg(feature = "e2e")] blocks from screen types per review
feedback. Wallet import in e2e tests now goes through
AppContext::register_wallet() instead of manipulating
ImportMnemonicScreen fields. AddNewIdentityScreen and
RegisterDpnsNameScreen accessors are now unconditionally public.

Remove the `e2e` feature flag from Cargo.toml since no source files
use it anymore.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Collaborator Author

@thepastaclaw thepastaclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Self-review of testing infrastructure PR. Verified 2 findings: (1) wallet reuse keyed only by alias creates a destructive collision risk — teardown deletes whatever wallet matched, even if it's a preexisting non-test wallet; (2) the identity/DPNS phase is explicitly skipped, so the PR description overstates e2e coverage.

Reviewed commit: e90cc8b

🟡 2 suggestion(s)

🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `tests/frontend-e2e/phases/phase_00_setup.rs`:
- [SUGGESTION] lines 16-110: E2E wallet reuse is keyed only by alias, so the suite can adopt and later delete an unrelated wallet
  `find_existing_e2e_wallet()` reuses the first wallet whose `alias` equals the fixed string `"E2E Test Wallet"`; it does not verify the mnemonic, seed hash, or any other test-owned marker before setting `ctx.wallet_seed_hash` and `ctx.wallet_reused` ([tests/frontend-e2e/phases/phase_00_setup.rs#L16](tests/frontend-e2e/phases/phase_00_setup.rs#L16), [tests/frontend-e2e/phases/phase_00_setup.rs#L110](tests/frontend-e2e/phases/phase_00_setup.rs#L110)). Teardown then unconditionally removes whatever seed hash is stored in the context, and the panic-path `emergency_cleanup()` does the same, with no guard for `ctx.wallet_reused` ([tests/frontend-e2e/phases/phase_07_teardown.rs#L47](tests/frontend-e2e/phases/phase_07_teardown.rs#L47), [tests/frontend-e2e/helpers/harness.rs#L414](tests/frontend-e2e/helpers/harness.rs#L414)). That makes alias collisions destructive: a preexisting non-test wallet with that alias can be hijacked for the run and then deleted from the DB.

In `tests/frontend-e2e/phases/phase_05_identity.rs`:
- [SUGGESTION] line 15: The phased E2E journey does not execute the identity/DPNS flow it claims to add coverage for
  The main E2E entrypoint runs Phase 5, but `phase_05_identity::run()` explicitly skips actual identity creation and only performs client-side validation checks, so it never sets `ctx.identity_id` ([tests/frontend-e2e/phases/phase_05_identity.rs#L15](tests/frontend-e2e/phases/phase_05_identity.rs#L15)). `tests/frontend-e2e/main.rs` then explicitly skips Phase 6 DPNS registration for that reason ([tests/frontend-e2e/main.rs#L41](tests/frontend-e2e/main.rs#L41)). So the landed phased suite currently covers platform reads and UI validation, but not the end-to-end identity creation and DPNS registration path implied by the PR description.

Comment on lines +16 to +110
/// Check if a wallet with the E2E alias already exists that we can reuse.
/// Only matches by exact alias -- never grabs unrelated wallets.
fn find_existing_e2e_wallet(harness: &Harness<'_, AppState>) -> Option<WalletSeedHash> {
let app_ctx = harness.state().current_app_context();
let wallets = app_ctx.wallets().read().unwrap();
wallets
.iter()
.find(|(_, wallet)| wallet.read().unwrap().alias.as_deref() == Some(E2E_WALLET_ALIAS))
.map(|(seed_hash, _)| *seed_hash)
}

/// Import a wallet through the AppContext/model layer (no UI screen setters).
/// Parses the mnemonic, creates a Wallet, and registers it via AppContext.
fn import_wallet_via_context(
harness: &mut Harness<'_, AppState>,
ctx: &mut TestContext,
words: &[&str],
) {
let initial_wallet_keys: BTreeSet<WalletSeedHash> = {
let app_ctx = harness.state().current_app_context();
app_ctx.wallets().read().unwrap().keys().copied().collect()
};

let phrase = words.join(" ");
let mnemonic = bip39::Mnemonic::parse_normalized(&phrase)
.unwrap_or_else(|e| panic!("Invalid mnemonic: {}", e));
let seed = mnemonic.to_seed("");

let app_ctx = harness.state().current_app_context().clone();
let wallet = Wallet::new_from_seed(
seed,
app_ctx.network(),
Some(E2E_WALLET_ALIAS.to_string()),
None,
)
.unwrap_or_else(|e| panic!("Wallet creation failed: {}", e));

println!(
" Parsed {} mnemonic words, alias '{}'",
words.len(),
E2E_WALLET_ALIAS
);

let (new_seed_hash, _wallet_arc) = app_ctx
.register_wallet(wallet)
.unwrap_or_else(|e| panic!("Wallet registration failed: {}", e));
println!(" Wallet registered via AppContext");

// Let the UI pick up the new wallet
harness.run_steps(SETTLE_STEPS);

// Verify wallet appeared
{
let wallets = app_ctx.wallets().read().unwrap();
assert!(
wallets.len() > initial_wallet_keys.len(),
"Wallet count didn't increase after registration (still {})",
initial_wallet_keys.len()
);
assert!(
wallets.contains_key(&new_seed_hash),
"Registered wallet not found in AppContext"
);
ctx.wallet_seed_hash = Some(new_seed_hash);
}
println!(
" Wallet imported. Seed hash prefix: {}",
seed_hash_prefix(ctx.seed_hash())
);
}

pub fn run(harness: &mut Harness<'_, AppState>, ctx: &mut TestContext) {
// 1. Read & validate mnemonic
let mnemonic =
std::env::var("E2E_WALLET_MNEMONIC").expect("E2E_WALLET_MNEMONIC env var required");
let words: Vec<&str> = mnemonic.split_whitespace().collect();
assert!(
[12, 15, 18, 21, 24].contains(&words.len()),
"Mnemonic has {} words, expected 12/15/18/21/24",
words.len()
);
println!(" Mnemonic: {} words", words.len());

// 2. Dismiss welcome screen
dismiss_welcome_screen(harness);
harness.run_steps(SETTLE_STEPS);

// 3. Switch to testnet and enable SPV mode
harness.state_mut().change_network(Network::Testnet);
harness.run_steps(SETTLE_STEPS);
let app_ctx = harness.state().current_app_context().clone();
app_ctx.set_core_backend_mode(CoreBackendMode::Spv);
println!(" Switched to testnet (SPV mode)");

// 4. Check if wallet already exists (idempotent re-run support)
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: E2E wallet reuse is keyed only by alias, so the suite can adopt and later delete an unrelated wallet

find_existing_e2e_wallet() reuses the first wallet whose alias equals the fixed string "E2E Test Wallet"; it does not verify the mnemonic, seed hash, or any other test-owned marker before setting ctx.wallet_seed_hash and ctx.wallet_reused (tests/frontend-e2e/phases/phase_00_setup.rs#L16, tests/frontend-e2e/phases/phase_00_setup.rs#L110). Teardown then unconditionally removes whatever seed hash is stored in the context, and the panic-path emergency_cleanup() does the same, with no guard for ctx.wallet_reused (tests/frontend-e2e/phases/phase_07_teardown.rs#L47, tests/frontend-e2e/helpers/harness.rs#L414). That makes alias collisions destructive: a preexisting non-test wallet with that alias can be hijacked for the run and then deleted from the DB.

source: codex

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `tests/frontend-e2e/phases/phase_00_setup.rs`:
- [SUGGESTION] lines 16-110: E2E wallet reuse is keyed only by alias, so the suite can adopt and later delete an unrelated wallet
  `find_existing_e2e_wallet()` reuses the first wallet whose `alias` equals the fixed string `"E2E Test Wallet"`; it does not verify the mnemonic, seed hash, or any other test-owned marker before setting `ctx.wallet_seed_hash` and `ctx.wallet_reused` ([tests/frontend-e2e/phases/phase_00_setup.rs#L16](tests/frontend-e2e/phases/phase_00_setup.rs#L16), [tests/frontend-e2e/phases/phase_00_setup.rs#L110](tests/frontend-e2e/phases/phase_00_setup.rs#L110)). Teardown then unconditionally removes whatever seed hash is stored in the context, and the panic-path `emergency_cleanup()` does the same, with no guard for `ctx.wallet_reused` ([tests/frontend-e2e/phases/phase_07_teardown.rs#L47](tests/frontend-e2e/phases/phase_07_teardown.rs#L47), [tests/frontend-e2e/helpers/harness.rs#L414](tests/frontend-e2e/helpers/harness.rs#L414)). That makes alias collisions destructive: a preexisting non-test wallet with that alias can be hijacked for the run and then deleted from the DB.

// Run validation sub-tests first — these are pure client-side (no network
// calls) and should run regardless of SPV sync state.
run_validation_tests(harness);

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: The phased E2E journey does not execute the identity/DPNS flow it claims to add coverage for

The main E2E entrypoint runs Phase 5, but phase_05_identity::run() explicitly skips actual identity creation and only performs client-side validation checks, so it never sets ctx.identity_id (tests/frontend-e2e/phases/phase_05_identity.rs#L15). tests/frontend-e2e/main.rs then explicitly skips Phase 6 DPNS registration for that reason (tests/frontend-e2e/main.rs#L41). So the landed phased suite currently covers platform reads and UI validation, but not the end-to-end identity creation and DPNS registration path implied by the PR description.

source: codex

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `tests/frontend-e2e/phases/phase_05_identity.rs`:
- [SUGGESTION] line 15: The phased E2E journey does not execute the identity/DPNS flow it claims to add coverage for
  The main E2E entrypoint runs Phase 5, but `phase_05_identity::run()` explicitly skips actual identity creation and only performs client-side validation checks, so it never sets `ctx.identity_id` ([tests/frontend-e2e/phases/phase_05_identity.rs#L15](tests/frontend-e2e/phases/phase_05_identity.rs#L15)). `tests/frontend-e2e/main.rs` then explicitly skips Phase 6 DPNS registration for that reason ([tests/frontend-e2e/main.rs#L41](tests/frontend-e2e/main.rs#L41)). So the landed phased suite currently covers platform reads and UI validation, but not the end-to-end identity creation and DPNS registration path implied by the PR description.

Copy link
Copy Markdown
Collaborator Author

@thepastaclaw thepastaclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed both findings from the self-review (commit e90cc8b):

Finding 1 — Wallet reuse keyed only by alias (fixed in 1ce396e):
find_existing_e2e_wallet() now takes expected_seed_hash: &WalletSeedHash and verifies both alias AND seed hash before reusing. The expected hash is computed from the mnemonic early in run() (SHA-256 of BIP39 seed bytes via ClosedKeyItem::compute_seed_hash). A wallet with a matching alias but different seed — i.e., a preexisting non-test wallet — will no longer be adopted and subsequently deleted by teardown.

Finding 2 — Phase 5 scope overstated (addressed in 1ce396e):
Added a doc comment to phase_05_identity::run() explicitly stating that the phase runs client-side UI validation sub-tests only and does not perform actual identity creation. Updated the Phase 5 banner in main.rs to match: "Phase 5: Identity Screen Validation (creation skipped — pending SPV mempool)".

Branch: thepastaclaw/dash-evo-tool@tracker-922 (1 commit on top of e90cc8b)

@thepastaclaw
Copy link
Copy Markdown
Collaborator Author

Re-verified this for tracker #1157: the PR branch already has tests/e2e renamed to tests/frontend-e2e, Cargo now uses the frontend-e2e test target, and the docs/references point at tests/frontend-e2e/. The earlier requested consistency with tests/backend-e2e is still satisfied.

@thepastaclaw
Copy link
Copy Markdown
Collaborator Author

Rechecked again for tracker #1216: the requested rename is already present on the PR branch. tests/frontend-e2e/ and tests/backend-e2e/ both exist, Cargo uses the frontend-e2e test target at tests/frontend-e2e/main.rs, and I found no remaining tests/e2e, --test e2e, or name = "e2e" references in the checked docs/config/test tree. No branch changes needed.

@thepastaclaw
Copy link
Copy Markdown
Collaborator Author

Done — the UI harness files are now under tests/frontend-e2e/, matching the existing backend-e2e naming pattern. Thanks for catching that.

Copy link
Copy Markdown
Collaborator Author

@thepastaclaw thepastaclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Two confirmed blocking bugs (Phase 03 queries a non-existent 'Fetch Contracts' button, and the alias-based wallet reuse path can hijack and delete a developer's unrelated wallet) plus a blocking test-isolation gap (test targets lack required-features = ["testing"], so kittest can open the production DB and frontend-e2e fails to compile without --all-features). Several valid suggestions around encapsulation, retry classification, and CI coverage remain. Dropped a handful of nitpicks and one out-of-scope security finding (mnemonic Drop) that was introduced in a different PR.

Reviewed commit: e90cc8b

🔴 3 blocking | 🟡 4 suggestion(s) | 💬 2 nitpick(s)

1 additional finding

💬 nitpick: `identities` module visibility widened from `pub(crate)` to `pub` for tests

src/ui/mod.rs (line 86)

pub(crate) mod identities; was changed to pub mod identities; in this PR (commit 58451c0) so that tests/frontend-e2e/phases/phase_05_identity.rs and phase_06_dpns.rs can import dash_evo_tool::ui::identities::add_new_identity_screen::*. Other UI submodules (contracts_documents, dpns, tokens) are already pub, so this brings identities in line — but it now publishes the entire identities UI tree as part of the library's stable API. Prefer gating the wider visibility behind the testing feature (or re-exporting just the test-needed types via a narrow facade) so the production library surface stays minimal.

💡 Suggested change
#[cfg(any(test, feature = "testing"))]
pub mod identities;
#[cfg(not(any(test, feature = "testing")))]
pub(crate) mod identities;
🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `tests/frontend-e2e/phases/phase_03_platform.rs`:
- [BLOCKING] lines 119-123: Phase 03 queries non-existent 'Fetch Contracts' button
  `run_contract_fetch` looks up a button labelled `Fetch Contracts`, but in `src/ui/contracts_documents/add_contracts_screen.rs` (lines 317, 329, 339 at this SHA) the breadcrumb, heading, and primary button are all labelled `Add Contracts`. The `.expect(...)` call has no fallback path, so every E2E run will panic here. Either rename the button in source (the disambiguation reasoning still applies — breadcrumb and primary button currently share the same label) or change the test to query `Add Contracts` and disambiguate by index/role the way Phase 5/6 do for breadcrumbs.

In `tests/frontend-e2e/phases/phase_00_setup.rs`:
- [BLOCKING] lines 16-24: Frontend E2E reruns can hijack and delete an unrelated wallet by alias
  `find_existing_e2e_wallet()` matches any wallet whose alias is exactly `"E2E Test Wallet"` and treats it as reusable test state (line 23). That alias is human-editable, not a unique identifier — a developer can already have a real wallet aliased that way. The reuse path then calls `open_no_password()` on it (line 134), and `phase_07_teardown.rs:47-52` unconditionally calls `app_ctx.remove_wallet(seed_hash)` regardless of `ctx.wallet_reused`, which destroys the developer's wallet entry. Match on a deterministic fingerprint derived from `E2E_WALLET_MNEMONIC` (e.g. the seed hash itself) instead of an alias, and skip wallet removal in teardown when `ctx.wallet_reused` is set to make the path symmetrical with the documented "reuse-across-runs" intent.

In `Cargo.toml`:
- [BLOCKING] lines 118-121: Test targets are not gated behind the `testing` feature, breaking isolation and compilation
  `AppState::new` switches to an in-memory database only when `feature = "testing"` is enabled (`src/app.rs:200-217`). The `frontend-e2e` test target was declared without `required-features = ["testing"]` (commit e90cc8be removed the previous `e2e` gate), and the auto-discovered `kittest` suite has no manifest entry at all. Two consequences: (1) `cargo test --test kittest` (without `--features testing`) calls the production constructor and opens the real `~/Library/Application Support/Dash-Evo-Tool/data.db`, breaking the isolation invariant the constructor split was added to enforce. (2) `cargo test --test frontend-e2e` (without `--features testing` / `--all-features`) fails to compile because `phase_07_teardown.rs:39` and `helpers/harness.rs:407` use `AppContext::db()` / `wallets()` accessors gated behind `cfg(any(test, feature = "testing"))` (`src/context/mod.rs:780`). CI uses `--all-features` so it is unaffected, but local invocations are dangerous.

In `src/ui/identities/add_new_identity_screen/mod.rs`:
- [SUGGESTION] lines 109-129: Test-only accessors are unconditionally `pub` on production builds
  Commit e90cc8be removed the `#[cfg(feature = "e2e")]` gate on `AddNewIdentityScreen::step()`, `funding_method()`, `set_funding_amount()`, and `set_alias_input()`, leaving them as unconditionally `pub` fns whose only consumer is `tests/frontend-e2e/phases/phase_05_identity.rs`. `step()` and `funding_method()` clone and return `Arc<RwLock<...>>` handles to internal state machines, so any external caller can mutate the screen's lifecycle and break its invariants — a direct violation of CLAUDE.md's component pattern ("Private fields only", "Anti-patterns: public mutable fields"). The same issue applies to `RegisterDpnsNameScreen::set_name_input` at `src/ui/identities/register_dpns_name_screen.rs:67-72`. Gate these behind `cfg(any(test, feature = "testing"))` to match the existing pattern used for `AppContext::db()` / `wallets()` at `src/context/mod.rs:780`.

In `tests/frontend-e2e/main.rs`:
- [SUGGESTION] lines 10-12: Frontend E2E suite is `#[ignore]` and never executed by CI
  The only `#[test]` in `frontend-e2e` is marked `#[ignore]`, so `cargo test --all-features --workspace` in `.github/workflows/tests.yml:72-76` skips it. There is also no dedicated workflow step running `cargo test --test frontend-e2e -- --ignored` — only commented-out legacy backend-E2E commands at lines 85/90. The PR adds a substantial harness whose phase logic, AccessKit selectors, and cleanup flow will silently regress because nothing exercises them in CI. Add a CI job (with the testnet mnemonic provided as a secret) that runs the suite via `--ignored`, even if only on a schedule.

In `tests/frontend-e2e/helpers/harness.rs`:
- [SUGGESTION] lines 61-108: Error classification priority misclassifies retryable platform errors as fatal
  `classify_error` walks `ERROR_PATTERNS` in order and returns on the first matching substring. Network is checked first (correctly biasing toward retry), but Validation is then checked before TransientPlatform. Errors that contain *both* tokens — e.g. `"consensus invalid"`, `"transport error: invalid response"`, `"internal error: not found"` — match Validation's `"invalid"` / `"not found"` first and are classified `Fatal`, even though the more-specific TransientPlatform token (`consensus`, `transport error`, `internal error`) makes them retryable. Reorder so `TransientPlatform` precedes `Validation`. (Longer-term, surface a typed error category from the screen so tests don't sniff label text — CLAUDE.md prohibits string parsing for error classification — but at minimum fix the priority order.)

In `tests/frontend-e2e/phases/phase_04_tokens.rs`:
- [SUGGESTION] lines 11-35: Token search input is not cleared between retries — query accumulates
  `navigate_to_screen()` clears the screen stack but the `TokensScreen` instance for `RootScreenTokenSearch` persists, and `TokensScreen::refresh_on_arrival()` (`src/ui/tokens/tokens_screen/mod.rs:2790`) does not reset `token_search_query`. `type_into_text_input()` then clicks and calls `Node::type_text()`, which fires an `egui::Event::Text` at the current cursor — appending rather than replacing. On retry the input becomes `"dashdash"` (then `"dashdashdash"`), changing the query semantics on every retry. The retry path can then panic on line 84 (`"empty results means the query mechanism or token contract is broken"`) for a reason unrelated to the original failure. Either reset `token_search_query` in `refresh_on_arrival` or clear the field programmatically in the helper before typing.

Comment on lines +119 to +123
harness
.query_by_label("Fetch Contracts")
.or_else(|| harness.query_by_label_contains("Fetch Contracts"))
.expect("'Fetch Contracts' button must be visible on Add Contracts screen")
.click();
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Blocking: Phase 03 queries non-existent 'Fetch Contracts' button

run_contract_fetch looks up a button labelled Fetch Contracts, but in src/ui/contracts_documents/add_contracts_screen.rs (lines 317, 329, 339 at this SHA) the breadcrumb, heading, and primary button are all labelled Add Contracts. The .expect(...) call has no fallback path, so every E2E run will panic here. Either rename the button in source (the disambiguation reasoning still applies — breadcrumb and primary button currently share the same label) or change the test to query Add Contracts and disambiguate by index/role the way Phase 5/6 do for breadcrumbs.

source: ['claude']

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `tests/frontend-e2e/phases/phase_03_platform.rs`:
- [BLOCKING] lines 119-123: Phase 03 queries non-existent 'Fetch Contracts' button
  `run_contract_fetch` looks up a button labelled `Fetch Contracts`, but in `src/ui/contracts_documents/add_contracts_screen.rs` (lines 317, 329, 339 at this SHA) the breadcrumb, heading, and primary button are all labelled `Add Contracts`. The `.expect(...)` call has no fallback path, so every E2E run will panic here. Either rename the button in source (the disambiguation reasoning still applies — breadcrumb and primary button currently share the same label) or change the test to query `Add Contracts` and disambiguate by index/role the way Phase 5/6 do for breadcrumbs.

Comment on lines +16 to +24
/// Check if a wallet with the E2E alias already exists that we can reuse.
/// Only matches by exact alias -- never grabs unrelated wallets.
fn find_existing_e2e_wallet(harness: &Harness<'_, AppState>) -> Option<WalletSeedHash> {
let app_ctx = harness.state().current_app_context();
let wallets = app_ctx.wallets().read().unwrap();
wallets
.iter()
.find(|(_, wallet)| wallet.read().unwrap().alias.as_deref() == Some(E2E_WALLET_ALIAS))
.map(|(seed_hash, _)| *seed_hash)
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Blocking: Frontend E2E reruns can hijack and delete an unrelated wallet by alias

find_existing_e2e_wallet() matches any wallet whose alias is exactly "E2E Test Wallet" and treats it as reusable test state (line 23). That alias is human-editable, not a unique identifier — a developer can already have a real wallet aliased that way. The reuse path then calls open_no_password() on it (line 134), and phase_07_teardown.rs:47-52 unconditionally calls app_ctx.remove_wallet(seed_hash) regardless of ctx.wallet_reused, which destroys the developer's wallet entry. Match on a deterministic fingerprint derived from E2E_WALLET_MNEMONIC (e.g. the seed hash itself) instead of an alias, and skip wallet removal in teardown when ctx.wallet_reused is set to make the path symmetrical with the documented "reuse-across-runs" intent.

source: ['codex', 'claude']

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `tests/frontend-e2e/phases/phase_00_setup.rs`:
- [BLOCKING] lines 16-24: Frontend E2E reruns can hijack and delete an unrelated wallet by alias
  `find_existing_e2e_wallet()` matches any wallet whose alias is exactly `"E2E Test Wallet"` and treats it as reusable test state (line 23). That alias is human-editable, not a unique identifier — a developer can already have a real wallet aliased that way. The reuse path then calls `open_no_password()` on it (line 134), and `phase_07_teardown.rs:47-52` unconditionally calls `app_ctx.remove_wallet(seed_hash)` regardless of `ctx.wallet_reused`, which destroys the developer's wallet entry. Match on a deterministic fingerprint derived from `E2E_WALLET_MNEMONIC` (e.g. the seed hash itself) instead of an alias, and skip wallet removal in teardown when `ctx.wallet_reused` is set to make the path symmetrical with the documented "reuse-across-runs" intent.

Comment thread Cargo.toml
Comment on lines +118 to +121
[[test]]
name = "frontend-e2e"
path = "tests/frontend-e2e/main.rs"

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Blocking: Test targets are not gated behind the testing feature, breaking isolation and compilation

AppState::new switches to an in-memory database only when feature = "testing" is enabled (src/app.rs:200-217). The frontend-e2e test target was declared without required-features = ["testing"] (commit e90cc8b removed the previous e2e gate), and the auto-discovered kittest suite has no manifest entry at all. Two consequences: (1) cargo test --test kittest (without --features testing) calls the production constructor and opens the real ~/Library/Application Support/Dash-Evo-Tool/data.db, breaking the isolation invariant the constructor split was added to enforce. (2) cargo test --test frontend-e2e (without --features testing / --all-features) fails to compile because phase_07_teardown.rs:39 and helpers/harness.rs:407 use AppContext::db() / wallets() accessors gated behind cfg(any(test, feature = "testing")) (src/context/mod.rs:780). CI uses --all-features so it is unaffected, but local invocations are dangerous.

💡 Suggested change
Suggested change
[[test]]
name = "frontend-e2e"
path = "tests/frontend-e2e/main.rs"
[[test]]
name = "frontend-e2e"
path = "tests/frontend-e2e/main.rs"
required-features = ["testing"]
[[test]]
name = "kittest"
path = "tests/kittest/main.rs"
required-features = ["testing"]

source: ['codex', 'claude']

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `Cargo.toml`:
- [BLOCKING] lines 118-121: Test targets are not gated behind the `testing` feature, breaking isolation and compilation
  `AppState::new` switches to an in-memory database only when `feature = "testing"` is enabled (`src/app.rs:200-217`). The `frontend-e2e` test target was declared without `required-features = ["testing"]` (commit e90cc8be removed the previous `e2e` gate), and the auto-discovered `kittest` suite has no manifest entry at all. Two consequences: (1) `cargo test --test kittest` (without `--features testing`) calls the production constructor and opens the real `~/Library/Application Support/Dash-Evo-Tool/data.db`, breaking the isolation invariant the constructor split was added to enforce. (2) `cargo test --test frontend-e2e` (without `--features testing` / `--all-features`) fails to compile because `phase_07_teardown.rs:39` and `helpers/harness.rs:407` use `AppContext::db()` / `wallets()` accessors gated behind `cfg(any(test, feature = "testing"))` (`src/context/mod.rs:780`). CI uses `--all-features` so it is unaffected, but local invocations are dangerous.

Comment on lines +109 to +129
impl AddNewIdentityScreen {
/// Returns a reference to the step state.
pub fn step(&self) -> Arc<RwLock<WalletFundedScreenStep>> {
self.step.clone()
}

/// Returns a reference to the funding method state.
pub fn funding_method(&self) -> Arc<RwLock<FundingMethod>> {
self.funding_method.clone()
}

/// Sets the funding amount.
pub fn set_funding_amount(&mut self, amount: Option<Amount>) {
self.funding_amount = amount;
}

/// Sets the alias input.
pub fn set_alias_input(&mut self, alias: String) {
self.alias_input = alias;
}
}
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: Test-only accessors are unconditionally pub on production builds

Commit e90cc8b removed the #[cfg(feature = "e2e")] gate on AddNewIdentityScreen::step(), funding_method(), set_funding_amount(), and set_alias_input(), leaving them as unconditionally pub fns whose only consumer is tests/frontend-e2e/phases/phase_05_identity.rs. step() and funding_method() clone and return Arc<RwLock<...>> handles to internal state machines, so any external caller can mutate the screen's lifecycle and break its invariants — a direct violation of CLAUDE.md's component pattern ("Private fields only", "Anti-patterns: public mutable fields"). The same issue applies to RegisterDpnsNameScreen::set_name_input at src/ui/identities/register_dpns_name_screen.rs:67-72. Gate these behind cfg(any(test, feature = "testing")) to match the existing pattern used for AppContext::db() / wallets() at src/context/mod.rs:780.

💡 Suggested change
Suggested change
impl AddNewIdentityScreen {
/// Returns a reference to the step state.
pub fn step(&self) -> Arc<RwLock<WalletFundedScreenStep>> {
self.step.clone()
}
/// Returns a reference to the funding method state.
pub fn funding_method(&self) -> Arc<RwLock<FundingMethod>> {
self.funding_method.clone()
}
/// Sets the funding amount.
pub fn set_funding_amount(&mut self, amount: Option<Amount>) {
self.funding_amount = amount;
}
/// Sets the alias input.
pub fn set_alias_input(&mut self, alias: String) {
self.alias_input = alias;
}
}
#[cfg(any(test, feature = "testing"))]
impl AddNewIdentityScreen {
/// Returns a reference to the step state.
pub fn step(&self) -> Arc<RwLock<WalletFundedScreenStep>> {
self.step.clone()
}
/// Returns a reference to the funding method state.
pub fn funding_method(&self) -> Arc<RwLock<FundingMethod>> {
self.funding_method.clone()
}
/// Sets the funding amount.
pub fn set_funding_amount(&mut self, amount: Option<Amount>) {
self.funding_amount = amount;
}
/// Sets the alias input.
pub fn set_alias_input(&mut self, alias: String) {
self.alias_input = alias;
}
}

source: ['claude']

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `src/ui/identities/add_new_identity_screen/mod.rs`:
- [SUGGESTION] lines 109-129: Test-only accessors are unconditionally `pub` on production builds
  Commit e90cc8be removed the `#[cfg(feature = "e2e")]` gate on `AddNewIdentityScreen::step()`, `funding_method()`, `set_funding_amount()`, and `set_alias_input()`, leaving them as unconditionally `pub` fns whose only consumer is `tests/frontend-e2e/phases/phase_05_identity.rs`. `step()` and `funding_method()` clone and return `Arc<RwLock<...>>` handles to internal state machines, so any external caller can mutate the screen's lifecycle and break its invariants — a direct violation of CLAUDE.md's component pattern ("Private fields only", "Anti-patterns: public mutable fields"). The same issue applies to `RegisterDpnsNameScreen::set_name_input` at `src/ui/identities/register_dpns_name_screen.rs:67-72`. Gate these behind `cfg(any(test, feature = "testing"))` to match the existing pattern used for `AppContext::db()` / `wallets()` at `src/context/mod.rs:780`.

Comment on lines +10 to +12
#[test]
#[ignore]
fn e2e_testnet_journey() {
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: Frontend E2E suite is #[ignore] and never executed by CI

The only #[test] in frontend-e2e is marked #[ignore], so cargo test --all-features --workspace in .github/workflows/tests.yml:72-76 skips it. There is also no dedicated workflow step running cargo test --test frontend-e2e -- --ignored — only commented-out legacy backend-E2E commands at lines 85/90. The PR adds a substantial harness whose phase logic, AccessKit selectors, and cleanup flow will silently regress because nothing exercises them in CI. Add a CI job (with the testnet mnemonic provided as a secret) that runs the suite via --ignored, even if only on a schedule.

source: ['codex']

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `tests/frontend-e2e/main.rs`:
- [SUGGESTION] lines 10-12: Frontend E2E suite is `#[ignore]` and never executed by CI
  The only `#[test]` in `frontend-e2e` is marked `#[ignore]`, so `cargo test --all-features --workspace` in `.github/workflows/tests.yml:72-76` skips it. There is also no dedicated workflow step running `cargo test --test frontend-e2e -- --ignored` — only commented-out legacy backend-E2E commands at lines 85/90. The PR adds a substantial harness whose phase logic, AccessKit selectors, and cleanup flow will silently regress because nothing exercises them in CI. Add a CI job (with the testnet mnemonic provided as a secret) that runs the suite via `--ignored`, even if only on a schedule.

Comment thread tests/frontend-e2e/helpers/harness.rs Outdated
Comment on lines +61 to +108
const ERROR_PATTERNS: &[(ErrorCategory, &[&str])] = &[
(
ErrorCategory::Network,
&[
"timeout",
"connection",
"network",
"unavailable",
"timed out",
"refused",
"unreachable",
],
),
(
ErrorCategory::Validation,
&[
"invalid",
"insufficient",
"already exists",
"not found",
"duplicate",
"too low",
"too high",
],
),
(
ErrorCategory::TransientPlatform,
&[
"consensus",
"retry",
"temporarily",
"try again",
"rate limit",
"internal error",
"transport error",
],
),
];

pub fn classify_error(error_text: &str) -> ErrorCategory {
let lower = error_text.to_lowercase();
for (category, patterns) in ERROR_PATTERNS {
if patterns.iter().any(|p| lower.contains(p)) {
return *category;
}
}
ErrorCategory::Fatal
}
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: Error classification priority misclassifies retryable platform errors as fatal

classify_error walks ERROR_PATTERNS in order and returns on the first matching substring. Network is checked first (correctly biasing toward retry), but Validation is then checked before TransientPlatform. Errors that contain both tokens — e.g. "consensus invalid", "transport error: invalid response", "internal error: not found" — match Validation's "invalid" / "not found" first and are classified Fatal, even though the more-specific TransientPlatform token (consensus, transport error, internal error) makes them retryable. Reorder so TransientPlatform precedes Validation. (Longer-term, surface a typed error category from the screen so tests don't sniff label text — CLAUDE.md prohibits string parsing for error classification — but at minimum fix the priority order.)

💡 Suggested change
Suggested change
const ERROR_PATTERNS: &[(ErrorCategory, &[&str])] = &[
(
ErrorCategory::Network,
&[
"timeout",
"connection",
"network",
"unavailable",
"timed out",
"refused",
"unreachable",
],
),
(
ErrorCategory::Validation,
&[
"invalid",
"insufficient",
"already exists",
"not found",
"duplicate",
"too low",
"too high",
],
),
(
ErrorCategory::TransientPlatform,
&[
"consensus",
"retry",
"temporarily",
"try again",
"rate limit",
"internal error",
"transport error",
],
),
];
pub fn classify_error(error_text: &str) -> ErrorCategory {
let lower = error_text.to_lowercase();
for (category, patterns) in ERROR_PATTERNS {
if patterns.iter().any(|p| lower.contains(p)) {
return *category;
}
}
ErrorCategory::Fatal
}
const ERROR_PATTERNS: &[(ErrorCategory, &[&str])] = &[
(
ErrorCategory::Network,
&[
"timeout",
"connection",
"network",
"unavailable",
"timed out",
"refused",
"unreachable",
],
),
(
ErrorCategory::TransientPlatform,
&[
"consensus",
"retry",
"temporarily",
"try again",
"rate limit",
"internal error",
"transport error",
],
),
(
ErrorCategory::Validation,
&[
"invalid",
"insufficient",
"already exists",
"not found",
"duplicate",
"too low",
"too high",
],
),
];

source: ['claude']

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `tests/frontend-e2e/helpers/harness.rs`:
- [SUGGESTION] lines 61-108: Error classification priority misclassifies retryable platform errors as fatal
  `classify_error` walks `ERROR_PATTERNS` in order and returns on the first matching substring. Network is checked first (correctly biasing toward retry), but Validation is then checked before TransientPlatform. Errors that contain *both* tokens — e.g. `"consensus invalid"`, `"transport error: invalid response"`, `"internal error: not found"` — match Validation's `"invalid"` / `"not found"` first and are classified `Fatal`, even though the more-specific TransientPlatform token (`consensus`, `transport error`, `internal error`) makes them retryable. Reorder so `TransientPlatform` precedes `Validation`. (Longer-term, surface a typed error category from the screen so tests don't sniff label text — CLAUDE.md prohibits string parsing for error classification — but at minimum fix the priority order.)

Comment on lines +11 to +35
navigate_to_screen(harness, RootScreenType::RootScreenTokenSearch);

let on_screen =
wait_for_label(harness, "Enter Keyword", std::time::Duration::from_secs(10));
assert!(
on_screen,
"'Enter Keyword' label must be visible on token search screen"
);

if attempt == 1 {
println!(" Navigated to token search screen");
}

type_into_text_input(harness, 0, "dash");
println!(
" Typed 'dash' in search input (attempt {}/{})",
attempt, PLATFORM_MAX_RETRIES
);

// ─── 2. Click search button and wait for results ───────────────
harness
.query_by_label("Search")
.expect("Search button must be visible on token search screen")
.click();
harness.run_steps(SETTLE_STEPS);
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: Token search input is not cleared between retries — query accumulates

navigate_to_screen() clears the screen stack but the TokensScreen instance for RootScreenTokenSearch persists, and TokensScreen::refresh_on_arrival() (src/ui/tokens/tokens_screen/mod.rs:2790) does not reset token_search_query. type_into_text_input() then clicks and calls Node::type_text(), which fires an egui::Event::Text at the current cursor — appending rather than replacing. On retry the input becomes "dashdash" (then "dashdashdash"), changing the query semantics on every retry. The retry path can then panic on line 84 ("empty results means the query mechanism or token contract is broken") for a reason unrelated to the original failure. Either reset token_search_query in refresh_on_arrival or clear the field programmatically in the helper before typing.

source: ['claude']

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `tests/frontend-e2e/phases/phase_04_tokens.rs`:
- [SUGGESTION] lines 11-35: Token search input is not cleared between retries — query accumulates
  `navigate_to_screen()` clears the screen stack but the `TokensScreen` instance for `RootScreenTokenSearch` persists, and `TokensScreen::refresh_on_arrival()` (`src/ui/tokens/tokens_screen/mod.rs:2790`) does not reset `token_search_query`. `type_into_text_input()` then clicks and calls `Node::type_text()`, which fires an `egui::Event::Text` at the current cursor — appending rather than replacing. On retry the input becomes `"dashdash"` (then `"dashdashdash"`), changing the query semantics on every retry. The retry path can then panic on line 84 (`"empty results means the query mechanism or token contract is broken"`) for a reason unrelated to the original failure. Either reset `token_search_query` in `refresh_on_arrival` or clear the field programmatically in the helper before typing.

Comment thread tests/frontend-e2e/helpers/harness.rs Outdated
Comment on lines +60 to +108
/// rather than Validation (fatal). This biases toward retry for E2E resilience.
const ERROR_PATTERNS: &[(ErrorCategory, &[&str])] = &[
(
ErrorCategory::Network,
&[
"timeout",
"connection",
"network",
"unavailable",
"timed out",
"refused",
"unreachable",
],
),
(
ErrorCategory::Validation,
&[
"invalid",
"insufficient",
"already exists",
"not found",
"duplicate",
"too low",
"too high",
],
),
(
ErrorCategory::TransientPlatform,
&[
"consensus",
"retry",
"temporarily",
"try again",
"rate limit",
"internal error",
"transport error",
],
),
];

pub fn classify_error(error_text: &str) -> ErrorCategory {
let lower = error_text.to_lowercase();
for (category, patterns) in ERROR_PATTERNS {
if patterns.iter().any(|p| lower.contains(p)) {
return *category;
}
}
ErrorCategory::Fatal
}
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💬 Nitpick: Error classification depends on substring matching of user-facing strings

ERROR_PATTERNS matches lowercased UI labels against fragile substrings like "timeout", "invalid", "consensus". CLAUDE.md states for production code: "Never parse error strings to extract information." The harness's retry decisions and pass/fail behavior depend on this classification, which silently breaks if a TaskError Display string is reworded. Route classification through the actual TaskError variant — e.g. snapshot the last TaskError set on the global banner via a test-only accessor, or expose the structured error type from the screen's display_task_result(). A similar concern applies to the broad query_by_label_contains("Error") / "Dismiss" predicates used as state checks in phase_03_platform.rs:46-54, phase_04_tokens.rs:42, and phase_06_dpns.rs:99 — they can resolve completion off ambient/stale UI banners.

source: ['claude', 'codex']

@thepastaclaw
Copy link
Copy Markdown
Collaborator Author

Addressed the surviving automated review follow-ups in 874c8a6:\n\n- fixed the Phase 03 Add Contracts selector\n- made E2E wallet reuse seed-hash based and preserved reused wallets in teardown\n- gated frontend-e2e/kittest targets behind the testing feature\n- gated test-only UI accessors behind testing/test cfg\n- added a secret-gated CI step for ignored frontend E2E tests\n- reordered transient platform error classification before validation\n- cleared token search before retry typing\n\nValidation run locally:\n- cargo fmt --all\n- git diff --check\n- cargo check --features testing --test frontend-e2e --test kittest\n- cargo check

Copy link
Copy Markdown
Collaborator Author

@thepastaclaw thepastaclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Three real follow-up issues remain after the prior round was largely resolved: phase 03's nth(1) lookup for the 'Add Contracts' submit button likely lands on the ui.heading("Add Contracts") Label instead of the button (kittest's By::matches matches Label-role nodes by value(), and the standalone heading is not filtered via labelled_by); emergency_cleanup still wipes a reused wallet on panic, contradicting the new wallet_reused invariant honored by the normal teardown; and pub mod identities was widened solely to satisfy tests. The substring-based error classifier is a soft concern carried over.

Reviewed commit: 874c8a6

🔴 1 blocking | 🟡 2 suggestion(s) | 💬 1 nitpick(s)

🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `tests/frontend-e2e/phases/phase_03_platform.rs`:
- [BLOCKING] lines 118-123: `nth(1)` on `query_all_by_label("Add Contracts")` targets the heading Label, not the submit button
  `AddContractsScreen` produces three widgets named "Add Contracts": the breadcrumb button at `add_contracts_screen.rs:317` (rendered unconditionally as `ui.button(...)` by `add_location_view` in `top_panel.rs:71-81`), the `ui.heading("Add Contracts")` Label at `:329`, and the primary submit Button at `:339`. kittest's `By::matches` (`kittest-0.3.0/src/filter.rs:147-153`) explicitly reads `node.value()` when the role is `Label` so heading text is matched; `should_filter_labels` only excludes nodes referenced as `labelled_by` from another node, and a standalone heading is not. In tree order that makes index 0 = breadcrumb button, index 1 = heading Label, index 2 = primary submit Button — so `.nth(1).click()` clicks the heading, never invokes `add_contracts_clicked()`, and the phase will hit `wait_until` timeout on the final retry. Constrain the query by role (e.g. via a button-role filter) or rename the primary button to a unique label and match it exactly.

In `tests/frontend-e2e/helpers/harness.rs`:
- [SUGGESTION] lines 399-420: `emergency_cleanup` deletes reused wallets on panic, breaking the cleanup symmetry
  `phase_07_teardown.rs:47-54` correctly preserves the wallet when `ctx.wallet_reused` is true, but `emergency_cleanup` (invoked from `catch_unwind` on panic) still unconditionally calls `app_ctx.remove_wallet(seed_hash)` whenever `ctx.wallet_seed_hash` is set. A transient failure in any post-setup phase therefore destroys the deterministic, pre-seeded test wallet that the seed-hash reuse path was designed to keep. Mirror the teardown branch by guarding on `ctx.wallet_reused` so success and panic paths leave the same durable state.

In `src/ui/mod.rs`:
- [SUGGESTION] line 90: `identities` module visibility widened to `pub` solely for tests
  `identities` was changed from `pub(crate)` to `pub` so the frontend E2E suite can name `dash_evo_tool::ui::identities::*` types. The screen-level test accessors are correctly gated behind `cfg(any(test, feature = "testing"))`, but the module path is unconditionally public — making the entire identities UI tree part of the crate's stable API surface. The sibling `wallets` module remains `pub(crate)` (line 95). Gate the wider visibility behind the same `testing` feature so the production surface stays narrow.

Comment on lines +118 to +123
// Click the primary submit button, skipping the breadcrumb with the same label.
harness
.query_all_by_label("Add Contracts")
.nth(1)
.expect("'Add Contracts' submit button must be visible on Add Contracts screen")
.click();
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Blocking: nth(1) on query_all_by_label("Add Contracts") targets the heading Label, not the submit button

AddContractsScreen produces three widgets named "Add Contracts": the breadcrumb button at add_contracts_screen.rs:317 (rendered unconditionally as ui.button(...) by add_location_view in top_panel.rs:71-81), the ui.heading("Add Contracts") Label at :329, and the primary submit Button at :339. kittest's By::matches (kittest-0.3.0/src/filter.rs:147-153) explicitly reads node.value() when the role is Label so heading text is matched; should_filter_labels only excludes nodes referenced as labelled_by from another node, and a standalone heading is not. In tree order that makes index 0 = breadcrumb button, index 1 = heading Label, index 2 = primary submit Button — so .nth(1).click() clicks the heading, never invokes add_contracts_clicked(), and the phase will hit wait_until timeout on the final retry. Constrain the query by role (e.g. via a button-role filter) or rename the primary button to a unique label and match it exactly.

source: ['claude']

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `tests/frontend-e2e/phases/phase_03_platform.rs`:
- [BLOCKING] lines 118-123: `nth(1)` on `query_all_by_label("Add Contracts")` targets the heading Label, not the submit button
  `AddContractsScreen` produces three widgets named "Add Contracts": the breadcrumb button at `add_contracts_screen.rs:317` (rendered unconditionally as `ui.button(...)` by `add_location_view` in `top_panel.rs:71-81`), the `ui.heading("Add Contracts")` Label at `:329`, and the primary submit Button at `:339`. kittest's `By::matches` (`kittest-0.3.0/src/filter.rs:147-153`) explicitly reads `node.value()` when the role is `Label` so heading text is matched; `should_filter_labels` only excludes nodes referenced as `labelled_by` from another node, and a standalone heading is not. In tree order that makes index 0 = breadcrumb button, index 1 = heading Label, index 2 = primary submit Button — so `.nth(1).click()` clicks the heading, never invokes `add_contracts_clicked()`, and the phase will hit `wait_until` timeout on the final retry. Constrain the query by role (e.g. via a button-role filter) or rename the primary button to a unique label and match it exactly.

Comment on lines +399 to +420
pub fn emergency_cleanup(harness: &Harness<'_, AppState>, ctx: &TestContext) {
let app_ctx = harness.state().current_app_context();
app_ctx.spv_manager().stop();
eprintln!(" Emergency: SPV stop requested");

if let Some(identity_id) = &ctx.identity_id {
match app_ctx
.db()
.delete_local_qualified_identity(identity_id, app_ctx)
{
Ok(()) => eprintln!(" Emergency: identity removed"),
Err(e) => eprintln!(" Emergency: identity removal failed: {}", e),
}
}

if let Some(seed_hash) = &ctx.wallet_seed_hash {
match app_ctx.remove_wallet(seed_hash) {
Ok(()) => eprintln!(" Emergency: wallet removed"),
Err(e) => eprintln!(" Emergency: wallet removal failed: {}", e),
}
}
}
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: emergency_cleanup deletes reused wallets on panic, breaking the cleanup symmetry

phase_07_teardown.rs:47-54 correctly preserves the wallet when ctx.wallet_reused is true, but emergency_cleanup (invoked from catch_unwind on panic) still unconditionally calls app_ctx.remove_wallet(seed_hash) whenever ctx.wallet_seed_hash is set. A transient failure in any post-setup phase therefore destroys the deterministic, pre-seeded test wallet that the seed-hash reuse path was designed to keep. Mirror the teardown branch by guarding on ctx.wallet_reused so success and panic paths leave the same durable state.

💡 Suggested change
Suggested change
pub fn emergency_cleanup(harness: &Harness<'_, AppState>, ctx: &TestContext) {
let app_ctx = harness.state().current_app_context();
app_ctx.spv_manager().stop();
eprintln!(" Emergency: SPV stop requested");
if let Some(identity_id) = &ctx.identity_id {
match app_ctx
.db()
.delete_local_qualified_identity(identity_id, app_ctx)
{
Ok(()) => eprintln!(" Emergency: identity removed"),
Err(e) => eprintln!(" Emergency: identity removal failed: {}", e),
}
}
if let Some(seed_hash) = &ctx.wallet_seed_hash {
match app_ctx.remove_wallet(seed_hash) {
Ok(()) => eprintln!(" Emergency: wallet removed"),
Err(e) => eprintln!(" Emergency: wallet removal failed: {}", e),
}
}
}
if ctx.wallet_reused {
eprintln!(" Emergency: reused wallet preserved");
} else if let Some(seed_hash) = &ctx.wallet_seed_hash {
match app_ctx.remove_wallet(seed_hash) {
Ok(()) => eprintln!(" Emergency: wallet removed"),
Err(e) => eprintln!(" Emergency: wallet removal failed: {}", e),
}
}

source: ['claude', 'codex']

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `tests/frontend-e2e/helpers/harness.rs`:
- [SUGGESTION] lines 399-420: `emergency_cleanup` deletes reused wallets on panic, breaking the cleanup symmetry
  `phase_07_teardown.rs:47-54` correctly preserves the wallet when `ctx.wallet_reused` is true, but `emergency_cleanup` (invoked from `catch_unwind` on panic) still unconditionally calls `app_ctx.remove_wallet(seed_hash)` whenever `ctx.wallet_seed_hash` is set. A transient failure in any post-setup phase therefore destroys the deterministic, pre-seeded test wallet that the seed-hash reuse path was designed to keep. Mirror the teardown branch by guarding on `ctx.wallet_reused` so success and panic paths leave the same durable state.

Comment thread src/ui/mod.rs
pub mod dpns;
pub mod helpers;
pub(crate) mod identities;
pub mod identities;
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: identities module visibility widened to pub solely for tests

identities was changed from pub(crate) to pub so the frontend E2E suite can name dash_evo_tool::ui::identities::* types. The screen-level test accessors are correctly gated behind cfg(any(test, feature = "testing")), but the module path is unconditionally public — making the entire identities UI tree part of the crate's stable API surface. The sibling wallets module remains pub(crate) (line 95). Gate the wider visibility behind the same testing feature so the production surface stays narrow.

💡 Suggested change
Suggested change
pub mod identities;
#[cfg(any(test, feature = "testing"))]
pub mod identities;
#[cfg(not(any(test, feature = "testing")))]
pub(crate) mod identities;

source: ['claude', 'codex']

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `src/ui/mod.rs`:
- [SUGGESTION] line 90: `identities` module visibility widened to `pub` solely for tests
  `identities` was changed from `pub(crate)` to `pub` so the frontend E2E suite can name `dash_evo_tool::ui::identities::*` types. The screen-level test accessors are correctly gated behind `cfg(any(test, feature = "testing"))`, but the module path is unconditionally public — making the entire identities UI tree part of the crate's stable API surface. The sibling `wallets` module remains `pub(crate)` (line 95). Gate the wider visibility behind the same `testing` feature so the production surface stays narrow.

Comment thread tests/frontend-e2e/helpers/harness.rs Outdated
Comment on lines +61 to +108
const ERROR_PATTERNS: &[(ErrorCategory, &[&str])] = &[
(
ErrorCategory::Network,
&[
"timeout",
"connection",
"network",
"unavailable",
"timed out",
"refused",
"unreachable",
],
),
(
ErrorCategory::TransientPlatform,
&[
"consensus",
"retry",
"temporarily",
"try again",
"rate limit",
"internal error",
"transport error",
],
),
(
ErrorCategory::Validation,
&[
"invalid",
"insufficient",
"already exists",
"not found",
"duplicate",
"too low",
"too high",
],
),
];

pub fn classify_error(error_text: &str) -> ErrorCategory {
let lower = error_text.to_lowercase();
for (category, patterns) in ERROR_PATTERNS {
if patterns.iter().any(|p| lower.contains(p)) {
return *category;
}
}
ErrorCategory::Fatal
}
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💬 Nitpick: Error classification still relies on substring matching of user-facing strings

The TransientPlatform-before-Validation reorder fixes the priority bug, but classification still substring-matches lowercased UI label text against tokens like "timeout", "invalid", "consensus". Any reword of a TaskError's Display text will silently change retry decisions and pass/fail outcomes — exactly the brittleness CLAUDE.md's "never parse error strings" rule targets in production code. Longer term, expose the last TaskError variant via a test-only accessor on the global banner and match structurally. The same caveat applies to predicates like query_by_label_contains("Error") / "Dismiss" used as state checks in phase_03_platform.rs:130-134, phase_04_tokens.rs, and phase_06_dpns.rs — they can resolve completion off ambient/stale UI banners.

source: ['claude']

Copy link
Copy Markdown
Collaborator Author

@thepastaclaw thepastaclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Verified two findings against commit 6a43502. Phase 5 sub-test C is a real bug: it asserts on a label string ("Error registering identity") that exists nowhere in source — AddNewIdentityScreen::display_message only resets the step and never renders a banner, so this sub-test will panic when the ignored frontend-e2e suite runs. The substring-based error classification in the test harness is a valid soft architectural concern flagged by both agents. Prior blockers (Add Contracts selector, emergency_cleanup, testing-gated module visibility) are resolved.

Reviewed commit: 6a43502

🔴 1 blocking | 🟡 1 suggestion(s)

🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `tests/frontend-e2e/phases/phase_05_identity.rs`:
- [BLOCKING] lines 77-95: Phase 5 sub-test C asserts on a label that no screen renders
  Sub-test C calls `screen.display_message("Simulated identity error", MessageType::Error)` and then asserts on `query_by_label_contains("Error registering identity")`. Two problems make this assertion impossible to satisfy:

1. `AddNewIdentityScreen::display_message` (`src/ui/identities/add_new_identity_screen/mod.rs:1060-1066`) ignores the message text entirely; it only resets `step` to `ReadyToCreate` and explicitly comments that error display is handled by the global `MessageBanner`. In production the banner is created by `AppState` when it handles a `TaskResult::Error` — not by calling the screen method directly.
2. The asserted text `"Error registering identity"` exists nowhere in the source tree (verified by repo-wide grep) — only in this test file. So even if a banner were created, the assertion would still fail.

The result: every time the ignored frontend-e2e suite reaches Phase 5 sub-test C it will panic. The fix is to publish a global banner via `MessageBanner::set_global(...)` and assert on its actual rendered text (e.g. `"Simulated identity error"`), instead of relying on the screen method to produce visible UI.

In `tests/frontend-e2e/helpers/harness.rs`:
- [SUGGESTION] lines 61-108: Retry classification still parses user-facing error copy
  `classify_error()` lowercases the captured banner text and substring-matches against tokens like `"timeout"`, `"invalid"`, `"consensus"`, `"internal error"`. `handle_retry_error()` then drives retry vs. fatal-panic decisions from that classification, and several phase predicates use broad label substrings (e.g. `query_by_label_contains("Error")` / `"Dismiss"`) as completion signals (`phase_03_platform.rs:46-55, 130-134`, `phase_04_tokens.rs:42-48`, `phase_06_dpns.rs:97-100`). Two consequences: (a) any harmless reword of a `TaskError` `Display` string — exactly the kind of change CLAUDE.md's error-message rules encourage — can silently flip a retryable failure into a fatal one or vice versa; (b) stale banners from prior phases can satisfy a wait predicate before the current operation has actually completed. CLAUDE.md's "never parse error strings" guidance was written for precisely this brittleness. Longer-term fix: expose the last `TaskError` variant (or a structured banner kind) via a test-only accessor on the global banner state, and have `classify_error` and phase predicates match on the typed value.

Comment on lines +77 to +95
push_screen(harness, ScreenType::AddNewIdentity);
with_identity_screen_mut(harness, |screen| {
screen.display_message("Simulated identity error", MessageType::Error);
});
harness.run_steps(POLL_STEPS);
assert!(
harness
.query_by_label_contains("Error registering identity")
.is_some(),
"Error message must be visible after display_message(Error)"
);
dismiss_if_present(harness);
harness.run_steps(SETTLE_STEPS);
assert!(
harness
.query_by_label_contains("Error registering identity")
.is_none(),
"Error message must be gone after dismiss"
);
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Blocking: Phase 5 sub-test C asserts on a label that no screen renders

Sub-test C calls screen.display_message("Simulated identity error", MessageType::Error) and then asserts on query_by_label_contains("Error registering identity"). Two problems make this assertion impossible to satisfy:

  1. AddNewIdentityScreen::display_message (src/ui/identities/add_new_identity_screen/mod.rs:1060-1066) ignores the message text entirely; it only resets step to ReadyToCreate and explicitly comments that error display is handled by the global MessageBanner. In production the banner is created by AppState when it handles a TaskResult::Error — not by calling the screen method directly.
  2. The asserted text "Error registering identity" exists nowhere in the source tree (verified by repo-wide grep) — only in this test file. So even if a banner were created, the assertion would still fail.

The result: every time the ignored frontend-e2e suite reaches Phase 5 sub-test C it will panic. The fix is to publish a global banner via MessageBanner::set_global(...) and assert on its actual rendered text (e.g. "Simulated identity error"), instead of relying on the screen method to produce visible UI.

source: ['codex']

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `tests/frontend-e2e/phases/phase_05_identity.rs`:
- [BLOCKING] lines 77-95: Phase 5 sub-test C asserts on a label that no screen renders
  Sub-test C calls `screen.display_message("Simulated identity error", MessageType::Error)` and then asserts on `query_by_label_contains("Error registering identity")`. Two problems make this assertion impossible to satisfy:

1. `AddNewIdentityScreen::display_message` (`src/ui/identities/add_new_identity_screen/mod.rs:1060-1066`) ignores the message text entirely; it only resets `step` to `ReadyToCreate` and explicitly comments that error display is handled by the global `MessageBanner`. In production the banner is created by `AppState` when it handles a `TaskResult::Error` — not by calling the screen method directly.
2. The asserted text `"Error registering identity"` exists nowhere in the source tree (verified by repo-wide grep) — only in this test file. So even if a banner were created, the assertion would still fail.

The result: every time the ignored frontend-e2e suite reaches Phase 5 sub-test C it will panic. The fix is to publish a global banner via `MessageBanner::set_global(...)` and assert on its actual rendered text (e.g. `"Simulated identity error"`), instead of relying on the screen method to produce visible UI.

Comment thread tests/frontend-e2e/helpers/harness.rs Outdated
Comment on lines +61 to +108
const ERROR_PATTERNS: &[(ErrorCategory, &[&str])] = &[
(
ErrorCategory::Network,
&[
"timeout",
"connection",
"network",
"unavailable",
"timed out",
"refused",
"unreachable",
],
),
(
ErrorCategory::TransientPlatform,
&[
"consensus",
"retry",
"temporarily",
"try again",
"rate limit",
"internal error",
"transport error",
],
),
(
ErrorCategory::Validation,
&[
"invalid",
"insufficient",
"already exists",
"not found",
"duplicate",
"too low",
"too high",
],
),
];

pub fn classify_error(error_text: &str) -> ErrorCategory {
let lower = error_text.to_lowercase();
for (category, patterns) in ERROR_PATTERNS {
if patterns.iter().any(|p| lower.contains(p)) {
return *category;
}
}
ErrorCategory::Fatal
}
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: Retry classification still parses user-facing error copy

classify_error() lowercases the captured banner text and substring-matches against tokens like "timeout", "invalid", "consensus", "internal error". handle_retry_error() then drives retry vs. fatal-panic decisions from that classification, and several phase predicates use broad label substrings (e.g. query_by_label_contains("Error") / "Dismiss") as completion signals (phase_03_platform.rs:46-55, 130-134, phase_04_tokens.rs:42-48, phase_06_dpns.rs:97-100). Two consequences: (a) any harmless reword of a TaskError Display string — exactly the kind of change CLAUDE.md's error-message rules encourage — can silently flip a retryable failure into a fatal one or vice versa; (b) stale banners from prior phases can satisfy a wait predicate before the current operation has actually completed. CLAUDE.md's "never parse error strings" guidance was written for precisely this brittleness. Longer-term fix: expose the last TaskError variant (or a structured banner kind) via a test-only accessor on the global banner state, and have classify_error and phase predicates match on the typed value.

source: ['claude', 'codex']

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `tests/frontend-e2e/helpers/harness.rs`:
- [SUGGESTION] lines 61-108: Retry classification still parses user-facing error copy
  `classify_error()` lowercases the captured banner text and substring-matches against tokens like `"timeout"`, `"invalid"`, `"consensus"`, `"internal error"`. `handle_retry_error()` then drives retry vs. fatal-panic decisions from that classification, and several phase predicates use broad label substrings (e.g. `query_by_label_contains("Error")` / `"Dismiss"`) as completion signals (`phase_03_platform.rs:46-55, 130-134`, `phase_04_tokens.rs:42-48`, `phase_06_dpns.rs:97-100`). Two consequences: (a) any harmless reword of a `TaskError` `Display` string — exactly the kind of change CLAUDE.md's error-message rules encourage — can silently flip a retryable failure into a fatal one or vice versa; (b) stale banners from prior phases can satisfy a wait predicate before the current operation has actually completed. CLAUDE.md's "never parse error strings" guidance was written for precisely this brittleness. Longer-term fix: expose the last `TaskError` variant (or a structured banner kind) via a test-only accessor on the global banner state, and have `classify_error` and phase predicates match on the typed value.

Copy link
Copy Markdown
Collaborator Author

@thepastaclaw thepastaclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Commit 9534f81 resolves all prior blockers (Phase 03 role-constrained selector, Phase 05 banner assertion, emergency_cleanup wallet_reused symmetry, identities cfg-gating, and structural TaskErrorCategory replacing substring matching). Remaining findings are non-blocking quality/test-infra improvements: the testing AppState still touches the real user data dir, the new backend-e2e target lacks a Cargo.toml gate, its README omits required serialization, and a few minor Rust-quality nits.

Reviewed commit: 9534f81

🟡 4 suggestion(s) | 💬 4 nitpick(s)

3 additional findings

🟡 suggestion: `testing` AppState constructor still mutates the real user data directory

src/app.rs (lines 217-225)

The #[cfg(feature = "testing")] constructor switched the database to in-memory, but still resolves app_user_data_dir_path(), calls ensure_data_dir_exists(), and ensure_env_file() before building the in-memory DB. On a developer machine, every kittest/frontend-E2E run therefore creates or mutates the real Dash Evo Tool app-support tree and can drop a bundled .env file there (ensure_env_file writes the file in src/app_dir.rs). Hidden filesystem side effects defeat the isolation the in-memory DB was meant to provide. Default to a temp/test-specific directory unless an env var explicitly opts in.

💡 Suggested change
        let data_dir = std::env::var_os("DASH_EVO_DATA_DIR")
            .map(PathBuf::from)
            .unwrap_or_else(|| std::env::temp_dir().join("dash-evo-tool-testing"));
        ensure_data_dir_exists(&data_dir)?;
        ensure_env_file(&data_dir);
🟡 suggestion: Documented backend E2E command omits the serialization the shared funding harness requires

tests/backend-e2e/README.md (lines 14-19)

The README tells users to run all backend E2Es with cargo test --test backend-e2e --all-features -- --ignored --nocapture, which leaves Cargo's default test parallelism enabled. That is unsafe for this harness: every test wallet is created by spending from the single shared framework wallet in the global OnceCell context (tests/backend-e2e/framework/harness.rs), and there is no mutex around run_task(...SendWalletPayment...) or the follow-up wait for the change output to settle. Two #[ignore] tests starting in parallel race on the same funding wallet and produce nondeterministic failures. tests/backend-e2e/main.rs:7 already documents the correct --test-threads=1 form — the README should match.

💡 Suggested change
# Run all backend E2E tests
cargo test --test backend-e2e --all-features -- --ignored --nocapture --test-threads=1

# Run a single test
cargo test --test backend-e2e --all-features -- --ignored --nocapture --test-threads=1 test_create_identity
🟡 suggestion: Backend E2E suite is still not executed in CI

.github/workflows/tests.yml (lines 90-99)

This PR adds a substantial live-network backend harness under tests/backend-e2e/, but the workflow wiring for it remains commented out with a TODO referencing the TaskError migration. The frontend E2E suite gained a dedicated --ignored step in this PR; the backend suite needs the same treatment once that blocker clears. Until then, regressions in the funding/cleanup flow can merge unnoticed. The TODO is already explicit about the reason, so this is a follow-up reminder rather than an immediate action.

🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `Cargo.toml`:
- [SUGGESTION] lines 118-126: `backend-e2e` test target missing `required-features = ["testing"]`
  `frontend-e2e` and `kittest` are gated behind `required-features = ["testing"]`, but there is no matching `[[test]]` entry for the new `backend-e2e` binary. That suite imports `dash_evo_tool::database::test_helpers::create_database_at_path` (tests/backend-e2e/framework/harness.rs:25), which is itself gated behind `#[cfg(any(test, feature = "testing"))]` at src/database/mod.rs:13. Without the gate, `cargo test --test backend-e2e` (no `--all-features`) compiles part way then fails on unresolved imports; Cargo would otherwise reject the unsupported invocation up front with a clear `required-features` message. Add a `[[test]]` entry for parity with the other integration suites.

In `src/app.rs`:
- [SUGGESTION] lines 217-225: `testing` AppState constructor still mutates the real user data directory
  The `#[cfg(feature = "testing")]` constructor switched the database to in-memory, but still resolves `app_user_data_dir_path()`, calls `ensure_data_dir_exists()`, and `ensure_env_file()` before building the in-memory DB. On a developer machine, every kittest/frontend-E2E run therefore creates or mutates the real Dash Evo Tool app-support tree and can drop a bundled `.env` file there (`ensure_env_file` writes the file in src/app_dir.rs). Hidden filesystem side effects defeat the isolation the in-memory DB was meant to provide. Default to a temp/test-specific directory unless an env var explicitly opts in.

In `tests/backend-e2e/README.md`:
- [SUGGESTION] lines 14-19: Documented backend E2E command omits the serialization the shared funding harness requires
  The README tells users to run all backend E2Es with `cargo test --test backend-e2e --all-features -- --ignored --nocapture`, which leaves Cargo's default test parallelism enabled. That is unsafe for this harness: every test wallet is created by spending from the single shared framework wallet in the global `OnceCell` context (tests/backend-e2e/framework/harness.rs), and there is no mutex around `run_task(...SendWalletPayment...)` or the follow-up wait for the change output to settle. Two `#[ignore]` tests starting in parallel race on the same funding wallet and produce nondeterministic failures. `tests/backend-e2e/main.rs:7` already documents the correct `--test-threads=1` form — the README should match.

In `.github/workflows/tests.yml`:
- [SUGGESTION] lines 90-99: Backend E2E suite is still not executed in CI
  This PR adds a substantial live-network backend harness under `tests/backend-e2e/`, but the workflow wiring for it remains commented out with a TODO referencing the TaskError migration. The frontend E2E suite gained a dedicated `--ignored` step in this PR; the backend suite needs the same treatment once that blocker clears. Until then, regressions in the funding/cleanup flow can merge unnoticed. The TODO is already explicit about the reason, so this is a follow-up reminder rather than an immediate action.

Comment thread Cargo.toml
Comment on lines +118 to +126
[[test]]
name = "frontend-e2e"
path = "tests/frontend-e2e/main.rs"
required-features = ["testing"]

[[test]]
name = "kittest"
path = "tests/kittest/main.rs"
required-features = ["testing"]
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: backend-e2e test target missing required-features = ["testing"]

frontend-e2e and kittest are gated behind required-features = ["testing"], but there is no matching [[test]] entry for the new backend-e2e binary. That suite imports dash_evo_tool::database::test_helpers::create_database_at_path (tests/backend-e2e/framework/harness.rs:25), which is itself gated behind #[cfg(any(test, feature = "testing"))] at src/database/mod.rs:13. Without the gate, cargo test --test backend-e2e (no --all-features) compiles part way then fails on unresolved imports; Cargo would otherwise reject the unsupported invocation up front with a clear required-features message. Add a [[test]] entry for parity with the other integration suites.

💡 Suggested change
Suggested change
[[test]]
name = "frontend-e2e"
path = "tests/frontend-e2e/main.rs"
required-features = ["testing"]
[[test]]
name = "kittest"
path = "tests/kittest/main.rs"
required-features = ["testing"]
[[test]]
name = "backend-e2e"
path = "tests/backend-e2e/main.rs"
required-features = ["testing"]

source: ['codex']

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `Cargo.toml`:
- [SUGGESTION] lines 118-126: `backend-e2e` test target missing `required-features = ["testing"]`
  `frontend-e2e` and `kittest` are gated behind `required-features = ["testing"]`, but there is no matching `[[test]]` entry for the new `backend-e2e` binary. That suite imports `dash_evo_tool::database::test_helpers::create_database_at_path` (tests/backend-e2e/framework/harness.rs:25), which is itself gated behind `#[cfg(any(test, feature = "testing"))]` at src/database/mod.rs:13. Without the gate, `cargo test --test backend-e2e` (no `--all-features`) compiles part way then fails on unresolved imports; Cargo would otherwise reject the unsupported invocation up front with a clear `required-features` message. Add a `[[test]]` entry for parity with the other integration suites.

Comment thread src/backend_task/error.rs Outdated
Comment on lines +956 to +957
| TaskError::ShieldedNullifierSyncFailed { .. } => TaskErrorCategory::Network,

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💬 Nitpick: TaskError::MustRetry arm in test_category() is unreachable

AppState::update matches TaskResult::Error(TaskError::MustRetry(msg)) before the generic TaskResult::Error(err) arm (src/app.rs:1346 vs. :1354), and only the generic arm calls err.test_category(). The TaskError::MustRetry(_) => TaskErrorCategory::TransientPlatform branch added here can never run. It is also semantically misleading: MustRetry is rendered as a success banner, not a retryable error. Either drop the variant from the match (the exhaustiveness check still holds for the variants test_category is actually invoked on) or add a brief comment that it is kept only for exhaustiveness.

source: ['claude']

Comment on lines +36 to +38
pub fn classify_error(harness: &Harness<'_, AppState>) -> Option<TaskErrorCategory> {
MessageBanner::last_global_task_error_category(harness.state().current_app_context().egui_ctx())
}
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💬 Nitpick: classify_error only sees errors flowed through TaskResult::Error

MessageBanner::last_global_task_error_category is populated only by AppState::update's TaskResult::Error(err) arm (src/app.rs:1356). Many production sites call MessageBanner::set_global(..., MessageType::Error) directly outside the task-result path (parse-failure branches, screen-side validation in identity/token/wallet screens). After such a banner appears, classify_error(harness) returns None, so wait_until predicates may time out instead of detecting the error or retrying. The current e2e hot path goes through BackendTaskTaskResult::Error, so this works today, but the gap will silently grow. Consider having MessageBanner::set_global(Error) record a default category, or routing screen-side errors through a typed sentinel.

source: ['claude']

Comment thread src/ui/mod.rs
Comment on lines +90 to 93
#[cfg(any(test, feature = "testing"))]
pub mod identities;
#[cfg(not(any(test, feature = "testing")))]
pub(crate) mod identities;
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💬 Nitpick: Paired #[cfg] declarations for identities could be simplified

The duplicated #[cfg(any(test, feature = "testing"))] pub mod identities; / #[cfg(not(...))] pub(crate) mod identities; pattern works but is more verbose than necessary, and exposes the entire identities UI tree as crate public API under testing. A single pub(crate) mod identities; with narrowly-scoped #[cfg(feature = "testing")] pub use ... re-exports for the types the test target actually consumes would be tighter.

source: ['claude']

Comment on lines +77 to +101
// ─── Sub-test C: Error message display and dismiss ──────────────────
push_screen(harness, ScreenType::AddNewIdentity);
clear_global_banners(harness);
MessageBanner::set_global(
harness.state().current_app_context().egui_ctx(),
"Simulated identity error",
MessageType::Error,
);
harness.run_steps(POLL_STEPS);
assert!(
harness
.query_by_label_contains("Simulated identity error")
.is_some(),
"Error banner must be visible after MessageBanner::set_global(Error)"
);
dismiss_if_present(harness);
harness.run_steps(SETTLE_STEPS);
assert!(
harness
.query_by_label_contains("Simulated identity error")
.is_none(),
"Error message must be gone after dismiss"
);
println!(" Validation: error message displayed and dismissed");
pop_screen(harness);
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💬 Nitpick: Sub-test C overlaps with kittest banner coverage

Sub-test C now drives MessageBanner::set_global directly and asserts on the banner text — which is already covered by tests/kittest/message_banner.rs::test_global_set_and_has. It does still exercise the harness's clear_global_banners / dismiss_if_present helpers and the dismiss flow in the e2e context, and Sub-test D right below covers the actual AddNewIdentityScreen::display_message contract, so the duplication is small. Worth a quick look to see whether Sub-test C can be folded into Sub-test D, but not required.

source: ['claude']

Copy link
Copy Markdown
Collaborator Author

@thepastaclaw thepastaclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The security finding is valid: the workflow places the funded E2E wallet mnemonic in workflow-level env, so it is available to the ordinary build/test steps before the E2E tests. The two test/architecture concerns also apply to this head: AppState::new is replaced under the additive testing feature, and the new frontend E2E setup imports the wallet through the model/context layer rather than the UI path it claims to cover.

Reviewed commit: d56b298

🔴 1 blocking | 🟡 2 suggestion(s)

🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `.github/workflows/tests.yml`:
- [BLOCKING] line 33: Do not expose the funded E2E mnemonic to the whole CI job
  `E2E_WALLET_MNEMONIC` is set at workflow scope, which makes the funded wallet seed available to every step in the job, including checkout-time/build-time code and the ordinary `cargo test --all-features --workspace` and doctest steps before the E2E tests run. Since this mnemonic controls a pre-funded wallet, keep it scoped only to the specific frontend/backend E2E and cleanup steps that need it, and perform the skip checks inside those steps rather than exposing the secret globally.

In `src/app.rs`:
- [SUGGESTION] lines 221-230: `testing` feature replaces the production AppState constructor
  When the additive `testing` feature is enabled, the production `AppState::new` is not compiled and the same public constructor name instead creates a temporary data directory and in-memory database. This means `cargo test --all-features` no longer type-checks the production constructor, and `cargo run --all-features`/debug packaging builds can silently start the GUI against a throwaway test database. Prefer keeping the production `new` compiled under all feature combinations and adding a separate `new_for_testing` or similar constructor for the harnesses.

In `tests/frontend-e2e/phases/phase_00_setup.rs`:
- [SUGGESTION] lines 35-69: Frontend E2E wallet import bypasses the UI action path
  The setup phase parses the mnemonic, builds a `Wallet`, and registers it directly through `AppContext::register_wallet()` instead of driving `ImportMnemonicScreen` and the app action/task-result flow that users exercise. As a result, regressions in the import screen's egui state, validation, password handling, routing, or result display would not be caught by this E2E journey. Either drive the import UI here, or make this a clearly scoped model-level setup helper and add separate UI coverage for the real import path.

Comment thread .github/workflows/tests.yml Outdated

env:
CARGO_TERM_COLOR: always
E2E_WALLET_MNEMONIC: ${{ secrets.E2E_WALLET_MNEMONIC }}
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Blocking: Do not expose the funded E2E mnemonic to the whole CI job

E2E_WALLET_MNEMONIC is set at workflow scope, which makes the funded wallet seed available to every step in the job, including checkout-time/build-time code and the ordinary cargo test --all-features --workspace and doctest steps before the E2E tests run. Since this mnemonic controls a pre-funded wallet, keep it scoped only to the specific frontend/backend E2E and cleanup steps that need it, and perform the skip checks inside those steps rather than exposing the secret globally.

source: ['codex-general']

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `.github/workflows/tests.yml`:
- [BLOCKING] line 33: Do not expose the funded E2E mnemonic to the whole CI job
  `E2E_WALLET_MNEMONIC` is set at workflow scope, which makes the funded wallet seed available to every step in the job, including checkout-time/build-time code and the ordinary `cargo test --all-features --workspace` and doctest steps before the E2E tests run. Since this mnemonic controls a pre-funded wallet, keep it scoped only to the specific frontend/backend E2E and cleanup steps that need it, and perform the skip checks inside those steps rather than exposing the secret globally.

Comment thread src/app.rs Outdated
Comment on lines 221 to 230
@@ -225,6 +230,30 @@ impl AppState {
Self::new_inner(ctx, db, data_dir)
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: testing feature replaces the production AppState constructor

When the additive testing feature is enabled, the production AppState::new is not compiled and the same public constructor name instead creates a temporary data directory and in-memory database. This means cargo test --all-features no longer type-checks the production constructor, and cargo run --all-features/debug packaging builds can silently start the GUI against a throwaway test database. Prefer keeping the production new compiled under all feature combinations and adding a separate new_for_testing or similar constructor for the harnesses.

source: ['claude-rust-quality']

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `src/app.rs`:
- [SUGGESTION] lines 221-230: `testing` feature replaces the production AppState constructor
  When the additive `testing` feature is enabled, the production `AppState::new` is not compiled and the same public constructor name instead creates a temporary data directory and in-memory database. This means `cargo test --all-features` no longer type-checks the production constructor, and `cargo run --all-features`/debug packaging builds can silently start the GUI against a throwaway test database. Prefer keeping the production `new` compiled under all feature combinations and adding a separate `new_for_testing` or similar constructor for the harnesses.

Comment on lines +35 to +69
/// Import a wallet through the AppContext/model layer (no UI screen setters).
/// Parses the mnemonic, creates a Wallet, and registers it via AppContext.
fn import_wallet_via_context(
harness: &mut Harness<'_, AppState>,
ctx: &mut TestContext,
words: &[&str],
) {
let initial_wallet_keys: BTreeSet<WalletSeedHash> = {
let app_ctx = harness.state().current_app_context();
app_ctx.wallets().read().unwrap().keys().copied().collect()
};

let phrase = words.join(" ");
let mnemonic = bip39::Mnemonic::parse_normalized(&phrase)
.unwrap_or_else(|e| panic!("Invalid mnemonic: {}", e));
let seed = mnemonic.to_seed("");

let app_ctx = harness.state().current_app_context().clone();
let wallet = Wallet::new_from_seed(
seed,
app_ctx.network(),
Some(E2E_WALLET_ALIAS.to_string()),
None,
)
.unwrap_or_else(|e| panic!("Wallet creation failed: {}", e));

println!(
" Parsed {} mnemonic words, alias '{}'",
words.len(),
E2E_WALLET_ALIAS
);

let (new_seed_hash, _wallet_arc) = app_ctx
.register_wallet(wallet)
.unwrap_or_else(|e| panic!("Wallet registration failed: {}", e));
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: Frontend E2E wallet import bypasses the UI action path

The setup phase parses the mnemonic, builds a Wallet, and registers it directly through AppContext::register_wallet() instead of driving ImportMnemonicScreen and the app action/task-result flow that users exercise. As a result, regressions in the import screen's egui state, validation, password handling, routing, or result display would not be caught by this E2E journey. Either drive the import UI here, or make this a clearly scoped model-level setup helper and add separate UI coverage for the real import path.

source: ['claude-general', 'claude-rust-quality']

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `tests/frontend-e2e/phases/phase_00_setup.rs`:
- [SUGGESTION] lines 35-69: Frontend E2E wallet import bypasses the UI action path
  The setup phase parses the mnemonic, builds a `Wallet`, and registers it directly through `AppContext::register_wallet()` instead of driving `ImportMnemonicScreen` and the app action/task-result flow that users exercise. As a result, regressions in the import screen's egui state, validation, password handling, routing, or result display would not be caught by this E2E journey. Either drive the import UI here, or make this a clearly scoped model-level setup helper and add separate UI coverage for the real import path.

Copy link
Copy Markdown
Collaborator Author

@thepastaclaw thepastaclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

All prior blocking findings were resolved at 344716e (CI secret scoping, production AppState constructor restored, backend-e2e gating, testing data dir isolation, README serialization note, frontend wallet-import UI coverage). Remaining items are suggestions and nitpicks: a string-parse retry in backend E2E DPNS that violates the CLAUDE.md typed-error rule, a real architectural gap where direct MessageBanner::set_global(Error) calls bypass the test-only category atomic, a dead MustRetry arm in test_category(), a dead wallet-reuse branch in phase_00 (in-memory DB defeats reuse), and minor test/Cargo hygiene nits.

Reviewed commit: 344716e

🟡 2 suggestion(s) | 💬 7 nitpick(s)

1 additional finding

🟡 suggestion: Backend E2E DPNS retry parses user-facing error text instead of matching a typed variant

tests/backend-e2e/register_dpns.rs (lines 53-65)

err_str.contains("not found") is used to decide whether to retry DPNS registration. CLAUDE.md explicitly forbids this: "Never parse error strings to extract information. Always use the typed error chain." The wording is also unstable — TaskError::IdentityNotFound renders as "Identity not found on the platform. Please check the ID or name and try again." today, but any future copy edit (including i18n) will silently change retry behaviour with no compiler error. Match on the concrete TaskError variant(s) that represent identity-propagation lag instead. Verify which variant register_dpns_name actually returns when the identity is not yet visible (likely TaskError::IdentityNotFound or an SDK error wrapped in TaskError::SdkError/PlatformFetchError) before settling on a replacement; if no existing variant matches the propagation-lag case cleanly, add a dedicated one rather than re-introducing string matching.

🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `tests/backend-e2e/register_dpns.rs`:
- [SUGGESTION] lines 53-65: Backend E2E DPNS retry parses user-facing error text instead of matching a typed variant
  `err_str.contains("not found")` is used to decide whether to retry DPNS registration. CLAUDE.md explicitly forbids this: *"Never parse error strings to extract information. Always use the typed error chain."* The wording is also unstable — `TaskError::IdentityNotFound` renders as "Identity not found on the platform. Please check the ID or name and try again." today, but any future copy edit (including i18n) will silently change retry behaviour with no compiler error. Match on the concrete `TaskError` variant(s) that represent identity-propagation lag instead. Verify which variant `register_dpns_name` actually returns when the identity is not yet visible (likely `TaskError::IdentityNotFound` or an SDK error wrapped in `TaskError::SdkError`/`PlatformFetchError`) before settling on a replacement; if no existing variant matches the propagation-lag case cleanly, add a dedicated one rather than re-introducing string matching.

In `src/ui/components/message_banner.rs`:
- [SUGGESTION] lines 327-488: Direct `set_global(..., Error)` calls remain invisible to the frontend-E2E typed classifier
  `MessageBanner::last_global_task_error_category` is written only from `AppState::update`'s generic `TaskResult::Error(err)` arm at `src/app.rs:1378-1380`. Production screens routinely raise error banners outside that flow — for example `AddContractsScreen::add_contracts_clicked` (src/ui/contracts_documents/add_contracts_screen.rs:96) and the invalid-index branch in `AddExistingIdentityScreen` (src/ui/identities/add_existing_identity_screen.rs:715) — via direct `MessageBanner::set_global(..., MessageType::Error)` calls. After such a banner appears, the E2E harness's `classify_error()` returns `None`, so `wait_until` predicates in phase_03/04/06 keep polling until timeout instead of recognizing the visible failure. The current happy path stays inside `BackendTask → TaskResult::Error`, so this works today, but the gap will silently widen as screen-side validation grows. Have `set_global(..., Error)` and `set_global_with_error()` record at least a default category (e.g. `Unknown`/`ScreenSide`) so any rendered error banner is observable to the harness.

Comment on lines 472 to +488
banners.retain_mut(|b| process_banner(ui, b) == BannerStatus::Visible);
set_banners(ui.ctx(), banners);
}

#[cfg(any(test, feature = "testing"))]
pub fn set_last_global_task_error_category(ctx: &egui::Context, category: TaskErrorCategory) {
ctx.data_mut(|d| d.insert_temp(egui::Id::new(TASK_ERROR_CATEGORY_STATE_ID), category));
}

#[cfg(any(test, feature = "testing"))]
pub fn last_global_task_error_category(ctx: &egui::Context) -> Option<TaskErrorCategory> {
ctx.data(|d| d.get_temp::<TaskErrorCategory>(egui::Id::new(TASK_ERROR_CATEGORY_STATE_ID)))
}

#[cfg(any(test, feature = "testing"))]
pub fn clear_last_global_task_error_category(ctx: &egui::Context) {
clear_last_global_task_error_category(ctx);
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: Direct set_global(..., Error) calls remain invisible to the frontend-E2E typed classifier

MessageBanner::last_global_task_error_category is written only from AppState::update's generic TaskResult::Error(err) arm at src/app.rs:1378-1380. Production screens routinely raise error banners outside that flow — for example AddContractsScreen::add_contracts_clicked (src/ui/contracts_documents/add_contracts_screen.rs:96) and the invalid-index branch in AddExistingIdentityScreen (src/ui/identities/add_existing_identity_screen.rs:715) — via direct MessageBanner::set_global(..., MessageType::Error) calls. After such a banner appears, the E2E harness's classify_error() returns None, so wait_until predicates in phase_03/04/06 keep polling until timeout instead of recognizing the visible failure. The current happy path stays inside BackendTask → TaskResult::Error, so this works today, but the gap will silently widen as screen-side validation grows. Have set_global(..., Error) and set_global_with_error() record at least a default category (e.g. Unknown/ScreenSide) so any rendered error banner is observable to the harness.

source: ['claude', 'codex']

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `src/ui/components/message_banner.rs`:
- [SUGGESTION] lines 327-488: Direct `set_global(..., Error)` calls remain invisible to the frontend-E2E typed classifier
  `MessageBanner::last_global_task_error_category` is written only from `AppState::update`'s generic `TaskResult::Error(err)` arm at `src/app.rs:1378-1380`. Production screens routinely raise error banners outside that flow — for example `AddContractsScreen::add_contracts_clicked` (src/ui/contracts_documents/add_contracts_screen.rs:96) and the invalid-index branch in `AddExistingIdentityScreen` (src/ui/identities/add_existing_identity_screen.rs:715) — via direct `MessageBanner::set_global(..., MessageType::Error)` calls. After such a banner appears, the E2E harness's `classify_error()` returns `None`, so `wait_until` predicates in phase_03/04/06 keep polling until timeout instead of recognizing the visible failure. The current happy path stays inside `BackendTask → TaskResult::Error`, so this works today, but the gap will silently widen as screen-side validation grows. Have `set_global(..., Error)` and `set_global_with_error()` record at least a default category (e.g. `Unknown`/`ScreenSide`) so any rendered error banner is observable to the harness.

Comment on lines +121 to +158
// 4. Check if wallet already exists (idempotent re-run support)
if let Some(seed_hash) = find_existing_e2e_wallet(harness, &expected_seed_hash) {
ctx.wallet_seed_hash = Some(seed_hash);
ctx.wallet_reused = true;
println!(
" Reusing existing wallet. Seed hash prefix: {}",
seed_hash_prefix(&seed_hash)
);

// Unlock the wallet so SPV can register its addresses.
// Wallets loaded from DB are in a closed state — SPV needs the
// seed bytes to build the bloom filter and discover transactions.
let app_ctx = harness.state().current_app_context().clone();
let wallet_arc = {
let wallets = app_ctx.wallets().read().unwrap();
wallets.get(&seed_hash).cloned()
};
// Drop wallets lock before calling bootstrap/unlock methods
if let Some(wallet_arc) = wallet_arc {
{
let mut w = wallet_arc.write().unwrap();
if !w.is_open() {
w.wallet_seed
.open_no_password()
.unwrap_or_else(|e| panic!("Failed to unlock reused wallet: {}", e));
println!(" Wallet unlocked (no password)");
} else {
println!(" Wallet already open");
}
}
app_ctx.bootstrap_wallet_addresses(&wallet_arc);
app_ctx.handle_wallet_unlocked(&wallet_arc);
println!(" Wallet bootstrapped for SPV");
}
} else {
// 5-10. Create the funded wallet fixture via the model layer.
import_wallet_via_context(harness, ctx, &words);
}
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💬 Nitpick: Wallet-reuse branch in phase 0 is unreachable with the current testing harness

find_existing_e2e_wallet and the entire "reuse" branch (including the bootstrap/unlock fallthrough and the ctx.wallet_reused = true flag consumed by emergency-cleanup) cannot trigger today: the harness builds AppState::new_for_testing() (tests/frontend-e2e/helpers/harness.rs:48) which uses create_test_database() (an in-memory SQLite DB at src/app.rs:222) plus a fresh per-process temp data dir (src/app.rs:228-250). Each frontend-E2E process therefore starts with zero wallets — find_existing_e2e_wallet always returns None, and wallet_reused always stays false. The teardown messaging that claims to preserve a reused wallet is also misleading. Either delete the dead code or switch the suite to a persistent test database if reuse across runs is actually wanted.

source: ['codex']

Comment thread src/backend_task/error.rs
| TaskError::ShieldedSyncFailed { .. }
| TaskError::ShieldedNullifierSyncFailed { .. } => TaskErrorCategory::Network,

TaskError::MustRetry(_)
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💬 Nitpick: TaskError::MustRetry arm in test_category() is unreachable

AppState::update matches TaskResult::Error(TaskError::MustRetry(msg)) at src/app.rs:1370 and clears the category atomic; only the generic TaskResult::Error(err) arm at src/app.rs:1378 calls err.test_category(). So the TaskError::MustRetry(_) => TaskErrorCategory::TransientPlatform branch never executes. It is also semantically misleading because MustRetry is rendered as a Success banner, not a retryable error. Either drop MustRetry(_) from the match arm or add a comment that it is kept solely for exhaustiveness.

source: ['claude', 'codex']

Comment on lines +438 to +448
harness
.query_all_by_role(egui::accesskit::Role::TextInput)
.nth(12)
.expect("Alias text input must exist after the 12 mnemonic inputs")
.click();
harness.run_steps(5);
harness
.query_all_by_role(egui::accesskit::Role::TextInput)
.nth(12)
.expect("Alias text input must still exist after clicking")
.type_text(TEST_IMPORT_ALIAS);
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💬 Nitpick: Alias TextInput selected positionally via nth(12) is fragile to screen changes

test_import_mnemonic_screen_imports_wallet_via_ui finds the alias field by indexing into the TextInput collection: .query_all_by_role(Role::TextInput).nth(12). This assumes a stable order of "12 mnemonic words then alias". Any future addition of a TextInput on the import screen (passphrase, derivation override, etc.) silently shifts the index — the test would still pass because it would write TEST_IMPORT_ALIAS into the wrong widget, then fail later in a confusing way. The 12 word inputs are inherently positional, but the alias is auxiliary and should be addressed independently. ImportMnemonicScreen already sets hint_text("Wallet Alias"), so a query_by_placeholder/query_by_label lookup would be both robust and self-documenting.

source: ['claude']

Comment on lines +35 to +95
/// Import a funded wallet fixture through the AppContext/model layer.
/// This setup path is intentionally non-UI so Phase 0 can create a reusable
/// funded wallet quickly; UI coverage for mnemonic import lives in
/// `tests/kittest/interactions.rs`.
fn import_wallet_via_context(
harness: &mut Harness<'_, AppState>,
ctx: &mut TestContext,
words: &[&str],
) {
let initial_wallet_keys: BTreeSet<WalletSeedHash> = {
let app_ctx = harness.state().current_app_context();
app_ctx.wallets().read().unwrap().keys().copied().collect()
};

let phrase = words.join(" ");
let mnemonic = bip39::Mnemonic::parse_normalized(&phrase)
.unwrap_or_else(|e| panic!("Invalid mnemonic: {}", e));
let seed = mnemonic.to_seed("");

let app_ctx = harness.state().current_app_context().clone();
let wallet = Wallet::new_from_seed(
seed,
app_ctx.network(),
Some(E2E_WALLET_ALIAS.to_string()),
None,
)
.unwrap_or_else(|e| panic!("Wallet creation failed: {}", e));

println!(
" Parsed {} mnemonic words, alias '{}'",
words.len(),
E2E_WALLET_ALIAS
);

let (new_seed_hash, _wallet_arc) = app_ctx
.register_wallet(wallet)
.unwrap_or_else(|e| panic!("Wallet registration failed: {}", e));
println!(" Wallet registered via AppContext");

// Let the UI pick up the new wallet
harness.run_steps(SETTLE_STEPS);

// Verify wallet appeared
{
let wallets = app_ctx.wallets().read().unwrap();
assert!(
wallets.len() > initial_wallet_keys.len(),
"Wallet count didn't increase after registration (still {})",
initial_wallet_keys.len()
);
assert!(
wallets.contains_key(&new_seed_hash),
"Registered wallet not found in AppContext"
);
ctx.wallet_seed_hash = Some(new_seed_hash);
}
println!(
" Wallet imported. Seed hash prefix: {}",
seed_hash_prefix(ctx.seed_hash())
);
}
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💬 Nitpick: Frontend E2E "full journey" still bypasses the wallet-import UI

Phase 0 explicitly imports the funded mnemonic by constructing a Wallet and calling AppContext::register_wallet() directly, so this suite never drives ImportMnemonicScreen, its validation, button actions, or the AppAction/task-result plumbing a real user goes through. The new kittest test_import_mnemonic_screen_imports_wallet_via_ui covers the import form in isolation, and the comment at lines 35-38 already documents this is intentional fixture-setup. That makes this a docs-framing concern rather than a coverage gap — but PR descriptions and any future docs that frame this as an end-to-end "wallet-import → SPV sync → ..." journey should be adjusted to say "funded-fixture setup → SPV sync → ..." so readers don't overcount UI coverage.

source: ['codex']

Comment thread src/app.rs
Comment on lines +228 to +250
#[cfg(feature = "testing")]
fn testing_data_dir() -> Result<PathBuf, std::io::Error> {
if std::env::var(TESTING_USE_REAL_DATA_DIR_ENV)
.map(|value| matches!(value.as_str(), "1" | "true" | "TRUE" | "yes" | "YES"))
.unwrap_or(false)
{
return app_user_data_dir_path();
}

if let Ok(dir) = std::env::var("DASH_EVO_DATA_DIR") {
return Ok(PathBuf::from(dir));
}

let unique = SystemTime::now()
.duration_since(UNIX_EPOCH)
.map_err(std::io::Error::other)?
.as_nanos();
Ok(std::env::temp_dir().join(format!(
"dash-evo-tool-testing-{}-{}",
std::process::id(),
unique
)))
}
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💬 Nitpick: testing_data_dir() leaks a fresh temp dir per harness with no cleanup

Each call to AppState::new_for_testing produces a new temp_dir().join("dash-evo-tool-testing-{pid}-{nanos}") directory and asks ensure_data_dir_exists/ensure_env_file to materialize it. There is no RAII cleanup, no TempDir guard, and no reuse — /tmp will accumulate one directory per kittest/E2E invocation locally and in CI until the OS evicts them. Using tempfile::TempDir owned by AppState (or a test-only Drop wrapper around the path) would make cleanup automatic. Not a correctness issue, but worth tightening on a path that runs frequently.

source: ['claude']

Comment thread src/ui/mod.rs
Comment on lines +90 to 93
#[cfg(any(test, feature = "testing"))]
pub mod identities;
#[cfg(not(any(test, feature = "testing")))]
pub(crate) mod identities;
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💬 Nitpick: Paired #[cfg] for identities exposes the entire UI subtree to the testing feature

#[cfg(any(test, feature = "testing"))] pub mod identities; paired with #[cfg(not(...))] pub(crate) mod identities; widens the crate-public API surface for the entire identities UI subtree whenever testing is on. Replacing it with a single pub(crate) mod identities; plus narrow #[cfg(any(test, feature = "testing"))] pub use identities::{specific items}; re-exports for only the types the test target consumes would be tighter and easier to maintain. Pure hygiene improvement, low priority.

source: ['claude']

Comment on lines +77 to +101
// ─── Sub-test C: Error message display and dismiss ──────────────────
push_screen(harness, ScreenType::AddNewIdentity);
clear_global_banners(harness);
MessageBanner::set_global(
harness.state().current_app_context().egui_ctx(),
"Simulated identity error",
MessageType::Error,
);
harness.run_steps(POLL_STEPS);
assert!(
harness
.query_by_label_contains("Simulated identity error")
.is_some(),
"Error banner must be visible after MessageBanner::set_global(Error)"
);
dismiss_if_present(harness);
harness.run_steps(SETTLE_STEPS);
assert!(
harness
.query_by_label_contains("Simulated identity error")
.is_none(),
"Error message must be gone after dismiss"
);
println!(" Validation: error message displayed and dismissed");
pop_screen(harness);
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💬 Nitpick: Phase 5 sub-test C overlaps with kittest banner coverage

Sub-test C now drives MessageBanner::set_global directly and asserts on the banner text, which is also covered by tests/kittest/message_banner.rs::test_global_set_and_has. It still exercises the harness's clear_global_banners/dismiss_if_present helpers and the dismiss flow under e2e conditions, and Sub-test D right below covers AddNewIdentityScreen::display_message resetting the step. Worth considering folding C into D to remove duplication, but not required.

source: ['claude']

Copy link
Copy Markdown
Collaborator Author

@thepastaclaw thepastaclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

All earlier blockers (CI secret scoping, separate testing AppState constructor, backend-e2e gating, typed retry classifier, screen-side error banner visibility, etc.) are resolved at 139c77d. Remaining items are non-blocking quality concerns in the new test infrastructure: a missing RUST_MIN_STACK in the new backend-E2E CI steps, an unreachable wallet-reuse path given the in-memory test DB, a temp-dir leak from testing_data_dir, a dead MustRetry arm in test_category, fragile positional alias selector in the import kittest, and overly broad pub-mod gating for ui::identities.

Reviewed commit: 139c77d

🟡 1 suggestion(s)

🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `.github/workflows/tests.yml`:
- [SUGGESTION] lines 95-113: Backend E2E CI steps omit the documented `RUST_MIN_STACK` requirement
  Both backend-E2E invocations (`Run backend E2E tests` at line 95 and `Cleanup E2E test wallets` at line 104) run `cargo test --test backend-e2e --all-features -- --ignored --nocapture --test-threads=1` without setting `RUST_MIN_STACK`. The test binary itself documents `RUST_MIN_STACK=16777216` (16 MB) as required because the Dash Platform SDK exceeds the default 8 MB stack during state-transition construction (`tests/backend-e2e/main.rs:7-12`). As soon as the secret is configured and the backend suite actually runs in CI, these steps are liable to fail with stack overflows. The cleanup step has the same problem because it executes the same test binary. Add `RUST_MIN_STACK: 16777216` to both `env:` blocks.

Comment on lines +95 to +113
- name: Run backend E2E tests
env:
E2E_WALLET_MNEMONIC: ${{ secrets.E2E_WALLET_MNEMONIC }}
run: |
if [ -z "${E2E_WALLET_MNEMONIC}" ]; then
echo "Skipping backend E2E tests: E2E_WALLET_MNEMONIC is not set"
exit 0
fi
cargo test --test backend-e2e --all-features -- --ignored --nocapture --test-threads=1
- name: Cleanup E2E test wallets
if: ${{ always() }}
env:
E2E_WALLET_MNEMONIC: ${{ secrets.E2E_WALLET_MNEMONIC }}
run: |
if [ -z "${E2E_WALLET_MNEMONIC}" ]; then
echo "Skipping E2E cleanup: E2E_WALLET_MNEMONIC is not set"
exit 0
fi
cargo test --test backend-e2e --all-features -- --ignored --nocapture --test-threads=1 cleanup_only
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: Backend E2E CI steps omit the documented RUST_MIN_STACK requirement

Both backend-E2E invocations (Run backend E2E tests at line 95 and Cleanup E2E test wallets at line 104) run cargo test --test backend-e2e --all-features -- --ignored --nocapture --test-threads=1 without setting RUST_MIN_STACK. The test binary itself documents RUST_MIN_STACK=16777216 (16 MB) as required because the Dash Platform SDK exceeds the default 8 MB stack during state-transition construction (tests/backend-e2e/main.rs:7-12). As soon as the secret is configured and the backend suite actually runs in CI, these steps are liable to fail with stack overflows. The cleanup step has the same problem because it executes the same test binary. Add RUST_MIN_STACK: 16777216 to both env: blocks.

source: ['codex']

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `.github/workflows/tests.yml`:
- [SUGGESTION] lines 95-113: Backend E2E CI steps omit the documented `RUST_MIN_STACK` requirement
  Both backend-E2E invocations (`Run backend E2E tests` at line 95 and `Cleanup E2E test wallets` at line 104) run `cargo test --test backend-e2e --all-features -- --ignored --nocapture --test-threads=1` without setting `RUST_MIN_STACK`. The test binary itself documents `RUST_MIN_STACK=16777216` (16 MB) as required because the Dash Platform SDK exceeds the default 8 MB stack during state-transition construction (`tests/backend-e2e/main.rs:7-12`). As soon as the secret is configured and the backend suite actually runs in CI, these steps are liable to fail with stack overflows. The cleanup step has the same problem because it executes the same test binary. Add `RUST_MIN_STACK: 16777216` to both `env:` blocks.

Copy link
Copy Markdown
Collaborator Author

@thepastaclaw thepastaclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The previously reported blockers are resolved on 26c34190640089a53beea9b39e0849dacb6421a7, including CI wiring, test-feature gating, typed backend-E2E retry classification, and the backend-E2E stack-size requirement. Four non-blocking issues remain in the new test infrastructure: two dead-path/resource-lifetime problems in the frontend test harness, one unreachable classifier arm, and one brittle UI selector.

Reviewed commit: 26c3419

🟡 2 suggestion(s) | 💬 2 nitpick(s)

🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `tests/frontend-e2e/phases/phase_00_setup.rs`:
- [SUGGESTION] lines 121-158: Frontend E2E wallet-reuse flow is unreachable under the current harness
  The reuse branch in phase 0 cannot execute with the default frontend-E2E setup. `create_e2e_harness()` always constructs `AppState::new_for_testing()` (`tests/frontend-e2e/helpers/harness.rs:43-50`), and that constructor always uses the in-memory `create_test_database()` path (`src/app.rs:272-283`). Each test process therefore starts with an empty wallet database, so `find_existing_e2e_wallet()` at lines 24-33 always returns `None`, `ctx.wallet_reused` never becomes `true`, and the reuse-only unlock/bootstrap/cleanup paths in this phase, `tests/frontend-e2e/helpers/harness.rs:352-356`, and `tests/frontend-e2e/phases/phase_07_teardown.rs:47-52` are dead code. Either remove the reuse path or switch this suite to a persistent test database mode when cross-run reuse is actually intended.

In `src/app.rs`:
- [SUGGESTION] lines 275-307: `new_for_testing()` creates unique temp data directories without any cleanup owner
  `new_for_testing()` allocates a fresh data directory through `testing_data_dir()`, materializes it with `ensure_data_dir_exists()` and `ensure_env_file()`, and then stores only a `PathBuf` in `AppState`. There is no `TempDir`, drop guard, or other owner responsible for removing these `dash-evo-tool-testing-*` directories after each kittest or frontend-E2E harness instance is dropped. Because this constructor is now the common test entrypoint, repeated local and CI runs will steadily accumulate orphaned directories in the system temp area. Tie the directory to an owning RAII value or reuse an explicit test directory when persistence is desired.

Comment on lines +121 to +158
// 4. Check if wallet already exists (idempotent re-run support)
if let Some(seed_hash) = find_existing_e2e_wallet(harness, &expected_seed_hash) {
ctx.wallet_seed_hash = Some(seed_hash);
ctx.wallet_reused = true;
println!(
" Reusing existing wallet. Seed hash prefix: {}",
seed_hash_prefix(&seed_hash)
);

// Unlock the wallet so SPV can register its addresses.
// Wallets loaded from DB are in a closed state — SPV needs the
// seed bytes to build the bloom filter and discover transactions.
let app_ctx = harness.state().current_app_context().clone();
let wallet_arc = {
let wallets = app_ctx.wallets().read().unwrap();
wallets.get(&seed_hash).cloned()
};
// Drop wallets lock before calling bootstrap/unlock methods
if let Some(wallet_arc) = wallet_arc {
{
let mut w = wallet_arc.write().unwrap();
if !w.is_open() {
w.wallet_seed
.open_no_password()
.unwrap_or_else(|e| panic!("Failed to unlock reused wallet: {}", e));
println!(" Wallet unlocked (no password)");
} else {
println!(" Wallet already open");
}
}
app_ctx.bootstrap_wallet_addresses(&wallet_arc);
app_ctx.handle_wallet_unlocked(&wallet_arc);
println!(" Wallet bootstrapped for SPV");
}
} else {
// 5-10. Create the funded wallet fixture via the model layer.
import_wallet_via_context(harness, ctx, &words);
}
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: Frontend E2E wallet-reuse flow is unreachable under the current harness

The reuse branch in phase 0 cannot execute with the default frontend-E2E setup. create_e2e_harness() always constructs AppState::new_for_testing() (tests/frontend-e2e/helpers/harness.rs:43-50), and that constructor always uses the in-memory create_test_database() path (src/app.rs:272-283). Each test process therefore starts with an empty wallet database, so find_existing_e2e_wallet() at lines 24-33 always returns None, ctx.wallet_reused never becomes true, and the reuse-only unlock/bootstrap/cleanup paths in this phase, tests/frontend-e2e/helpers/harness.rs:352-356, and tests/frontend-e2e/phases/phase_07_teardown.rs:47-52 are dead code. Either remove the reuse path or switch this suite to a persistent test database mode when cross-run reuse is actually intended.

source: ['claude', 'codex']

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `tests/frontend-e2e/phases/phase_00_setup.rs`:
- [SUGGESTION] lines 121-158: Frontend E2E wallet-reuse flow is unreachable under the current harness
  The reuse branch in phase 0 cannot execute with the default frontend-E2E setup. `create_e2e_harness()` always constructs `AppState::new_for_testing()` (`tests/frontend-e2e/helpers/harness.rs:43-50`), and that constructor always uses the in-memory `create_test_database()` path (`src/app.rs:272-283`). Each test process therefore starts with an empty wallet database, so `find_existing_e2e_wallet()` at lines 24-33 always returns `None`, `ctx.wallet_reused` never becomes `true`, and the reuse-only unlock/bootstrap/cleanup paths in this phase, `tests/frontend-e2e/helpers/harness.rs:352-356`, and `tests/frontend-e2e/phases/phase_07_teardown.rs:47-52` are dead code. Either remove the reuse path or switch this suite to a persistent test database mode when cross-run reuse is actually intended.

Comment thread src/app.rs Outdated
Comment on lines +275 to +307
@@ -283,6 +283,30 @@ impl AppState {
Self::new_inner(ctx, db, data_dir)
}

#[cfg(feature = "testing")]
fn testing_data_dir() -> Result<PathBuf, std::io::Error> {
if std::env::var(TESTING_USE_REAL_DATA_DIR_ENV)
.map(|value| matches!(value.as_str(), "1" | "true" | "TRUE" | "yes" | "YES"))
.unwrap_or(false)
{
return app_user_data_dir_path();
}

if let Ok(dir) = std::env::var("DASH_EVO_DATA_DIR") {
return Ok(PathBuf::from(dir));
}

let unique = SystemTime::now()
.duration_since(UNIX_EPOCH)
.map_err(std::io::Error::other)?
.as_nanos();
Ok(std::env::temp_dir().join(format!(
"dash-evo-tool-testing-{}-{}",
std::process::id(),
unique
)))
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: new_for_testing() creates unique temp data directories without any cleanup owner

new_for_testing() allocates a fresh data directory through testing_data_dir(), materializes it with ensure_data_dir_exists() and ensure_env_file(), and then stores only a PathBuf in AppState. There is no TempDir, drop guard, or other owner responsible for removing these dash-evo-tool-testing-* directories after each kittest or frontend-E2E harness instance is dropped. Because this constructor is now the common test entrypoint, repeated local and CI runs will steadily accumulate orphaned directories in the system temp area. Tie the directory to an owning RAII value or reuse an explicit test directory when persistence is desired.

source: ['claude', 'codex']

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `src/app.rs`:
- [SUGGESTION] lines 275-307: `new_for_testing()` creates unique temp data directories without any cleanup owner
  `new_for_testing()` allocates a fresh data directory through `testing_data_dir()`, materializes it with `ensure_data_dir_exists()` and `ensure_env_file()`, and then stores only a `PathBuf` in `AppState`. There is no `TempDir`, drop guard, or other owner responsible for removing these `dash-evo-tool-testing-*` directories after each kittest or frontend-E2E harness instance is dropped. Because this constructor is now the common test entrypoint, repeated local and CI runs will steadily accumulate orphaned directories in the system temp area. Tie the directory to an owning RAII value or reuse an explicit test directory when persistence is desired.

Comment thread src/backend_task/error.rs
| TaskError::ShieldedNullifierSyncFailed { .. }
| TaskError::NetworkContextCreationFailed { .. } => TaskErrorCategory::Network,

TaskError::MustRetry(_)
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💬 Nitpick: TaskError::MustRetry is unreachable in test_category()

The MustRetry branch in test_category() is never observed from the code that records task error categories. AppState::update() handles TaskResult::Error(TaskError::MustRetry(msg)) separately at src/app.rs:1147-1153, clears the stored category, and renders the message as success; only the generic TaskResult::Error(err) path at src/app.rs:1160-1162 calls err.test_category(). Keeping MustRetry in this classifier is low risk, but it misstates the runtime behavior and makes the test error taxonomy harder to reason about.

source: ['claude', 'codex']

Comment on lines +438 to +448
harness
.query_all_by_role(egui::accesskit::Role::TextInput)
.nth(12)
.expect("Alias text input must exist after the 12 mnemonic inputs")
.click();
harness.run_steps(5);
harness
.query_all_by_role(egui::accesskit::Role::TextInput)
.nth(12)
.expect("Alias text input must still exist after clicking")
.type_text(TEST_IMPORT_ALIAS);
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💬 Nitpick: The alias field lookup is brittle because it depends on TextInput order

This test identifies the wallet alias field as the 13th TextInput, assuming the screen will always remain exactly "12 mnemonic inputs, then alias". The actual UI exposes the alias with the visible label Name: in src/ui/wallets/import_mnemonic_screen.rs:628-630, so any future text input inserted before it will silently redirect TEST_IMPORT_ALIAS into the wrong widget and fail later in a confusing way. The mnemonic boxes are inherently positional, but the alias field is not; this lookup should use a stable semantic affordance instead of widget order.

source: ['claude', 'codex']

Copy link
Copy Markdown
Collaborator Author

@thepastaclaw thepastaclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Head 751021d resolves all prior blocking and suggestion-level infrastructure findings (TempDir-owned testing data dir, removal of unreachable wallet-reuse branch, RUST_MIN_STACK in CI). Three non-blocking items remain: a real but minor logic gap in the test-only task-error-category state lifetime in MessageBanner, plus two test-quality nitpicks (unreachable MustRetry arm in test_category() and positional nth(12) alias selector in the kittest import flow).

Reviewed commit: 751021d

🟡 1 suggestion(s)

🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `src/ui/components/message_banner.rs`:
- [SUGGESTION] lines 453-486: Stored task-error category outlives its banner and can leak into the next error
  `last_global_task_error_category` is only cleared by `clear_all_global()` (line 464-468). The other banner removal paths leave it untouched: `clear_global_message()` (line 453-458) just filters out matching banners, and `show_global()` (line 485) drops auto-dismissed/closed banners via `retain_mut()` with no category cleanup. Because `set_global()` calls `maybe_set_default_global_task_error_category()` (line 345), and that helper only writes `Fatal` when the category `is_none()` (line 776-779), the previously-set category survives indefinitely after its banner disappears. Concretely: a backend error tagged `Network` is shown, the banner auto-dismisses, then a screen-side `MessageBanner::set_global(..., MessageType::Error)` runs — the frontend-E2E classifier observes `Network` instead of the intended `Fatal` default. The fix is symmetric: clear `last_global_task_error_category` whenever the last banner is removed (in `clear_global_message` when the result is empty, and in `show_global` after `retain_mut` empties the list), so the side-channel category state tracks banner lifetime.

Comment on lines 485 to 486
banners.retain_mut(|b| process_banner(ui, b) == BannerStatus::Visible);
set_banners(ui.ctx(), banners);
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: Stored task-error category outlives its banner and can leak into the next error

last_global_task_error_category is only cleared by clear_all_global() (line 464-468). The other banner removal paths leave it untouched: clear_global_message() (line 453-458) just filters out matching banners, and show_global() (line 485) drops auto-dismissed/closed banners via retain_mut() with no category cleanup. Because set_global() calls maybe_set_default_global_task_error_category() (line 345), and that helper only writes Fatal when the category is_none() (line 776-779), the previously-set category survives indefinitely after its banner disappears. Concretely: a backend error tagged Network is shown, the banner auto-dismisses, then a screen-side MessageBanner::set_global(..., MessageType::Error) runs — the frontend-E2E classifier observes Network instead of the intended Fatal default. The fix is symmetric: clear last_global_task_error_category whenever the last banner is removed (in clear_global_message when the result is empty, and in show_global after retain_mut empties the list), so the side-channel category state tracks banner lifetime.

source: ['codex']

🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `src/ui/components/message_banner.rs`:
- [SUGGESTION] lines 453-486: Stored task-error category outlives its banner and can leak into the next error
  `last_global_task_error_category` is only cleared by `clear_all_global()` (line 464-468). The other banner removal paths leave it untouched: `clear_global_message()` (line 453-458) just filters out matching banners, and `show_global()` (line 485) drops auto-dismissed/closed banners via `retain_mut()` with no category cleanup. Because `set_global()` calls `maybe_set_default_global_task_error_category()` (line 345), and that helper only writes `Fatal` when the category `is_none()` (line 776-779), the previously-set category survives indefinitely after its banner disappears. Concretely: a backend error tagged `Network` is shown, the banner auto-dismisses, then a screen-side `MessageBanner::set_global(..., MessageType::Error)` runs — the frontend-E2E classifier observes `Network` instead of the intended `Fatal` default. The fix is symmetric: clear `last_global_task_error_category` whenever the last banner is removed (in `clear_global_message` when the result is empty, and in `show_global` after `retain_mut` empties the list), so the side-channel category state tracks banner lifetime.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants