Skip to content

Conversation

@sanity
Copy link
Collaborator

@sanity sanity commented Dec 10, 2025

Problem

The contract module had several testability and reliability issues:

  1. Production crash risk: Three todo!() calls in mod.rs (lines 66-67, 128-129, 191-193) would panic on fatal executor errors instead of handling them gracefully. This was a bug waiting to happen in production.

  2. Untestable init logic: Contract initialization state tracking was embedded inline in upsert_contract_state() across ~80 lines, making it impossible to unit test race conditions between PUT and UPDATE operations.

  3. Missing test infrastructure: No convenient way to create test fixtures for contract types (ContractKey, WrappedState, Parameters), which made writing new tests tedious.

This Solution

1. Replace todo!() with proper error handling

Added FatalExecutorError variant to ContractError enum. Fatal errors now propagate properly instead of crashing:

if err.is_fatal() {
    tracing::error!(%key, %err, "Fatal executor error during get query");
    return Err(ContractError::FatalExecutorError { key, error: err });
}

2. Extract ContractInitTracker state machine

Created new module executor/init_tracker.rs (~250 lines) that encapsulates all initialization tracking:

pub(crate) struct ContractInitTracker { ... }

impl ContractInitTracker {
    pub fn check_and_maybe_queue(&mut self, ...) -> InitCheckResult
    pub fn start_initialization(&mut self, key: ContractKey)
    pub fn complete_initialization(&mut self, key: &ContractKey) -> Option<InitCompletionInfo>
    pub fn fail_initialization(&mut self, key: &ContractKey) -> Option<usize>
}

This makes the state machine logic:

  • Testable: 12 unit tests verify all branches including race conditions
  • Readable: Clear API instead of scattered inline logic
  • Maintainable: Easy to add new states or behavior

3. Add test fixtures module

Added test_fixtures module with helpers:

  • make_contract_key() / make_contract_key_with_code()
  • make_state(), make_params(), make_delta()

4. Add comprehensive tests

  • 12 tests for ContractInitTracker (all state transitions, edge cases)
  • 10 tests for ExecutorError (error types, conversions, panic behavior)
  • 5 tests for test fixtures

Testing

  • All 342 existing tests pass
  • 27 new unit tests added
  • Pre-commit hooks pass (fmt, clippy, no TODO-MUST-FIX)

Files Changed

crates/core/src/contract/mod.rs                    # +FatalExecutorError, -todo!()
crates/core/src/contract/executor.rs               # +test_fixtures, +ExecutorError tests
crates/core/src/contract/executor/init_tracker.rs  # NEW: state machine + 12 tests
crates/core/src/contract/executor/runtime.rs       # Use ContractInitTracker

[AI-assisted - Claude]

@iduartgomez
Copy link
Collaborator

How long should a successful initialziation take potentially? Should we track this?

@sanity
Copy link
Collaborator Author

sanity commented Dec 10, 2025

Re: tracking initialization duration - this will be addressed as part of the stale initialization cleanup. The complete_initialization() method already returns init_duration, we just need to:

  1. Log it at info level for successful completions
  2. Escalate to warning if > 1 second
  3. Add periodic cleanup that warns when dropping stale entries (> 30 seconds old)

[AI-assisted - Claude]

/// Returns information about each stale initialization that was cleaned up.
/// This should be called periodically to prevent resource leaks from
/// initializations that never complete (e.g., due to bugs or crashes).
#[allow(dead_code)] // For future periodic cleanup integration
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldnt we be using this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done - cleanup is now called opportunistically at the start of each upsert_contract_state() call. Any initializations older than 30 seconds are purged and logged with a warning.

[AI-assisted - Claude]

@sanity sanity force-pushed the contract-testability branch from be9c12a to 343d86c Compare December 10, 2025 20:36
sanity and others added 3 commits December 11, 2025 16:15
- Replace todo!() panics with proper FatalExecutorError propagation
- Extract ContractInitTracker state machine for testable init logic
- Add test fixtures module for creating contract test data
- Add 27 new unit tests covering init tracking, error handling, and fixtures

The contract initialization tracking logic was previously embedded inline
in upsert_contract_state() making it difficult to test race conditions.
Now extracted into a dedicated ContractInitTracker with clear API.

Fatal executor errors now propagate properly instead of crashing via
todo!() - this was a production bug waiting to happen.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Address review feedback from iduartgomez regarding potential resource leaks
in the ContractInitTracker.

Changes:
- Add SLOW_INIT_THRESHOLD (1s) and STALE_INIT_THRESHOLD (30s) constants
- Add cleanup_stale_initializations() method to purge stuck initializations
- Add StaleInitInfo struct to report cleaned up entries
- Log warning when initialization takes > 1 second
- Add initializing_count() for monitoring
- Add 4 new tests for cleanup functionality

The cleanup method is marked #[allow(dead_code)] as it needs to be
integrated with the event loop or a periodic task runner. The API is
ready for that integration.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Call cleanup_stale_initializations() opportunistically at the start of
each upsert_contract_state() call. This ensures any initializations
stuck for > 30 seconds are cleaned up and logged with a warning.

Removes #[allow(dead_code)] annotations since the cleanup is now active.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@sanity sanity force-pushed the contract-testability branch from 56e0dcc to 1a139dc Compare December 11, 2025 22:15
@sanity sanity added this pull request to the merge queue Dec 12, 2025
@sanity sanity removed this pull request from the merge queue due to a manual request Dec 12, 2025
@sanity sanity merged commit 56177ab into main Dec 12, 2025
14 of 15 checks passed
@sanity sanity deleted the contract-testability branch December 12, 2025 00:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants