diff --git a/AGENTS.md b/AGENTS.md index 712e2e25..d76c2434 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -95,8 +95,8 @@ ## Rust Specific Guidance This repository is written in Rust and uses Cargo for building and dependency -management. Contributors should follow these best practices when working on the -project: +management. Contributors should follow these best practices when working on +the project: - Run `make fmt`, `make lint`, and `make test` before committing. These targets wrap `cargo fmt`, `cargo clippy`, and `cargo test` with the appropriate flags. diff --git a/docs/behavioural-testing-in-rust-with-cucumber.md b/docs/behavioural-testing-in-rust-with-cucumber.md index aa38b720..d5a18b2a 100644 --- a/docs/behavioural-testing-in-rust-with-cucumber.md +++ b/docs/behavioural-testing-in-rust-with-cucumber.md @@ -2,55 +2,113 @@ ## Part 1: The Philosophy and Practice of Behaviour-Driven Development (BDD) -Behaviour-Driven Development (BDD) is a software development process that has evolved from Test-Driven Development (TDD). While it incorporates testing, its primary focus is on fostering collaboration and communication among developers, quality assurance (QA) teams, business analysts, and product owners. This guide provides a comprehensive walkthrough of implementing BDD in Rust using the modern `cucumber` testing framework, focusing on practical application, best practices, and lessons learned from real-world use. +Behaviour-Driven Development (BDD) is a software development process that +evolved from Test-Driven Development (TDD). Although testing remains integral, +the primary focus is on collaboration and communication among developers, +QA teams, business analysts, and product owners. This guide walks through +implementing BDD in Rust with the modern `cucumber` testing framework, covering +practical techniques, best practices, and lessons from real-world projects. ### 1.1 Beyond Testing: BDD as a Collaborative Process -At its core, BDD is not merely a testing technique but a methodology for building a shared understanding of a system's behaviour.1 The central goal is to create a ubiquitous language that both technical and non-technical stakeholders can use to describe and agree upon software requirements.3 This process is centered around conversation; the discussions about how a feature should behave are the most valuable output of BDD.2 - -The tangible artifact of these conversations is a set of specifications written in a structured, natural language format. These specifications serve a dual purpose: they are human-readable documentation of the system's features, and they are also executable tests that verify the system's behaviour. This approach ensures that documentation and implementation cannot drift apart, creating a suite of "living documentation." - -The value of BDD is realized before a single line of implementation code is written. When a development team writes behaviour specifications in isolation, they are simply using a different syntax for their tests. The transformative potential of BDD is unlocked only when these specifications are co-created and validated through a collaborative process involving all team members. This ensures that what is built is precisely what the business needs, reducing ambiguity and rework. 
+At its core, BDD is not merely a testing technique but a methodology for +building a shared understanding of a system's behaviour.1 The central goal is to +create a ubiquitous language that both technical and non-technical stakeholders +can use to describe and agree upon software requirements.3 This process is +centred on conversation; the discussions about how a feature should behave are +the most valuable output of BDD.2 + +The tangible artifact of these conversations is a set of specifications written +in a structured, natural language format. These specifications serve a dual +purpose: they are human-readable documentation of the system's features, and +they are also executable tests that verify the system's behaviour. This approach +ensures that documentation and implementation cannot drift apart, creating a +suite of "living documentation." + +The value of BDD is realized before a single line of implementation code is +written. When a development team writes behaviour specifications in isolation, +they are simply using a different syntax for their tests. The transformative +potential of BDD is unlocked only when these specifications are co-created +and validated through a collaborative process involving all team members. +This ensures that what is built is precisely what the business needs, reducing +ambiguity and rework. ### 1.2 The Gherkin Language: Structuring Behaviour -To facilitate this process, BDD frameworks like Cucumber use a specific Domain-Specific Language (DSL) called Gherkin.5 Gherkin provides a simple, structured grammar for writing executable specifications in plain text files, typically with a - -`.feature` extension.6 Its syntax is designed to be intuitive and accessible, enabling clear communication across different project roles.3 - -A Gherkin document is line-oriented, with most lines beginning with a specific keyword. The primary keywords give structure and meaning to the specifications.7 - - - -

-Keyword
-Purpose
-Simple Example
-Feature
-Provides a high-level description of a software feature and groups related scenarios.3
-Feature: User Authentication
-Scenario
-Describes a single, concrete example of the feature's behaviour.3
-Scenario: Successful login with valid credentials
-Given
-Sets the initial context or preconditions for a scenario.5
-Given the user is on the login page
-When
-Describes the key action or event that triggers the behaviour being tested.1
-When the user enters their username and password
-Then
-Specifies the expected outcome or result of the action.9
-Then the user should be redirected to the dashboard
-And, But
-Used to add more steps to a Given, When, or Then clause without repetition, improving readability.3
-And the user's name should be displayed
-Background
-Defines a set of steps that run before every Scenario in a Feature, used for common setup.6
-Background: Given a registered user "Alice" exists
-Scenario Outline
-A template for running the same Scenario multiple times with different data sets.3
-Scenario Outline: Login with various credentials
-Examples
-A data table that provides the values for a Scenario Outline.3
-Examples: | username | password | outcome |
+To facilitate this process, BDD frameworks like Cucumber use a specific
+Domain-Specific Language (DSL) called Gherkin.5 Gherkin provides a simple,
+structured grammar for writing executable specifications in plain text files
+with a `.feature` extension.6 Its syntax is designed to be intuitive and
+accessible, enabling clear communication across different project roles.3
+
+A Gherkin document is line-oriented, with most lines beginning with a specific
+keyword. The primary keywords give structure and meaning to the specifications.7
+
+| Keyword | Purpose | Simple Example |
+| ---------------- | ---------------------------------------------------------------------------------------------------- | ---------------------------------------------------- |
+| Feature | Provides a high-level description of a software feature and groups related scenarios.3 | Feature: User Authentication |
+| Scenario | Describes a single, concrete example of the feature's behaviour.3 | Scenario: Successful login with valid credentials |
+| Given | Sets the initial context or preconditions for a scenario.5 | Given the user is on the login page |
+| When | Describes the key action or event that triggers the behaviour being tested.1 | When the user enters their username and password |
+| Then | Specifies the expected outcome or result of the action.9 | Then the user should be redirected to the dashboard |
+| And, But | Used to add more steps to a Given, When, or Then clause without repetition, improving readability.3 | And the user's name should be displayed |
+| Background | Defines a set of steps that run before every Scenario in a Feature, used for common setup.6 | Background: Given a registered user "Alice" exists |
+| Scenario Outline | A template for running the same Scenario multiple times with different data sets.3 | Scenario Outline: Login with various credentials |
+| Examples | A data table that provides the values for a Scenario Outline.3 | Examples: \| username \| password \| outcome \| |

### 1.3 The Given-When-Then Idiom: A Universal Test Pattern

-For developers, the `Given-When-Then` structure is not an entirely new concept. It is a highly effective reformulation of well-established testing patterns that many are already familiar with from unit testing.5 The most common parallel is the
+For developers, the `Given-When-Then` structure is not an entirely new concept.
+It is a highly effective reformulation of well-established testing patterns
+that many are already familiar with from unit testing.5 The most common parallel
+is the **Arrange-Act-Assert (AAA)** pattern, conceptualized by Bill Wake.

-**Arrange-Act-Assert (AAA)** pattern, conceptualized by Bill Wake.
+- **Given** corresponds to **Arrange**: This phase sets up the world. It
+  establishes all preconditions, initializes objects, and brings the system
+  under test (SUT) to the specific state required for the test. In Gherkin, this
+  is where the team describes the context before the behaviour begins.5

-- **Given** corresponds to **Arrange**: This phase sets up the world. It establishes all preconditions, initializes objects, and brings the system under test (SUT) to the specific state required for the test. In Gherkin, this is where you describe the context before the behaviour begins.5
+- **When** corresponds to **Act**: This is the single, pivotal action performed
+  on the SUT. It's the event or trigger whose consequences are being specified.
+ This phase should ideally contain only one primary action.5 -- **When** corresponds to **Act**: This is the single, pivotal action performed on the SUT. It's the event or trigger whose consequences are being specified. This phase should ideally contain only one primary action.5 +- **Then** corresponds to **Assert**: This phase verifies the outcome. After + the action in the `When` step, the `Then` steps check that the SUT's state has + changed as expected. These steps should contain the assertions and should be + free of side effects.5 -- **Then** corresponds to **Assert**: This phase verifies the outcome. After the action in the `When` step, the `Then` steps check that the SUT's state has changed as expected. These steps should contain the assertions and should be free of side effects.5 - -This connection demystifies BDD. It is not an alien methodology but a structured, collaborative application of a pattern developers already use. The power of Gherkin lies in making the Arrange-Act-Assert pattern legible and verifiable by non-programmers, thereby turning a simple test into a piece of shared, executable documentation. +This connection demystifies BDD. It is not an alien methodology but a +structured, collaborative application of a pattern developers already use. +The power of Gherkin lies in making the Arrange-Act-Assert pattern legible and +verifiable by non-programmers, thereby turning a simple test into a piece of +shared, executable documentation. ## Part 2: Project Setup: Your First Rust Cucumber Test -Setting up a Rust project to use the `cucumber` crate requires a few specific configurations in `Cargo.toml` and a well-defined directory structure. This section walks through creating a minimal, runnable test suite from scratch. +Setting up a Rust project to use the `cucumber` crate requires a few specific +configurations in `Cargo.toml` and a well-defined directory structure. This +section walks through creating a minimal, runnable test suite from scratch. ### 2.1 Configuring `Cargo.toml` -To begin, you need to add the necessary dependencies and configure a custom test runner. The `cucumber` crate is async-native and requires an async runtime to execute tests; `tokio` is the most common choice and is used throughout the official documentation.12 - -The key configuration step is defining a `[[test]]` target in `Cargo.toml`. This tells Cargo to build a specific test executable. Setting `harness = false` is crucial; it disables Rust's default test harness, allowing the `cucumber` runner to take control of the process and print its own formatted output to the console.13 - - - -

-Section
-Key
-Value / Description
-[dependencies]
-tokio
-The async runtime. Required with features like macros and rt-multi-thread.13
-[dev-dependencies]
-cucumber
-The main testing framework crate.16
-[dev-dependencies]
-futures
-Often needed for async operations, particularly with older examples or for specific combinators.18
-[[test]]
-name
-The name of your test runner file (e.g., "cucumber"). This must match the filename in tests/.
-[[test]]
-harness
-Must be set to false to allow cucumber to manage test execution and output.14
+To begin, the necessary dependencies must be added and a custom test runner +configured. The `cucumber` crate is async-native and requires an async runtime +to execute tests; `tokio` is the most common choice and is used throughout the +official documentation.12 + +The key configuration step is defining a `[[test]]` target in `Cargo.toml`. +This tells Cargo to build a specific test executable. Setting `harness = false` +is crucial; it disables Rust's default test harness, allowing the `cucumber` +runner to take control of the process and print its own formatted output to +the console.13 + +| Section | Key | Value / Description | +| ------------------ | -------- | -------------------------------------------------------------------------------------------------- | +| [dependencies] | tokio | The async runtime. Required with features like macros and rt-multi-thread.13 | +| [dev-dependencies] | cucumber | The main testing framework crate.16 | +| [dev-dependencies] | futures | Often needed for async operations, particularly with older examples or for specific combinators.18 | +| [[test]] | name | The name of the test-runner file (e.g., "cucumber"). This must match the filename in tests/. | +| [[test]] | harness | Must be set to `false` so cucumber can manage test execution and output.14 | Here is a complete `Cargo.toml` configuration snippet: @@ -74,7 +132,9 @@ harness = false ### 2.2 Directory Structure and File Organization -A well-organized project structure is vital for maintainable BDD tests. The standard convention separates the human-readable feature specifications from the Rust implementation code.18 +A well-organized project structure is vital for maintainable BDD tests. The +standard convention separates the human-readable feature specifications from the +Rust implementation code.18 ```plaintext . @@ -90,19 +150,32 @@ A well-organized project structure is vital for maintainable BDD tests. The stan └── calculator_steps.rs ``` -This structure physically embodies the BDD philosophy of separating concerns. The `.feature` files in `tests/features/` define *what* the system should do. These can be read, written, and reviewed by non-technical stakeholders. The Rust files in `tests/steps/` define *how* those behaviours are tested. This clear boundary is a cornerstone of effective BDD practice and is strongly recommended.14 +This structure physically embodies the BDD philosophy of separating concerns. +The `.feature` files in `tests/features/` define *what* the system should do. +These can be read, written, and reviewed by non-technical stakeholders. The +Rust files in `tests/steps/` define *how* those behaviours are tested. This +clear boundary is a cornerstone of effective BDD practice and is strongly +recommended.14 ### 2.3 The `World` Object: Managing Scenario State -The `World` is the most critical concept in `cucumber-rs`. It is a user-defined struct that encapsulates all the shared state for a single test scenario.16 Each time a scenario begins, a new instance of the +The `World` is the most critical concept in `cucumber-rs`. It is a user-defined +struct that encapsulates all the shared state for a single test scenario.16 +Each time a scenario begins, a new instance of the `World` is created. This +instance is then passed mutably to each step (`Given`, `When`, `Then`) within +that scenario.18 -`World` is created; this instance is then passed mutably to each step (`Given`, `When`, `Then`) within that scenario.18 +This design provides a powerful mechanism for test isolation. 
Because each +scenario gets its private `World` instance, there is no risk of state leaking +from one test to another, even when tests are run concurrently.20 This is a +significant advantage of the Rust implementation, leveraging the language's +ownership model to solve a common and difficult problem in test automation.21 -This design provides a powerful mechanism for test isolation. Because each scenario gets its own private `World` instance, there is no risk of state leaking from one test to another, even when tests are run concurrently.20 This is a significant advantage of the Rust implementation, leveraging the language's ownership model to solve a common and difficult problem in test automation.21 +To create a `World`, define a struct and derive `cucumber::World`. It is also +conventional to derive `Debug` and `Default`.12 -To create a `World`, you define a struct and derive `cucumber::World`. It's also conventional to derive `Debug` and `Default`.12 - -**Worked Example:** For a simple calculator application, the `World` might look like this: +**Worked Example:** For a simple calculator application, the `World` might look +like this: ```rust // In a shared location, e.g., tests/cucumber.rs @@ -116,13 +189,20 @@ pub struct CalculatorWorld { } ``` -By default, `cucumber` will instantiate the `World` using `Default::default()`. If your `World` requires more complex initialization (e.g., starting a mock server or connecting to a test database), you can provide a custom constructor function using the `#[world(init =...)]` attribute.20 +By default, `cucumber` will instantiate the `World` using `Default::default()`. +If a `World` requires more complex initialization (for example, starting a mock +server or connecting to a test database), provide a custom constructor function +using the `#[world(init = ...)]` attribute.20 ### 2.4 Your First `main` Test Runner -With the `harness = false` setting in `Cargo.toml`, you must provide your own `main` function in the test target file (e.g., `tests/cucumber.rs`). This function serves as the entry point for the test suite. +With the `harness = false` setting in `Cargo.toml`, supply a custom `main` +function in the test target file (for example, `tests/cucumber.rs`). This +function acts as the entry point for the test suite. -Since `cucumber-rs` is async, the `main` function must be an `async fn` and is typically annotated with `#[tokio::main]`.13 The core of the function is a single line that invokes the test runner: +Because `cucumber-rs` is async, the `main` function must be an `async fn` and +is typically annotated with `#[tokio::main]`.13 The core of the function is a +single line that invokes the test runner: `YourWorld::run("path/to/features").await`.16 @@ -169,19 +249,36 @@ async fn main() { } ``` -At this point, you have a complete, albeit empty, test suite. Running `cargo test --test cucumber` will compile the runner, which will then discover `.feature` files in `tests/features`, find no matching steps, and report them as undefined. +At this point, there is a complete, albeit empty, test suite. Running `cargo +test --test cucumber` will compile the runner, which will then discover +`.feature` files in `tests/features`, find no matching steps, and report them +as undefined. ## Part 3: Writing Step Definitions: Connecting Gherkin to Rust -Step definitions are the "glue" that connects the human-readable Gherkin steps in your `.feature` files to executable Rust code. 
The `cucumber` crate provides procedural macros to make this connection seamless and type-safe.
+Step definitions are the "glue" that connects the human-readable Gherkin steps
+in `.feature` files to executable Rust code. The `cucumber` crate provides
+procedural macros to make this connection seamless and type-safe.

### 3.1 The `#[given]`, `#[when]`, and `#[then]` Macros

-The core of step definition is a set of attribute macros: `#[given]`, `#[when]`, and `#[then]`.12 You apply these macros to Rust functions. When the test runner encounters a Gherkin step, it looks for a function annotated with the corresponding macro and a matching text pattern.
+The core of step definition is a set of attribute macros: `#[given]`, `#[when]`,
+and `#[then]`.12 You apply these macros to Rust functions. When the test
+runner encounters a Gherkin step, it looks for a function annotated with the
+corresponding macro and a matching text pattern.

-Each step definition function must accept a mutable reference to your `World` struct as its first argument (e.g., `world: &mut CalculatorWorld`).18 This allows the function to modify the shared state for the current scenario.
+Each step definition function must accept a mutable reference to the `World`
+struct as its first argument (for example, `world: &mut CalculatorWorld`).18
+This allows the function to modify the shared state for the current scenario.

-A key design choice in `cucumber-rs` is the strict separation of these step types. A function marked with `#[then]` cannot be used to satisfy a `Given` step in a feature file.20 This is a deliberate feature, not a limitation. It encourages developers to maintain the clean Arrange-Act-Assert structure by preventing them from accidentally using assertion logic during setup, or performing actions during verification. This discipline leads to more readable, robust, and maintainable tests.
+A key design choice in `cucumber-rs` is the strict separation of these step
+types. A function marked with `#[then]` cannot be used to satisfy a `Given`
+step in a feature file.20 This is a deliberate feature, not a limitation.
+It encourages developers to maintain the clean Arrange-Act-Assert structure
+by preventing them from accidentally using assertion logic during setup, or
+performing actions during verification. This discipline leads to more readable,
+robust, and maintainable tests.

**Worked Example:**

@@ -215,33 +312,57 @@ fn check_result(world: &mut CalculatorWorld, expected: i32) {

### 3.2 Capturing Arguments: Regex vs. Cucumber Expressions

-To make steps dynamic, you need to capture parts of the Gherkin text and pass them as arguments to your Rust functions. `cucumber-rs` supports two mechanisms for this: regular expressions and Cucumber Expressions.16
-
-- **Cucumber Expressions (**`expr = "..."`**)**: This is the recommended default. They are less powerful than regex but are more readable and explicitly designed for this purpose. They provide built-in parsing for common types like `{int}`, `{float}`, `{word}`, and `{string}` (in quotes).16 The framework automatically handles parsing the captured string into the corresponding Rust type in your function signature.
-
-- **Regular Expressions (**`regex = "..."`**)**: For more complex matching needs, you can use full regex syntax. Capture groups `(...)` in the regex correspond to function arguments.18 The framework will still attempt to parse the captured
-
-`&str` into the function's argument type. 
It's a best practice to anchor your regex with `^` and `$` to ensure the entire step text is matched, preventing partial or ambiguous matches.18 - - - -

-Feature
-Cucumber Expression Example
-Regex Example
-Recommendation
-Basic Capture
-expr = "I have {int} cucumbers"
-regex = r"^I have (\d+) cucumbers$"
-Use expressions for clarity.
-Type Conversion
-{int} automatically maps to i32, u64, etc.
-Capture group (\d+) is a &str, parsed to the function's numeric type.
-Expressions are more direct and less error-prone.
-Readability
-High. The intent is clear from the expression itself.
-Medium to Low. Regex can become complex and hard to read.
-Expressions are superior for collaboration.
-Flexibility
-Limited to its defined syntax (e.g., cannot match complex patterns).
-High. Can match almost any text pattern.
-Use Regex as a power tool when expressions are insufficient.
+To make steps dynamic, captured fragments of the Gherkin text must be passed
+as arguments to the corresponding Rust functions. `cucumber-rs` supports two
+mechanisms for this: regular expressions and Cucumber Expressions.16
+
+- **Cucumber Expressions (**`expr = "..."`**)**: This is the recommended
+  default. They are less powerful than regex but are more readable and
+  explicitly designed for this purpose. They provide built-in parsing for
+  common types like `{int}`, `{float}`, `{word}`, and `{string}` (in quotes).16
+  The framework automatically handles parsing the captured string into the
+  corresponding Rust type in your function signature.
+
+- **Regular Expressions (**`regex = "..."`**)**: For more complex matching
+  needs, full regex syntax can be used. Capture groups `(...)` in the regex
+  correspond to function arguments.18 The framework will still attempt to parse
+  the captured `&str` into the function's argument type. It is a best practice
+  to anchor the regex with `^` and `$` to ensure the entire step text is
+  matched, preventing partial or ambiguous matches.18
+
+| Feature | Cucumber Expression Example | Regex Example | Recommendation |
+| --------------- | --------------------------------------------------------------------- | ----------------------------------------------------------------------- | ------------------------------------------------------------- |
+| Basic Capture | expr = "I have {int} cucumbers" | regex = r"^I have (\\d+) cucumbers$" | Use expressions for clarity. |
+| Type Conversion | {int} automatically maps to i32, u64, etc. | Capture group (\\d+) is a &str, parsed to the function's numeric type. | Expressions are more direct and less error-prone. |
+| Readability | High. The intent is clear from the expression itself. | Medium to Low. Regex can become complex and hard to read. | Expressions are superior for collaboration. |
+| Flexibility | Limited to its defined syntax (e.g., cannot match complex patterns). | High. Can match almost any text pattern. | Use Regex as a power tool when expressions are insufficient. |

### 3.3 Handling Test Outcomes: `assert!` and `Result`

-The `Then` steps are where you verify the system's state. The most straightforward way to do this is with Rust's standard assertion macros, like `assert_eq!` or `assert!`.16 If an assertion fails, the thread will panic, and
+The `Then` steps are where you verify the system's state. The most
+straightforward way to do this is with Rust's standard assertion macros, like
+`assert_eq!` or `assert!`.16 If an assertion fails, the thread will panic, and
+`cucumber` will mark the step as failed.

-`cucumber` will mark the step as failed.

-However, a more idiomatic and powerful approach is to have your step functions return a `Result`.20 A step that returns
+However, a more idiomatic and powerful approach is to have your step functions
+return a `Result`.20 A step that returns `Ok(())` passes, while one that
+returns an `Err(...)` fails. This has two major benefits:

-`Ok(())` passes, while one that returns an `Err(...)` fails. This has two major benefits:

-1. **Cleaner Code:** It allows you to use the `?` operator to propagate errors from your application logic or from parsing steps, leading to more concise and readable code.
+1. **Cleaner Code:** Using the `?` operator propagates errors from the
+   application logic or from parsing steps, leading to more concise and readable
+   code.

-2. **Richer Failure Messages:** A panic from an `assert!` often gives a limited error message. 
By returning a custom error type that implements `std::error::Error`, you can provide detailed, contextual information about *why* the test failed. This is invaluable for debugging.
+2. **Richer Failure Messages:** A panic from an `assert!` often gives a limited
+   error message. Returning a custom error type that implements
+   `std::error::Error` provides detailed, contextual information about *why* the
+   test failed. This is invaluable for debugging.

-Rust's error handling philosophy is built around the `Result` enum for recoverable errors, and a test failure is a recoverable error from the test runner's perspective.22 Embracing this pattern in your step definitions is a significant best practice.
+Rust's error handling philosophy is built around the `Result` enum for
+recoverable errors, and a test failure is a recoverable error from the test
+runner's perspective.22 Embracing this pattern in your step definitions is a
+significant best practice.

**Worked Example (using** `Result`**):**

@@ -274,17 +395,26 @@ fn check_status(world: &mut ApiWorld, expected_status: u16) -> Result<(), TestEr
}
```

-This approach provides a clear, structured error that is much more informative than a simple assertion failure.
+This approach provides a clear, structured error that is much more informative
+than a simple assertion failure.

## Part 4: Advanced Gherkin & Step Definition Techniques

-As test suites grow in complexity, you will need more advanced Gherkin features to keep them maintainable and expressive. This section covers techniques for data-driven testing, handling complex inputs, and managing asynchronous operations.
+As test suites grow in complexity, more advanced Gherkin features become
+necessary to keep them maintainable and expressive. This section covers
+techniques for data-driven testing, handling complex inputs, and managing
+asynchronous operations.

### 4.1 Data-Driven Testing: `Scenario Outline` and `Examples`

-Often, you want to test the same behaviour with a variety of different inputs and expected outputs. Writing a separate `Scenario` for each case would be highly repetitive. Gherkin solves this with the `Scenario Outline` keyword.3
+Often, the same behaviour must be tested with various inputs and expected
+outputs. Writing a separate `Scenario` for each case would be highly repetitive.
+Gherkin solves this with the `Scenario Outline` keyword.3

-A `Scenario Outline` acts as a template. You write the steps using placeholders enclosed in angle brackets. Below the outline, you provide an `Examples` table. Each row in this table represents a concrete run of the scenario, with the column headers matching the placeholders in the steps.11
+A `Scenario Outline` acts as a template. You write the steps using placeholders
+enclosed in angle brackets. Below the outline, you provide an `Examples` table.
+Each row in this table represents a concrete run of the scenario, with the
+column headers matching the placeholders in the steps.11

**Worked Example:**

@@ -310,15 +440,27 @@ Feature: Basic arithmetic
 | 0 | -20 | -20 |
```

-This single `Scenario Outline` will generate and run four separate tests, each with its own `World` instance, providing excellent test coverage with minimal boilerplate.
+This single `Scenario Outline` will generate and run four separate tests, each
+with its own `World` instance, providing excellent test coverage with minimal
+boilerplate.
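+
+On the Rust side, the outline's placeholders arrive through ordinary step
+definitions as captured values. The following is a minimal sketch under stated
+assumptions: the step wording and the `World` fields here are invented for
+illustration and are not the guide's own calculator example:
+
+```rust
+use cucumber::{given, then, when, World};
+
+// Illustrative state for this sketch only; not the guide's CalculatorWorld.
+#[derive(Debug, Default, World)]
+pub struct OutlineWorld {
+    a: i32,
+    b: i32,
+    sum: Option<i32>,
+}
+
+// Each Examples row produces one generated scenario, so these run once per row.
+#[given(expr = "the first number is {int}")]
+fn first_number(world: &mut OutlineWorld, n: i32) {
+    world.a = n;
+}
+
+#[given(expr = "the second number is {int}")]
+fn second_number(world: &mut OutlineWorld, n: i32) {
+    world.b = n;
+}
+
+#[when(expr = "the numbers are added")]
+fn add_numbers(world: &mut OutlineWorld) {
+    world.sum = Some(world.a + world.b);
+}
+
+#[then(expr = "the result is {int}")]
+fn check_sum(world: &mut OutlineWorld, expected: i32) {
+    assert_eq!(world.sum, Some(expected));
+}
+```
+
+Because every row spawns a fresh scenario, each run starts from a clean
+`OutlineWorld`; no row can observe state left behind by another.
+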
### 4.2 Passing Structured Data with Data Tables

-Sometimes, a step requires a more complex data structure than can be passed with simple arguments. For example, setting up an initial inventory or providing a list of users. For this, Gherkin provides **Data Tables**.23
+Sometimes, a step requires a more complex data structure than can be passed with
+simple arguments. For example, setting up an initial inventory or providing a
+list of users. For this, Gherkin provides **Data Tables**.23
+
+A Data Table is a pipe-delimited table placed directly after a Gherkin step.
+To access this table in a Rust step definition, add a
+`step: &cucumber::gherkin::Step` argument to the function. The table can then
+be accessed via `step.table` (which is an `Option<Table>`).23

-A Data Table is a pipe-delimited table placed directly after a Gherkin step. To access this table in your Rust step definition, you must add a `step: &cucumber::gherkin::Step` argument to your function. The table can then be accessed via `step.table` (which is an `Option<Table>
`).23

+Data tables encourage a more declarative style of testing. Instead of writing
+a series of imperative steps to build up a state (e.g., "Given I add a user
+'Alice'", "And I set her role to 'Admin'"), the entire state can be described in
+a single, readable table.25 This makes the `Given` steps more concise and
+focused on the initial context.

-Data tables encourage a more declarative style of testing. Instead of writing a series of imperative steps to build up a state (e.g., "Given I add a user 'Alice'", "And I set her role to 'Admin'"), you can describe the entire state in a single, readable table.25 This makes the

-`Given` steps more concise and focused on the initial context.

@@ -362,7 +504,9 @@ fn given_items_in_warehouse(world: &mut InventoryWorld, step: &Step) {

### 4.3 Managing Common Preconditions with `Background`

-If every scenario in a `.feature` file shares the same set of initial `Given` steps, you can use the `Background` keyword to reduce duplication.6 The steps listed under
+If every scenario in a `.feature` file shares the same set of initial `Given`
+steps, you can use the `Background` keyword to reduce duplication.6 The steps
+listed under `Background` will be executed before *each* `Scenario` in that
+feature file.26

-`Background` will be executed before *each* `Scenario` in that feature file.26

@@ -384,17 +528,30 @@ Feature: User account management
 Then the user should be logged out
```

-**Pitfall Warning:** Use `Background` with caution. If it becomes too long or is not relevant to every single scenario, it can make the tests harder to understand by hiding essential context. If only some scenarios share setup, it is better to create a dedicated `Given` step and repeat it.21
+**Pitfall Warning:** Use `Background` with caution. If it becomes too long
+or is not relevant to every single scenario, it can make the tests harder to
+understand by hiding essential context. If only some scenarios share setup, it
+is better to create a dedicated `Given` step and repeat it.21

### 4.4 Asynchronous Operations: Testing in the Real World

-Modern Rust applications, especially those involving networking, databases, or file I/O, are heavily asynchronous. The `cucumber-rs` crate is designed with this in mind, making it an excellent choice for integration and end-to-end (E2E) testing.
+Modern Rust applications, especially those involving networking, databases, or
+file I/O, are heavily asynchronous. The `cucumber-rs` crate is designed with
+this in mind, making it an excellent choice for integration and end-to-end (E2E)
+testing.

-Step definition functions can be declared as `async fn`.12 Inside these functions, you can
+Step definition functions can be declared as `async fn`.12 Inside these
+functions, any `Future`, such as a database query or HTTP request, can be
+`.await`-ed. This requires that your test runner's `main` function is powered by
+an async runtime like `tokio`.13

-`.await` any `Future`, such as a database query or an HTTP request. This requires that your test runner's `main` function is powered by an async runtime like `tokio`.13

-The async-first design of `cucumber-rs` is one of its most powerful features. It allows for writing tests that accurately reflect the asynchronous nature of the application under test. Furthermore, because `cucumber` can run scenarios concurrently by default, I/O-bound tests can execute in parallel, dramatically reducing the total runtime of your test suite compared to traditional synchronous, serial test runners.20 This makes it feasible to run comprehensive integration test suites as part of your regular development workflow. 
+The async-first design of `cucumber-rs` is one of its most powerful features. +It allows for writing tests that accurately reflect the asynchronous nature of +the application under test. Furthermore, because `cucumber` can run scenarios +concurrently by default, I/O-bound tests can execute in parallel, dramatically +reducing the total runtime of the test suite compared with traditional +synchronous, serial test runners.20 This makes it feasible to run comprehensive +integration test suites as part of your regular development workflow. **Worked Example (Async Step):** @@ -419,11 +576,16 @@ async fn request_user_profile(world: &mut ApiWorld, username: String) { ## Part 5: Worked Example: Behavioural Testing for a REST API -This section synthesizes the concepts from previous parts into a complete, practical example: testing a simple key-value store REST API. This demonstrates a realistic use case for `cucumber-rs` as an integration testing tool, leveraging `reqwest` for HTTP requests and `wiremock-rs` for creating an isolated, in-process mock server. +This section synthesizes the concepts from previous parts into a complete, +practical example: testing a simple key-value store REST API. This demonstrates +a realistic use case for `cucumber-rs` as an integration testing tool, +leveraging `reqwest` for HTTP requests and `wiremock-rs` for creating an +isolated, in-process mock server. ### 5.1 Defining the Feature (`kv_store.feature`) -First, we define the desired behaviour in a Gherkin `.feature` file. This file serves as the executable specification for our API. +First, the desired behaviour is defined in a Gherkin `.feature` file. This file +serves as the executable specification for the API. ```gherkin # In tests/features/kv_store.feature @@ -448,9 +610,14 @@ Feature: Key-Value Store API ### 5.2 Designing the `World` for API Testing -The `World` for this test suite needs to manage the state of the HTTP client and the mock server. It will also store the last API response so that `Then` steps can perform assertions on it. +The `World` for this test suite needs to manage the state of the HTTP client and +the mock server. It will also store the last API response so that `Then` steps +can perform assertions on it. -A crucial aspect of this design is that the mock server is part of the `World`. This means each scenario gets its own, completely isolated mock server instance running on a random port. This is the key to enabling fast, reliable, and parallelizable integration tests.20 +A crucial aspect of this design is that the mock server is part of the `World`. +This means each scenario gets its own, completely isolated mock server instance +running on a random port. This is the key to enabling fast, reliable, and +parallelizable integration tests.20 ```rust // In tests/cucumber.rs @@ -484,19 +651,26 @@ async fn main() { } ``` -Note the use of `#` and the `async fn new()` implementation. This is necessary because starting the `MockServer` is an async operation and cannot be done in a `Default::default()` implementation.20 +Note the use of `#` and the `async fn new()` implementation. This is necessary +because starting the `MockServer` is an async operation and cannot be done in a +`Default::default()` implementation.20 ### 5.3 Mocking Dependencies with `wiremock-rs` -`wiremock-rs` is a pure-Rust library for mocking HTTP-based APIs.27 It allows you to define expectations (e.g., "expect a GET request to - -`/foo`") and specify responses. 
This is done in the `Given` steps to set up the state of the external world before the `When` action occurs.
+`wiremock-rs` is a pure-Rust library for mocking HTTP-based APIs.27 It lets a
+test define expectations (for example, "expect a GET request to `/foo`") and
+specify the responses to return. This is done in the `Given` steps to set up
+the state of the external world before the `When` action occurs.

-Using an in-process mock server like `wiremock-rs` is a superior pattern for integration testing. It avoids the complexity and slowness of managing external services or Docker containers, leading to faster and more reliable test execution.27
+Using an in-process mock server like `wiremock-rs` is a superior pattern for
+integration testing. It avoids the complexity and slowness of managing external
+services or Docker containers, leading to faster and more reliable test
+execution.27

### 5.4 Implementing the API Step Definitions

-The step definitions will use the `server` from the `ApiWorld` to set up mocks and the `client` to make requests.
+The step definitions will use the `server` from the `ApiWorld` to set up mocks
+and the `client` to make requests.

```rust
// In tests/steps/api_steps.rs
@@ -558,15 +732,22 @@ async fn check_response_body(world: &mut ApiWorld, expected_body: String) {
}
```

-This complete example demonstrates the full BDD cycle: defining behaviour, setting up an isolated test environment in the `World`, mocking dependencies, executing actions, and asserting outcomes, all within Rust's powerful async ecosystem.
+This complete example demonstrates the full BDD cycle: defining behaviour,
+setting up an isolated test environment in the `World`, mocking dependencies,
+executing actions, and asserting outcomes, all within Rust's powerful async
+ecosystem.

## Part 6: Best Practices for Scalable and Maintainable Test Suites

-As a project grows, so does its test suite. Adhering to best practices is essential to ensure that your Cucumber tests remain a valuable asset rather than a maintenance burden.
+As a project grows, so does its test suite. Adhering to best practices is
+essential to ensure that your Cucumber tests remain a valuable asset rather than
+a maintenance burden.

### 6.1 The "One-to-One" Rule: One Scenario, One Behaviour

-A fundamental principle for writing clean Gherkin is that **each scenario should test exactly one behaviour**.6 A common anti-pattern is to chain multiple actions and outcomes within a single scenario, often indicated by multiple
+A fundamental principle for writing clean Gherkin is that **each scenario
+should test exactly one behaviour**.6 A common anti-pattern is to chain multiple
+actions and outcomes within a single scenario, often indicated by multiple
+`When-Then` pairs.

-`When-Then` pairs.

@@ -581,7 +762,9 @@ Scenario: User manages their cart
 Then the final price should be lower
```

-This scenario is testing two distinct behaviours: updating quantity and applying a discount. If the second `Then` fails, it's unclear if the issue is with the discount logic or if the state from the first action was incorrect.
+This scenario is testing two distinct behaviours: updating quantity and applying
+a discount. If the second `Then` fails, it's unclear if the issue is with the
+discount logic or if the state from the first action was incorrect.

**Best Practice:** Split this into two focused scenarios. 
@@ -597,25 +780,39 @@ Scenario: Applying a valid discount code reduces the final price Then the final price should be lower ``` -This approach isolates failures, improves clarity, and makes each scenario an independent specification of a single rule.6 +This approach isolates failures, improves clarity, and makes each scenario an +independent specification of a single rule.6 ### 6.2 Declarative vs. Imperative Steps: Finding the Balance -The most maintainable test suites favor a **declarative** style over an **imperative** one. +The most maintainable test suites favor a **declarative** style over an +**imperative** one. -- **Imperative steps** describe *how* an action is performed, often coupling the test to specific UI elements or implementation details (e.g., "When I click the 'submit-button'"). This makes tests brittle; a small UI change can break many tests.25 +- **Imperative steps** describe *how* an action is performed, often coupling the + test to specific UI elements or implementation details (e.g., "When I click + the 'submit-button'"). This makes tests brittle; a small UI change can break + many tests.25 -- **Declarative steps** describe *what* the user is trying to achieve, focusing on intent and behaviour (e.g., "When I submit my registration"). +- **Declarative steps** describe *what* the user is trying to achieve, focusing + on intent and behaviour (e.g., "When I submit my registration"). -The collection of your step definitions should evolve into a Domain-Specific Language (DSL) for your application.3 A step like +The collective set of step definitions should evolve into a project-specific +Domain-Specific Language (DSL).3 A step like -`When I register my account` is declarative. Internally, its Rust implementation might perform several imperative actions (fill form fields, click a button, wait for an API response), but these details are abstracted away from the `.feature` file. This abstraction is the key to creating a robust and maintainable test suite that communicates business value. +`When I register my account` is declarative. Internally, its Rust implementation +might perform several imperative actions (fill form fields, click a button, wait +for an API response), but these details are abstracted away from the `.feature` +file. This abstraction is the key to creating a robust and maintainable test +suite that communicates business value. ### 6.3 `World` Management in Large Projects -In large projects, the `World` struct can become a "god object," accumulating dozens of fields and becoming difficult to manage. To avoid this, use composition. +In large projects, the `World` struct can become a "god object," accumulating +dozens of fields and becoming difficult to manage. To avoid this, use +composition. -**Best Practice:** Instead of a monolithic `World`, structure it with smaller, focused context structs. +**Best Practice:** Instead of a monolithic `World`, structure it with smaller, +focused context structs. ```rust // Less maintainable @@ -626,7 +823,7 @@ pub struct MonolithicWorld { api_client: ApiClient, db_connection: DbPool, last_api_response: Option, - //... and 20 more fields + //… and 20 more fields } // More maintainable @@ -646,96 +843,154 @@ pub struct ComposedWorld { } ``` -This approach organizes state logically and makes the `World` easier to reason about. 
For complex setup, always prefer a custom constructor with `#[world(init =...)]` over trying to force everything into `Default`.20
+This approach organizes state logically and makes the `World` easier to reason
+about. For complex setup, always prefer a custom constructor with
+`#[world(init = ...)]` over trying to force everything into `Default`.20

### 6.4 Organizing Features and Steps

-Just as you organize your application code, you must organize your test code.
+Test code should be organized in the same way as application code.

-- **Feature Files:** Group `.feature` files by application capability or user story.26 For example,
+- **Feature Files:** Group `.feature` files by application capability or user
+  story.26 For example, `tests/features/authentication/`,
+  `tests/features/product_catalog/`, etc.

-`tests/features/authentication/`, `tests/features/product_catalog/`, etc.

-- **Step Definitions:** Mirror the feature file structure in your `tests/steps/` directory. Create a Rust module for each feature area (e.g., `tests/steps/authentication_steps.rs`, `tests/steps/catalog_steps.rs`). This prevents having a single, massive step definition file and makes it easier to find the code corresponding to a Gherkin step.
+- **Step Definitions:** Mirror the feature file structure in your
+  `tests/steps/` directory. Create a Rust module for each feature area (e.g.,
+  `tests/steps/authentication_steps.rs`, `tests/steps/catalog_steps.rs`).
+  This prevents having a single, massive step definition file and makes it
+  easier to find the code corresponding to a Gherkin step.

## Part 7: Common Pitfalls and Troubleshooting

-Even with best practices, developers can encounter common issues when implementing BDD. Recognizing these pitfalls is the first step to avoiding them.
+Even with best practices, developers can encounter common issues when
+implementing BDD. Recognizing these pitfalls is the first step to avoiding them.

### 7.1 State Leakage and Concurrency Issues

-**Pitfall:** Sharing state between scenarios using `static` variables, global state, or external files. This is a primary cause of flaky, non-deterministic tests, especially because `cucumber-rs` runs scenarios concurrently by default.20
-
-**Solution:** The `World` object is the *only* sanctioned place for scenario state. Treat each scenario as if it could be running at the same time as any other. If you must interact with a shared, singular resource (like a physical hardware device), you must tag the relevant scenarios with `@serial`. This forces them to run one at a time.20 However, overuse of
+**Pitfall:** Sharing state between scenarios using `static` variables, global
+state, or external files. This is a primary cause of flaky, non-deterministic
+tests, especially because `cucumber-rs` runs scenarios concurrently by
+default.20

-`@serial` is often a sign of a poor test design and negates the performance benefits of concurrency.
+**Solution:** The `World` object is the *only* sanctioned place for scenario
+state. Treat each scenario as if it could be running at the same time as any
+other. If you must interact with a shared, singular resource (like a physical
+hardware device), you must tag the relevant scenarios with `@serial`. This
+forces them to run one at a time.20 However, overuse of `@serial` is often a
+sign of a poor test design and negates the performance benefits of concurrency.
+This tag should be used sparingly.
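+
+For illustration, a hypothetical scenario pinned to serial execution with the
+`@serial` tag (the device and the step wording are invented for this sketch):
+
+```gherkin
+@serial
+Scenario: Flashing firmware over the single shared JTAG probe
+  Given the shared device is connected
+  When the new firmware image is flashed
+  Then the device reports the new firmware version
+```
+
+Everything not tagged `@serial` continues to run concurrently, so the cost of
+the tag stays contained to the scenarios that genuinely need it.
+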
### 7.2 Flaky Tests from Asynchronous Code

-**Pitfall:** Tests that fail intermittently, often due to timing issues or race conditions in asynchronous code.30 A common mistake is using fixed delays (
+**Pitfall:** Tests that fail intermittently, often due to timing issues or race
+conditions in asynchronous code.30 A common mistake is using fixed delays (
`tokio::time::sleep`) to "wait" for an operation to complete.

**Solution:**

-1. **Avoid Arbitrary Sleeps:** Never use fixed delays to wait for an event. The correct duration is impossible to guess and will lead to either slow tests or flaky tests.
+1. **Avoid Arbitrary Sleeps:** Never use fixed delays to wait for an event. The
+   correct duration is impossible to guess and will lead to either slow tests or
+   flaky tests.

-2. **Use Deterministic Mocks:** When possible, use tools like `wiremock-rs`. The interactions are deterministic and immediate, eliminating timing issues related to network latency.27
+2. **Use Deterministic Mocks:** When possible, use tools like `wiremock-rs`.
+   The interactions are deterministic and immediate, eliminating timing issues
+   related to network latency.27

-3. **Implement Explicit Synchronization:** When testing against real systems, use mechanisms like polling with a timeout, waiting for a specific log message, or checking a database flag to know when an operation is complete.
+3. **Implement Explicit Synchronization:** When testing against real systems,
+   use mechanisms like polling with a timeout, waiting for a specific log
+   message, or checking a database flag to know when an operation is complete.

-4. **Use Built-in Retries:** For tests that are inherently prone to transient failures (e.g., E2E tests over a real network), use the `cucumber` runner's retry mechanism (`--retry <N>`) to automatically re-run failed scenarios.31
+4. **Use Built-in Retries:** For tests that are inherently prone to transient
+   failures (e.g., E2E tests over a real network), use the `cucumber` runner's
+   retry mechanism (`--retry <N>`) to automatically re-run failed scenarios.31

### 7.3 The `unwrap()` Trap and Poor Error Handling

-**Pitfall:** Littering step definitions with `.unwrap()` and `.expect()`. When these panic, the resulting error message is often generic and lacks the context needed to quickly diagnose the problem.22 For example, a panic on
+**Pitfall:** Littering step definitions with `.unwrap()` and `.expect()`. When
+these panic, the resulting error message is often generic and lacks the context
+needed to quickly diagnose the problem.22 For example, a panic on

-`world.last_response.as_ref().unwrap()` doesn't tell you which API call failed to produce a response.
+`world.last_response.as_ref().unwrap()` does not indicate which API call failed
+to produce a response.

-**Solution:** As discussed in section 3.3, step functions should return a `Result`. Define custom, descriptive error types using crates like `thiserror` or `anyhow` to wrap underlying errors and add context. A well-defined `Err` variant is far more valuable for debugging than a stack trace from a panic.20
+**Solution:** As discussed in section 3.3, step functions should return a
+`Result`. Define custom, descriptive error types using crates like `thiserror`
+or `anyhow` to wrap underlying errors and add context. A well-defined `Err`
+variant is far more valuable for debugging than a stack trace from a panic.20

### 7.4 Ambiguous Step Definitions

-**Pitfall:** The test run fails with an "ambiguous step" error. 
This means a single Gherkin step matches the patterns of two or more Rust functions.21 +**Pitfall:** The test run fails with an "ambiguous step" error. This means a +single Gherkin step matches the patterns of two or more Rust functions.21 **Solution:** -1. **Be More Specific:** Make the Gherkin step text or the matching pattern more precise to eliminate the ambiguity. +1. **Be More Specific:** Make the Gherkin step text or the matching pattern more + precise to eliminate the ambiguity. -2. **Anchor Regex:** When using regular expressions, always anchor them with `^` at the start and `$` at the end (e.g., `regex = r"^the user is logged in$"`). This prevents a step like `"the admin user is logged in"` from accidentally matching a less specific pattern like `regex = r"user is logged in"`.18 +2. **Anchor Regex:** When using regular expressions, always anchor them with `^` + at the start and `$` at the end (e.g., `regex = r"^the user is logged in$"`). + This prevents a step like `"the admin user is logged in"` from accidentally + matching a less specific pattern like `regex = r"user is logged in"`.18 ## Part 8: Integrating into the Development Lifecycle -BDD is most effective when it is an integral part of the daily development workflow and the automated CI/CD pipeline. +BDD is most effective when it is an integral part of the daily development +workflow and the automated CI/CD pipeline. ### 8.1 The Cucumber CLI: Running Tests with Precision -Running the entire test suite can be slow. The `cucumber` test runner supports a rich set of command-line arguments that allow you to run a targeted subset of scenarios. These arguments are passed to your test executable after a `--` separator: `cargo test --test cucumber --`.32 +Running the entire test suite can be slow. The `cucumber` test runner supports +a rich set of command-line arguments that allow you to run a targeted subset +of scenarios. These arguments are passed to your test executable after a `--` +separator: `cargo test --test cucumber --`.32 -
- -

-Flag
-Purpose
-Example Usage
--t, --tags <EXPR>
-Filter scenarios by a tag expression. Supports and, or, and not.
-cargo test --test cucumber -- -t "@wip and not @slow"
--n, --name <REGEX>
-Filter scenarios by a regular expression matching their name.
-cargo test --test cucumber -- -n "login"
---fail-fast
-Stop the test run on the first failure.
-cargo test --test cucumber -- --fail-fast
--c, --concurrency <N>
-Limit the number of scenarios running concurrently.
-cargo test --test cucumber -- -c 1 (for serial execution)
---retry <N>
-Retry failed scenarios up to N times.
-cargo test --test cucumber -- --retry 2
+| Flag | Purpose | Example Usage |
+| ----------------------- | ----------------------------------------------------------------- | ---------------------------------------------------------- |
+| `-t, --tags <EXPR>` | Filter scenarios by a tag expression. Supports and, or, and not. | cargo test --test cucumber -- -t "@wip and not @slow" |
+| `-n, --name <REGEX>` | Filter scenarios by a regular expression matching their name. | cargo test --test cucumber -- -n "login" |
+| `--fail-fast` | Stop the test run on the first failure. | cargo test --test cucumber -- --fail-fast |
+| `-c, --concurrency <N>` | Limit the number of scenarios running concurrently. | cargo test --test cucumber -- -c 1 (for serial execution) |
+| `--retry <N>` | Retry failed scenarios up to N times. | cargo test --test cucumber -- --retry 2 |

-These flags are essential for developer productivity, enabling rapid feedback by running only the tests relevant to the current work.
+These flags are essential for developer productivity, enabling rapid feedback by
+running only the tests relevant to the current work.

### 8.2 Continuous Integration (CI/CD): Living Documentation in Practice

-The ultimate goal of BDD is to have a suite of executable specifications that continuously validate the system's behaviour. Integrating Cucumber tests into a CI/CD pipeline is what brings this "living documentation" to life.33
+The ultimate goal of BDD is to have a suite of executable specifications that
+continuously validate the system's behaviour. Integrating Cucumber tests into a
+CI/CD pipeline is what brings this "living documentation" to life.33

The process involves two main steps:

-1. **Run the tests:** The CI job executes `cargo test --test cucumber`. The `cucumber` runner will exit with a non-zero status code if any scenario fails, which automatically fails the CI build.33
+1. **Run the tests:** The CI job executes `cargo test --test cucumber`. The
+   `cucumber` runner will exit with a non-zero status code if any scenario
+   fails, which automatically fails the CI build.33

-2. **Publish reports:** Many CI platforms can parse and display test results in a structured format. The `cucumber` crate supports generating JUnit XML reports via the `output-junit` feature flag.16 These XML files can then be published as test artifacts for platforms like GitHub Actions, GitLab CI, or Jenkins to consume.33
+2. **Publish reports:** Many CI platforms can parse and display test results
+   in a structured format. The `cucumber` crate supports generating JUnit XML
+   reports via the `output-junit` feature flag.16 These XML files can then be
+   published as test artifacts for platforms like GitHub Actions, GitLab CI, or
+   Jenkins to consume.33 A runner sketch for this wiring follows below.

-This CI integration closes the BDD loop. The `.feature` files, once checked into version control, are no longer static documents. They become active participants in the build process. A CI failure on a Cucumber test provides immediate, unambiguous feedback that the implementation has diverged from the agreed-upon behaviour, prompting a conversation to either fix the code or update the specification.
+This CI integration closes the BDD loop. The `.feature` files, once checked into
+version control, are no longer static documents. They become active participants
+in the build process. A CI failure on a Cucumber test provides immediate,
+unambiguous feedback that the implementation has diverged from the agreed-upon
+behaviour, prompting a conversation to either fix the code or update the
+specification.
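+
+For step 2, the runner is pointed at a JUnit writer. The following is a rough
+sketch under stated assumptions: the `output-junit` feature is enabled on the
+`cucumber` dev-dependency, the report path `target/junit/` is an arbitrary
+choice, and `ApiWorld` is reused from section 5.2; the exact writer constructor
+is worth double-checking against the cucumber book for your crate version:
+
+```rust
+// tests/cucumber.rs: runner variant that also emits a JUnit XML report.
+use std::fs;
+
+use cucumber::{writer, World};
+
+#[tokio::main]
+async fn main() {
+    // ApiWorld is assumed to be defined as in section 5.2.
+    fs::create_dir_all("target/junit").expect("failed to create report dir");
+    let file = fs::File::create("target/junit/cucumber.xml")
+        .expect("failed to create report file");
+    ApiWorld::cucumber()
+        .with_writer(writer::JUnit::new(file, 0))
+        // run_and_exit sets a non-zero exit code on failure, which fails CI.
+        .run_and_exit("tests/features")
+        .await;
+}
+```
+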
**Worked Example (GitHub Actions):** YAML -``` -# In.github/workflows/ci.yml +```yaml +# .github/workflows/ci.yml name: Rust CI on: @@ -775,82 +1030,151 @@ jobs: path: target/junit/*.xml ``` -This workflow demonstrates a standard CI setup for a Rust project, including linting, formatting, and running the Cucumber tests. The final step ensures that the test results are always available for inspection, providing a clear and continuous record of the application's behavioural health.35 +This workflow demonstrates a standard CI setup for a Rust project, including +linting, formatting, and running the Cucumber tests. The final step ensures +that the test results are always available for inspection, providing a clear and +continuous record of the application's behavioural health.35 ### Conclusion -Behaviour-Driven Development with the `cucumber` crate offers a powerful paradigm for building robust, well-documented, and correct Rust applications. By shifting the focus from "testing" to "specifying behaviour," BDD fosters collaboration and creates a shared language that bridges the gap between business and technical teams. - -For Rust developers, the `cucumber-rs` ecosystem provides a modern, idiomatic, and high-performance toolset. Its async-first design, combined with the safety of the `World`-based state management, makes it uniquely suited for writing the comprehensive integration and E2E tests required by today's complex, I/O-bound systems. By mastering the Gherkin syntax, embracing best practices for scenario and step definition, and integrating these executable specifications into an automated CI/CD pipeline, development teams can build higher-quality software with greater confidence and clarity, ensuring that what they build is always aligned with what is needed. +Behaviour-Driven Development with the `cucumber` crate offers a powerful +paradigm for building robust, well-documented, and correct Rust applications. +By shifting the focus from "testing" to "specifying behaviour," BDD fosters +collaboration and creates a shared language that bridges the gap between +business and technical teams. + +For Rust developers, the `cucumber-rs` ecosystem provides a modern, idiomatic, +and high-performance toolset. Its async-first design, combined with the safety +of the `World`-based state management, makes it uniquely suited for writing the +comprehensive integration and E2E tests required by today's complex, I/O-bound +systems. By mastering the Gherkin syntax, embracing best practices for scenario +and step definition, and integrating these executable specifications into an +automated CI/CD pipeline, development teams can build higher-quality software +with greater confidence and clarity, ensuring that what they build is always +aligned with what is needed. #### **Works cited** - 1. "Given When Then" Framework: a step-by-step guide with examples - Miro, accessed on July 14, 2025, + 1. "Given When Then" Framework: a step-by-step guide with examples — Miro, + accessed on July 14, 2025, + framework/> - 2. Is it acceptable to write a "Given When Then When Then" test in Gherkin? - Stack Overflow, accessed on July 14, 2025, + 2. Is it acceptable to write a "Given When Then When Then" test in Gherkin? + - Stack Overflow, accessed on July 14, 2025, + questions/12060011/is-it-acceptable-to-write-a-given-when-then-when-then- + test-in-gherkin> - 3. Gherkin in Testing: A Beginner's Guide | by Rafał Buczyński | Medium, accessed on July 14, 2025, + 3. 
Gherkin in Testing: A Beginner's Guide | by Rafał Buczyński | Medium, + accessed on July 14, 2025, + in-> testing-a-beginners-guide-f2e179d5e2df> - 4. Gherkin Syntax in Cucumber - Tutorialspoint, accessed on July 14, 2025, + 4. Gherkin Syntax in Cucumber - Tutorialspoint, accessed on July 14, 2025, + - 5. Given When Then - Martin Fowler, accessed on July 14, 2025, + 5. Given When Then - Martin Fowler, accessed on July 14, 2025, - 6. How To Start Writing Gherkin Test Scenarios? - [Selleo.com](http://Selleo.com), accessed on July 14, 2025, + 6. How To Start Writing Gherkin Test Scenarios? - [Selleo.com](http:// + Selleo.com), accessed on July 14, 2025, + start-writing-gherkin-test-scenarios> - 7. Reference - Cucumber, accessed on July 14, 2025, + 7. Reference - Cucumber, accessed on July 14, 2025, + docs/> gherkin/reference/> - 8. BDD (Behavior Driven Development) - ROBOT FRAMEWORK, accessed on July 14, 2025, + 8. BDD (Behaviour-Driven Development) - ROBOT FRAMEWORK, accessed on July 14, + 2025, - 9. Given-When-Then - Wikipedia, accessed on July 14, 2025, + 9. Given-When-Then - Wikipedia, accessed on July 14, 2025, -10. When to Use "Given-When-Then" Acceptance Criteria - Ranorex, accessed on July 14, 2025, +10. When to Use "Given-When-Then" Acceptance Criteria - Ranorex, accessed on + July 14, 2025, -11. Writing scenarios with Gherkin syntax | CucumberStudio Documentation, accessed on July 14, 2025, +11. Writing scenarios with Gherkin syntax | CucumberStudio Documentation, + accessed on July 14, 2025, + docs/bdd/write-gherkin-scenarios.html> -12. Introduction - Cucumber Rust Book, accessed on July 14, 2025, +12. Introduction - Cucumber Rust Book, accessed on July 14, 2025, -13. Rust BDD tests with Cucumber - DEV Community, accessed on July 14, 2025, +13. Rust BDD tests with Cucumber - DEV Community, accessed on July 14, 2025, + -14. Cucumber testing framework for Rust. Fully native, no external test runners or dependencies. - GitHub, accessed on July 14, 2025, +14. Cucumber testing framework for Rust. Fully native, no external test runners + or dependencies. - GitHub, accessed on July 14, 2025, + AidaPaul/cucumber-rust> -15. Cucumber testing framework for Rust. Fully native, no external test runners or dependencies. - GitHub, accessed on July 14, 2025, +15. Cucumber testing framework for Rust. Fully native, no external test runners + or dependencies. - GitHub, accessed on July 14, 2025, + cucumber-rs/cucumber> -16. cucumber - Rust - [Docs.rs](http://Docs.rs), accessed on July 14, 2025, +16. cucumber - Rust - [Docs.rs](http://Docs.rs), accessed on July 14, 2025, + -17. Cucumber testing framework for Rust - [Crates.io](http://Crates.io), accessed on July 14, 2025, +17. Cucumber testing framework for Rust - [Crates.io](http://Crates.io), + accessed on July 14, 2025, -18. Quickstart - Cucumber Rust Book, accessed on July 14, 2025, +18. Quickstart - Cucumber Rust Book, accessed on July 14, 2025, -19. Cucumber in Rust - Beginner's Tutorial - Florianrein's Blog, accessed on July 14, 2025, +19. Cucumber in Rust - Beginner's Tutorial - Florianrein's Blog, accessed on + July 14, 2025, + tutorial/> -20. Quickstart - Cucumber Rust Book, accessed on July 14, 2025, +20. Quickstart - Cucumber Rust Book, accessed on July 14, 2025, -21. Common Pitfalls and Troubleshooting in Cucumber - GeeksforGeeks, accessed on July 14, 2025, +21. Common Pitfalls and Troubleshooting in Cucumber - GeeksforGeeks, accessed + on July 14, 2025, + pitfalls-and-troubleshooting-in-cucumber/> -22. 
How to do error handling in Rust and what are the common pitfalls? - Stack Overflow, accessed on July 14, 2025, +22. How to do error handling in Rust and what are the common pitfalls? - + Stack Overflow, accessed on July 14, 2025, + questions/30505639/how-to-do-error-handling-in-rust-and-what-are-the-common- + pitfalls> -23. Data tables - Cucumber Rust Book, accessed on July 14, 2025, +23. Data tables - Cucumber Rust Book, accessed on July 14, 2025, -24. Cucumber Data Tables - Tutorialspoint, accessed on July 14, 2025, +24. Cucumber Data Tables - Tutorialspoint, accessed on July 14, 2025, + -25. Best practices for scenario writing | CucumberStudio Documentation - SmartBear Support, accessed on July 14, 2025, +25. Best practices for scenario writing | CucumberStudio Documentation + - SmartBear Support, accessed on July 14, 2025, -26. Cucumber Best Practices to follow for efficient BDD Testing | by KailashPathak - Medium, accessed on July 14, 2025, +26. Cucumber Best Practices to follow for efficient BDD Testing | by + KailashPathak - Medium, accessed on July 14, 2025, + pathak.medium.com/cucumber-best-practices-to-follow-for-efficient-bdd- + testing-b3eb1c7e9757> -27. Rust Solutions - WireMock, accessed on July 14, 2025, +27. Rust Solutions - WireMock, accessed on July 14, 2025, docs/solutions/rust/> -28. Unit-testing a web service in Rust - Julio Merino ([jmmv.dev](http://jmmv.dev)), accessed on July 14, 2025, +28. Unit-testing a web service in Rust - Julio Merino ([jmmv.dev](http:// + jmmv.dev)), accessed on July 14, 2025, + testing-a-web-service.html> -29. Cucumber Best Practices for Effective BDD Testing - BrowserStack, accessed on July 14, 2025, +29. Cucumber Best Practices for Effective BDD Testing - BrowserStack, accessed + on July 14, 2025, + practices-for-testing> -30. Common Challenges in Cucumber Testing and How to Overcome Them - Medium, accessed on July 14, 2025, +30. Common Challenges in Cucumber Testing and How to Overcome Them - Medium, + accessed on July 14, 2025, + challenges-in-cucumber-testing-and-how-to-overcome-them-dc95fffb43c8> -31. Cucumber in cucumber - Rust - [Docs.rs](http://Docs.rs), accessed on July 14, 2025, +31. Cucumber in cucumber - Rust - [Docs.rs](http://Docs.rs), accessed on July + 14, 2025, -32. CLI (command-line interface) - Cucumber Rust Book, accessed on July 14, 2025, +32. CLI (command-line interface) - Cucumber Rust Book, accessed on July 14, + 2025, -33. Continuous Integration - Cucumber, accessed on July 14, 2025, +33. Continuous Integration - Cucumber, accessed on July 14, 2025, -34. GitLab CI/CD examples, accessed on July 14, 2025, +34. GitLab CI/CD examples, accessed on July 14, 2025, > ci/examples/> -35. Setting up effective CI/CD for Rust projects - a short primer - [shuttle.dev](http://shuttle.dev), accessed on July 14, 2025, +35. Setting up effective CI/CD for Rust projects - a short primer - + [shuttle.dev](http://shuttle.dev), accessed on July 14, 2025, > diff --git a/docs/netsuke-design.md b/docs/netsuke-design.md index 2309950f..dd8f022d 100644 --- a/docs/netsuke-design.md +++ b/docs/netsuke-design.md @@ -262,11 +262,10 @@ Each entry in the `rules` list is a mapping that defines a reusable action. field (defaulting to `/bin/sh -e`). For `/bin/sh` scripts, each interpolation is automatically passed through the `shell_escape` filter unless a `| raw` filter is applied. Future versions will allow configurable script languages - with their own escaping rules. 
- On Windows, scripts default to `powershell -Command` unless the manifest's
- `interpreter` field overrides the setting. Exactly one of `command` or
- `script` must be provided. The manifest parser enforces this rule to prevent
- invalid states.
+ with their own escaping rules. On Windows, scripts default to
+ `powershell -Command` unless the manifest's `interpreter` field overrides
+ the setting. Exactly one of `command` or `script` must be provided. The
+ manifest parser enforces this rule to prevent invalid states.

Internally, these options deserialize into a shared `Recipe` enum tagged with a
`kind` field. Serde aliases ensure manifests that omit the tag continue to
@@ -1388,9 +1387,9 @@ possibilities for future enhancements beyond the initial scope.

## Section 10: Example Manifests

-The repository includes several complete Netsuke manifests in the
-`examples/` directory. They demonstrate how the YAML schema can be applied
-to real-world projects.
+The repository includes several complete Netsuke manifests in the
+`examples/` directory. They demonstrate how the YAML schema can be applied to
+real-world projects.

- [`basic_c.yml`](../examples/basic_c.yml): a minimal C project compiling two
  object files and linking them into a small application.
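As an aside on the tagged `Recipe` enum mentioned in the design excerpt above,
an internally tagged serde enum with aliases might look roughly like the sketch
below. This only illustrates the technique; the variant names, fields, and
alias choices are hypothetical, not Netsuke's actual definitions.

```rust
use serde::Deserialize;

/// Internally tagged: serde reads the `kind` field to pick a variant.
#[derive(Debug, Deserialize)]
#[serde(tag = "kind", rename_all = "snake_case")]
enum Recipe {
    Command {
        // An alias lets older manifests keep using a different field name.
        #[serde(alias = "cmd")]
        command: String,
    },
    Script {
        script: String,
    },
}
```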
-  - [ ] Implement the process management logic in [main.rs](http://main.rs) to invoke the ninja
-    executable as a subprocess using std::process::Command.
+  - [ ] Implement the process management logic in `main.rs` to invoke the ninja
+    executable as a subprocess using `std::process::Command`.

- **Success Criterion:**

-  - [ ] Netsuke can successfully take a Netsukefile without any Jinja
-    syntax, compile it to a [build.ninja](http://build.ninja) file, and execute it via the ninja
-    subprocess to produce the correct build artifacts.
+  - [ ] Netsuke can successfully take a Netsukefile without any Jinja syntax,
+    compile it to a `build.ninja` file, and execute it via the ninja subprocess
+    to produce the correct build artifacts.

## Phase 2: The Dynamic Engine ✨

-Objective: To integrate the minijinja templating engine, enabling dynamic
-build configurations with variables, control flow, and custom functions.
+Objective: To integrate the minijinja templating engine, enabling dynamic build
+configurations with variables, control flow, and custom functions.

- [ ] **Jinja Integration:**
@@ -86,8 +86,8 @@ build configurations with variables, control flow, and custom functions.

- [ ] **Dynamic Features and Custom Functions:**

-  - [ ] Implement support for basic Jinja control structures ({% if %}, {% for %})
-    and the foreach key for target generation.
+  - [ ] Implement support for basic Jinja control structures (`{% if %}`,
+    `{% for %}`) and the foreach key for target generation.

  - [ ] Implement the essential custom Jinja function env(var_name) to read
    system environment variables.
@@ -101,14 +101,14 @@ build configurations with variables, control flow, and custom functions.

- **Success Criterion:**

  - [ ] Netsuke can successfully build a manifest that uses variables,
-    conditional logic, the foreach loop, custom macros, and the glob()
-    function to discover and operate on source files.
+    conditional logic, the foreach loop, custom macros, and the glob() function
+    to discover and operate on source files.

## Phase 3: The "Friendly" Polish 🛡️

-Objective: To implement the advanced features that deliver a superior,
-secure, and robust user experience, focusing on security, error reporting, the
-standard library, and CLI ergonomics.
+Objective: To implement the advanced features that deliver a superior, secure,
+and robust user experience, focusing on security, error reporting, the standard
+library, and CLI ergonomics.

- [ ] **Security and Shell Escaping:**
@@ -142,7 +142,8 @@ standard library, and CLI ergonomics.

  - [ ] Implement the path and file filters (basename, dirname, with_suffix,
    realpath, contents, hash, etc.).

-  - [ ] Implement the generic collection filters (`uniq`, `flatten`, `group_by`).
+  - [ ] Implement the generic collection filters (`uniq`, `flatten`,
+    `group_by`).

  - [ ] Implement the network and command functions/filters (fetch, shell,
    grep), ensuring shell marks templates as impure to disable caching.
diff --git a/docs/rust-testing-with-rstest-fixtures.md b/docs/rust-testing-with-rstest-fixtures.md
index 4cf47893..2894ed1f 100644
--- a/docs/rust-testing-with-rstest-fixtures.md
+++ b/docs/rust-testing-with-rstest-fixtures.md
@@ -1,8 +1,8 @@
# Mastering Test Fixtures in Rust with `rstest`

Testing is an indispensable part of modern software development, ensuring code
-reliability, maintainability, and correctness. In the Rust ecosystem, while the
-built-in testing framework provides a solid foundation, managing test
+reliability, maintainability, and correctness. In the Rust ecosystem, while
+the built-in testing framework provides a solid foundation, managing test
 dependencies and creating parameterized tests can become verbose. The `rstest`
 crate (<https://crates.io/crates/rstest>) emerges as a powerful solution,
 offering a sophisticated fixture-based and parameterized testing framework that
@@ -25,8 +25,8 @@ Managing this setup and teardown logic within each test function can lead to
 considerable boilerplate code and repetition, making tests harder to read and
 maintain.

-Fixtures address this by encapsulating these dependencies and their setup logic.
-For instance, if multiple tests require a logged-in user object or a
+Fixtures address this by encapsulating these dependencies and their setup
+logic. For instance, if multiple tests require a logged-in user object or a
 pre-populated database, instead of creating these in every test, a fixture can
 provide them. This approach allows developers to focus on the specific logic
 being tested rather than the auxiliary utilities.
@@ -43,8 +43,8 @@ become shorter, more focused, and thus more readable and maintainable.

`rstest` is a Rust crate specifically designed to simplify and enhance testing
by leveraging the concept of fixtures and providing powerful parameterization
-capabilities. It is available on `crates.io` and its source code is hosted at
-<https://github.com/la10736/rstest>, distinguishing it from other software
+capabilities. It is available on `crates.io` and its source code is hosted
+at <https://github.com/la10736/rstest>, distinguishing it from other software
projects that may share the same name but operate in different ecosystems
(e.g., a JavaScript/TypeScript framework of the same name).
@@ -54,12 +54,12 @@ allow developers to define fixtures and inject them into test functions simply
by listing them as arguments. This compile-time mechanism analyzes test function
signatures and fixture definitions to wire up dependencies automatically.

-This reliance on procedural macros is a key architectural decision. It enables
-`rstest` to offer a remarkably clean and intuitive syntax at the test-writing
-level. Developers declare the dependencies their tests need, and the macros
-handle the resolution and injection. While this significantly improves the
-developer experience for writing tests, the underlying macro expansion involves
-compile-time code generation. This complexity, though hidden, can have
+This reliance on procedural macros is a key architectural decision. It
+enables `rstest` to offer a remarkably clean and intuitive syntax at the
+test-writing level. Developers declare the dependencies their tests need, and
+the macros handle the resolution and injection. While this significantly
+improves the developer experience for writing tests, the underlying macro
+expansion involves compile-time code generation. This complexity, though
+hidden, can have
implications for build times, particularly in large test suites. Furthermore,
understanding the macro expansion can sometimes be necessary for debugging
complex test scenarios or unexpected behaviour.
@@ -75,8 +75,8 @@ quality and developer productivity:
  developers to "focus on the important stuff in your tests" by abstracting
  away the setup details.
- **Reusability:** Fixtures defined with `rstest` are reusable components. A
-  single fixture, such as one setting up a database connection or creating a
-  complex data structure, can be used across multiple tests, eliminating
+  single fixture, such as one setting up a database connection or creating
+  a complex data structure, can be used across multiple tests, eliminating
   redundant setup code.
- **Reduced Boilerplate:** `rstest` significantly cuts down on repetitive setup and teardown code. Parameterization features, like `#[case]` and `#[values]`, @@ -119,8 +119,8 @@ libraries. This convention prevents testing utilities from being included in production binaries, which helps keep them small and reduces compile times for non-test builds. -When leveraging Tokio's test utilities—for example `tokio::time::pause` or the -I/O helpers in `tokio-test`—enable the `test-util` feature via a dev-only +When leveraging Tokio's test utilities—for example `tokio::time::pause` or +the I/O helpers in `tokio-test`—enable the `test-util` feature via a dev-only dependency: ```toml @@ -131,9 +131,9 @@ rstest = "0.18" ### B. Your First Fixture: Defining with `#[fixture]` -A fixture in `rstest` is essentially a Rust function that provides some data or -performs some setup action, with its result being injectable into tests. To -designate a function as a fixture, it is annotated with the `#[fixture]` +A fixture in `rstest` is essentially a Rust function that provides some data +or performs some setup action, with its result being injectable into tests. +To designate a function as a fixture, it is annotated with the `#[fixture]` attribute. Consider a simple fixture that provides a numeric value: @@ -150,8 +150,8 @@ pub fn answer_to_life() -> u32 { In this example, `answer_to_life` is a public function marked with `#[fixture]`. It takes no arguments and returns a `u32` value of 42. The `#[fixture]` macro effectively registers this function with the `rstest` system, transforming it -into a component that `rstest` can discover and utilize. The return type of the -fixture function (here, `u32`) defines the type of the data that will be +into a component that `rstest` can discover and utilize. The return type of +the fixture function (here, `u32`) defines the type of the data that will be injected into tests requesting this fixture. Fixtures can return any valid Rust type, from simple primitives to complex structs or trait objects. Fixtures can also depend on other fixtures, allowing for compositional setup. @@ -180,16 +180,16 @@ fn test_with_fixture(answer_to_life: u32) { ``` In `test_with_fixture`, the argument `answer_to_life: u32` signals to `rstest` -that the `answer_to_life` fixture should be injected. `rstest` resolves this by -name: it looks for a fixture function named `answer_to_life`, calls it, and +that the `answer_to_life` fixture should be injected. `rstest` resolves this +by name: it looks for a fixture function named `answer_to_life`, calls it, and passes its return value as the argument to the test function. The argument name in the test function serves as the primary key for fixture resolution. This convention makes usage intuitive but necessitates careful -naming of fixtures to avoid ambiguity, especially if multiple fixtures with the -same name exist in different modules but are brought into the same scope. -`rstest` generally follows Rust's standard name resolution rules, meaning an -identically named fixture can be used in different contexts depending on +naming of fixtures to avoid ambiguity, especially if multiple fixtures with +the same name exist in different modules but are brought into the same scope. +`rstest` generally follows Rust's standard name resolution rules, meaning +an identically named fixture can be used in different contexts depending on visibility and `use` declarations. ## III. 
Mastering Fixture Injection and Basic Usage
@@ -199,97 +199,97 @@ leveraging `rstest` effectively.

### A. Simple Fixture Examples

-The flexibility of `rstest` fixtures allows them to provide a wide array of data
-types and perform various setup tasks. Fixtures are not limited by the kind of
-data they can return; any valid Rust type is permissible. This enables fixtures
-to encapsulate diverse setup logic, providing ready-to-use dependencies for
-tests.
+The flexibility of `rstest` fixtures allows them to provide a wide array of
+data types and perform various setup tasks. Fixtures are not limited by the
+kind of data they can return; any valid Rust type is permissible. This enables
+fixtures to encapsulate diverse setup logic, providing ready-to-use dependencies
+for tests.

Here are a few examples illustrating different kinds of fixtures:

- **Fixture returning a primitive data type:**

-  ```rust
-  use rstest::*;
-
-  #[fixture]
-  fn default_username() -> String {
-      "test_user".to_string()
-  }
-
-  #[rstest]
-  fn test_username_length(default_username: String) {
-      assert!(default_username.len() > 0);
-  }
-  ```
+```rust
+use rstest::*;
+
+#[fixture]
+fn default_username() -> String {
+    "test_user".to_string()
+}
+
+#[rstest]
+fn test_username_length(default_username: String) {
+    assert!(default_username.len() > 0);
+}
+```

- **Fixture returning a struct:**

-  ```rust
-  use rstest::*;
-
-  struct User {
-      id: u32,
-      name: String,
-  }
-
-  #[fixture]
-  fn sample_user() -> User {
-      User {
-          id: 1,
-          name: "Alice".to_string(),
-      }
-  }
-
-  #[rstest]
-  fn test_sample_user_id(sample_user: User) {
-      assert_eq!(sample_user.id, 1);
-  }
-  ```
+```rust
+use rstest::*;
+
+struct User {
+    id: u32,
+    name: String,
+}
+
+#[fixture]
+fn sample_user() -> User {
+    User {
+        id: 1,
+        name: "Alice".to_string(),
+    }
+}
+
+#[rstest]
+fn test_sample_user_id(sample_user: User) {
+    assert_eq!(sample_user.id, 1);
+}
+```

- **Fixture performing setup and returning a resource (e.g., a mock repository):**

-  ```rust
-  use rstest::*;
-  use std::collections::HashMap;
-
-  // A simple trait for a repository
-  trait Repository {
-      fn add_item(&mut self, id: &str, name: &str);
-      fn get_item_name(&self, id: &str) -> Option<String>;
-  }
-
-  // A mock implementation
-  #[derive(Default)]
-  struct MockRepository {
-      data: HashMap<String, String>,
-  }
-
-  impl Repository for MockRepository {
-      fn add_item(&mut self, id: &str, name: &str) {
-          self.data.insert(id.to_string(), name.to_string());
-      }
-
-      fn get_item_name(&self, id: &str) -> Option<String> {
-          self.data.get(id).cloned()
-      }
-  }
-
-  #[fixture]
-  fn empty_repository() -> impl Repository {
-      MockRepository::default()
-  }
-
-  #[rstest]
-  fn test_add_to_repository(mut empty_repository: impl Repository) {
-      empty_repository.add_item("item1", "Test Item");
-      assert_eq!(empty_repository.get_item_name("item1"), Some("Test Item".to_string()));
-  }
-  ```
+```rust
+use rstest::*;
+use std::collections::HashMap;
+
+trait Repository {
+    fn add_item(&mut self, id: &str, name: &str);
+    fn get_item_name(&self, id: &str) -> Option<String>;
+}
+
+#[derive(Default)]
+struct MockRepository {
+    data: HashMap<String, String>,
+}
+
+impl Repository for MockRepository {
+    fn add_item(&mut self, id: &str, name: &str) {
+        self.data.insert(id.to_string(), name.to_string());
+    }
+
+    fn get_item_name(&self, id: &str) -> Option<String> {
+        self.data.get(id).cloned()
+    }
+}
+
+#[fixture]
+fn empty_repository() -> impl Repository {
+    MockRepository::default()
+}
+
+#[rstest]
+fn test_add_to_repository(mut empty_repository: impl Repository) {
+    empty_repository.add_item("item1", "Test Item");
+    assert_eq!(
+        empty_repository.get_item_name("item1"),
+        Some("Test Item".to_string())
+    );
+}
+```

This example demonstrates a fixture providing a mutable `Repository`
implementation.
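Because `rstest` rebuilds a fixture for every test that requests it (a point
the next passage makes precise), mutable fixtures like the repository above
stay isolated between tests. A small illustrative sketch of that behaviour:

```rust
use rstest::*;

#[fixture]
fn scratch() -> Vec<u32> {
    Vec::new() // rebuilt for every test that requests it
}

#[rstest]
fn first_test_mutates_its_copy(mut scratch: Vec<u32>) {
    scratch.push(1);
    assert_eq!(scratch.len(), 1);
}

#[rstest]
fn second_test_starts_fresh(scratch: Vec<u32>) {
    // Unaffected by any mutation performed in other tests.
    assert!(scratch.is_empty());
}
```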
@@ -301,9 +301,9 @@ This means if five different tests inject the same fixture, the fixture
function will be executed five times, and each test will receive a fresh,
independent instance of the fixture's result. This behaviour is crucial for test
isolation. The `rstest` macro effectively desugars a test like `fn the_test(injected: i32)`
-into something conceptually similar to
-`#[test] fn the_test() { let injected = injected_fixture_func(); /*... */ }`
-within the test body, implying a new call each time.
+into something conceptually similar to `#[test] fn the_test() { let injected =
+injected_fixture_func(); /* … */ }` within the test body, implying a new call
+each time.

Test isolation prevents the state of one test from inadvertently affecting
another. If fixtures were shared by default, a mutation to a fixture's state in
@@ -324,7 +324,7 @@ the `#[case]` and `#[values]` attributes.

### A. Table-Driven Tests with `#[case]`: Defining Specific Scenarios

-The `#[case(...)]` attribute enables table-driven testing, where each `#[case]`
+The `#[case(…)]` attribute enables table-driven testing, where each `#[case]`
defines a specific scenario with a distinct set of input arguments for the test
function. Arguments within the test function that are intended to receive these
values must also be annotated with `#[case]`.
@@ -354,10 +354,10 @@ fn test_fibonacci(#[case] input: u32, #[case] expected: u32) {
}
```

-For each `#[case(input_val, expected_val)]` line, `rstest` generates a separate,
-independent test. If one case fails, the others are still executed and reported
-individually by the test runner. These generated tests are often named by
-appending `::case_N` to the original test function name (e.g.,
+For each `#[case(input_val, expected_val)]` line, `rstest` generates a
+separate, independent test. If one case fails, the others are still executed
+and reported individually by the test runner. These generated tests are
+often named by appending `::case_N` to the original test function name (e.g.,
`test_fibonacci::case_1`, `test_fibonacci::case_2`, etc.), which aids in
identifying specific failing cases. This individual reporting mechanism provides
clearer feedback than a loop within a single test, where the first failure might
obscure subsequent ones.

### B. Combinatorial Testing with `#[values]`: Generating Test Matrices

-The `#[values(...)]` attribute is used on test function arguments to generate
+The `#[values(…)]` attribute is used on test function arguments to generate
tests for every possible combination of the provided values (the Cartesian
product). This is particularly useful for testing interactions between different
parameters or ensuring comprehensive coverage across various input states.
@@ -412,18 +412,18 @@ all combinations of `initial_state` and `event` specified in the `#[values]`
attributes.

It is important to be mindful that the number of generated tests can grow very
-rapidly with `#[values]`.
If a test function has three arguments, each with +ten values specified via `#[values]`, 10×10×10=1000 tests will be generated. +This combinatorial explosion can significantly impact test execution time +and even compile times. Developers must balance the desire for exhaustive +combinatorial coverage against these practical constraints, perhaps by selecting representative values or using `#[case]` for more targeted scenarios. ### C. Using Fixtures within Parameterized Tests -Fixtures can be seamlessly combined with parameterized arguments (`#[case]` or -`#[values]`) in the same test function. This powerful combination allows for -testing different aspects of a component (varied by parameters) within a +Fixtures can be seamlessly combined with parameterized arguments (`#[case]` +or `#[values]`) in the same test function. This powerful combination allows +for testing different aspects of a component (varied by parameters) within a consistent environment or context (provided by fixtures). The "Complete Example" in the `rstest` documentation hints at this synergy, stating that all features can be used together, mixing fixture variables, fixed cases, and value lists. @@ -492,9 +492,9 @@ fn test_composed_fixture_with_override(#[with("special_")] configured_item: Stri } ``` -In this example, `derived_value` depends on `base_value`, and `configured_item` -depends on `derived_value`. When `test_composed_fixture` requests -`configured_item`, `rstest` first calls `base_value()`, then +In this example, `derived_value` depends on `base_value`, and +`configured_item` depends on `derived_value`. When `test_composed_fixture` +requests `configured_item`, `rstest` first calls `base_value()`, then `derived_value(10)`, and finally `configured_item(20, "item_".to_string())`. This hierarchical dependency resolution mirrors good software design principles, promoting modularity and maintainability in test setups. @@ -514,7 +514,7 @@ use std::sync::atomic::{AtomicUsize, Ordering}; #[once] fn expensive_setup() -> &'static AtomicUsize { // Simulate expensive setup - println!("Performing expensive_setup once..."); + println!("Performing expensive_setup once…"); static COUNTER: AtomicUsize = AtomicUsize::new(0); COUNTER.fetch_add(1, Ordering::Relaxed); // To demonstrate it's called once &COUNTER @@ -536,8 +536,8 @@ When using `#[once]`, there are critical warnings: 1. **Resource Lifetime:** The value returned by an `#[once]` fixture is effectively promoted to a `static` lifetime and is **never dropped**. This means any resources it holds (e.g., file handles, network connections) that - require explicit cleanup via `Drop` will not be cleaned up automatically at - the end of the test suite. This makes `#[once]` fixtures best suited for + require explicit cleanup via `Drop` will not be cleaned up automatically + at the end of the test suite. This makes `#[once]` fixtures best suited for truly passive data or resources whose cleanup is managed by the operating system upon process exit. 2. **Functional Limitations:** `#[once]` fixtures cannot be `async` functions @@ -547,8 +547,8 @@ When using `#[once]`, there are critical warnings: attributes. If you rely on lint expectations, use `#[allow]` instead to silence false positives. -The "never dropped" behaviour arises because `rstest` typically creates a -`static` variable to hold the result of the `#[once]` fixture. `static` +The "never dropped" behaviour arises because `rstest` typically creates +a `static` variable to hold the result of the `#[once]` fixture. 
`static` variables in Rust live for the entire duration of the program, and their `Drop` implementations are not usually called at program exit. This is a crucial consideration for resource management. @@ -588,14 +588,13 @@ argument pattern to the correct source fixture. ### D. Partial Fixture Injection & Default Arguments `rstest` provides mechanisms for creating highly configurable "template" -fixtures using `#[default(...)]` for fixture arguments and `#[with(...)]` to +fixtures using `#[default(…)]` for fixture arguments and `#[with(…)]` to override these defaults on a per-test basis. -- `#[default(...)]`: Used within a fixture function's signature to provide - default values for its own arguments. -- `#[with(...)]`: Used on a test function's fixture argument (or a fixture - argument within another fixture) to supply specific values to the parameters - of the invoked fixture, overriding any defaults. +- `#[default(…)]`: Used within a fixture function's signature to provide + default values for its arguments. +- `#[with(…)]`: Applied to a fixture argument in a test (or in another + fixture) to supply explicit values and override any defaults. ```rust use rstest::*; @@ -670,17 +669,17 @@ fn check_socket_port(#[case] addr: SocketAddr, #[case] expected_port: u16) { } ``` -In this test, `rstest` sees the argument `addr: SocketAddr` and the string -literal `"127.0.0.1:8080"`. It implicitly calls +In this test, `rstest` sees the argument `addr: SocketAddr` +and the string literal `"127.0.0.1:8080"`. It implicitly calls `SocketAddr::from_str("127.0.0.1:8080")` to create the `SocketAddr` instance. -This "magic" conversion makes test definitions more concise and readable by -allowing the direct use of string representations for types that support it. -However, if the `FromStr` conversion fails (e.g., due to a malformed string), -the error will typically occur at test runtime, potentially leading to a panic. -For types with complex parsing logic or many failure modes, it might be clearer -to perform the conversion explicitly within a fixture or at the beginning of the -test to handle errors more gracefully or provide more specific diagnostic -messages. +This "magic" conversion makes test definitions more concise and readable +by allowing the direct use of string representations for types that support +it. However, if the `FromStr` conversion fails (e.g., because of a malformed +string), the error will typically occur at test runtime, potentially leading to +a panic. For types with complex parsing logic or many failure modes, it might +be clearer to perform the conversion explicitly within a fixture or at the +beginning of the test to handle errors more gracefully or provide more specific +diagnostic messages. ## VI. Asynchronous Testing with `rstest` @@ -712,12 +711,12 @@ default async runtime support, but the fixture logic can be any async code. ### B. Writing Asynchronous Tests (`async fn` with `#[rstest]`) -Test functions themselves can also be `async fn`. `rstest` will manage the -execution of these async tests. By default, `rstest` often uses -`#[async_std::test]` to annotate the generated async test functions. However, it -is designed to be largely runtime-agnostic and can be integrated with other -popular async runtimes like Tokio or Actix. This is typically done by adding the -runtime's specific test attribute (e.g., `#[tokio::test]` or +Test functions themselves can also be `async fn`. `rstest` will manage +the execution of these async tests. 
By default, `rstest` often uses
+`#[async_std::test]` to annotate the generated async test functions. However,
+it is designed to be largely runtime-agnostic and can be integrated with
+other popular async runtimes like Tokio or Actix. This is typically done
+by adding the runtime's specific test attribute (e.g., `#[tokio::test]` or
+`#[actix_rt::test]`) alongside `#[rstest]`.

```rust
@@ -739,8 +738,8 @@ async fn my_async_test(async_fixture_value: u32) {
}
```

-The order of procedural macro attributes can sometimes matter. While `rstest`
-documentation and examples show flexibility (e.g., `#[rstest]` then
+The order of procedural macro attributes can sometimes matter. While
+`rstest` documentation and examples show flexibility (e.g., `#[rstest]` then
`#[tokio::test]`, or vice versa), users should ensure their chosen async
runtime's test macro is correctly placed to provide the necessary execution
context for the async test body and any async fixtures. `rstest` itself does not
@@ -811,9 +810,9 @@ away some of the explicit `async`/`.await` mechanics.

### D. Test Timeouts for Async Tests (`#[timeout]`)

Long-running or stalled asynchronous operations can cause tests to hang
-indefinitely. `rstest` provides a `#[timeout(...)]` attribute to set a maximum
-execution time for async tests. This feature typically relies on the
-`async-timeout` feature of `rstest`, which is enabled by default.
+indefinitely. `rstest` provides a `#[timeout(…)]` attribute to set a maximum
+execution time for async tests. This feature typically relies on the
+`async-timeout` feature of `rstest`, which is enabled by default.

```rust
use rstest::*;
@@ -900,24 +899,24 @@ fn test_read_from_temp_file(temp_file_with_content: PathBuf) {
}
```

By encapsulating temporary resource management within fixtures, tests become
-cleaner and less prone to errors related to resource setup or cleanup. The RAII
-(Resource Acquisition Is Initialization) pattern, common in Rust and exemplified
-by `tempfile::TempDir` (which cleans up the directory when dropped), works
-effectively with `rstest`'s fixture model. When a regular (non-`#[once]`)
+cleaner and less prone to errors related to resource setup or cleanup. The
+RAII (Resource Acquisition Is Initialization) pattern, common in Rust and
+exemplified by `tempfile::TempDir` (which cleans up the directory when dropped),
+works effectively with `rstest`'s fixture model. When a regular (non-`#[once]`)
fixture returns a `TempDir` object, or an object that owns it, the resource is
typically cleaned up after the test finishes, as the fixture's return value goes
out of scope. This localizes resource management logic to the fixture, keeping
-the test focused on its assertions. For temporary resources, regular (per-test)
-fixtures are generally preferred over `#[once]` fixtures to ensure proper
+the test focused on its assertions. For temporary resources, regular (per-test)
+fixtures are generally preferred over `#[once]` fixtures to ensure proper
cleanup, as `#[once]` fixtures are never dropped.

### B. Mocking External Services (e.g., Database Connections, HTTP APIs)

For unit and integration tests that depend on external services like databases
or HTTP APIs, mocking is a crucial technique. Mocks allow tests to run in
-isolation, without relying on real external systems, making them faster and more
-reliable. `rstest` fixtures are an ideal place to encapsulate the setup and
-configuration of mock objects.
Crates like `mockall` can be used to create +isolation, without relying on real external systems, making them faster and +more reliable. `rstest` fixtures are an ideal place to encapsulate the setup +and configuration of mock objects. Crates like `mockall` can be used to create mocks, or they can be hand-rolled. The fixture would then provide the configured mock instance to the test. General testing advice also strongly recommends mocking external dependencies. The `rstest` documentation itself shows examples @@ -996,20 +995,20 @@ verbose, involving defining expectations, return values, and call counts) from the actual test function. Tests then simply request the configured mock as an argument. If different tests require the mock to behave differently, multiple specialized mock fixtures can be created, or fixture arguments combined with -`#[with(...)]` can be used to dynamically configure the mock's behaviour within +`#[with(…)]` can be used to dynamically configure the mock's behaviour within the fixture itself. This makes tests that depend on external services more readable and maintainable. -### C. Using `#[files(...)]` for Test Input from Filesystem Paths +### C. Using `#[files(…)]` for Test Input from Filesystem Paths For tests that need to process data from multiple input files, `rstest` provides the `#[files("glob_pattern")]` attribute. This attribute can be used on a test function argument to inject file paths that match a given glob pattern. The argument type is typically `PathBuf`. It can also inject file contents directly -as `&str` or `&[u8]` by specifying a mode, e.g., -`#[files("glob_pattern", mode = "str")]`. Additional attributes like -`#[base_dir = "…"]` can specify a base directory for the glob, and -`#[exclude("regex")]` can filter out paths matching a regular expression. +as `&str` or `&[u8]` by specifying a mode, e.g., `#[files("glob_pattern", mode += "str")]`, and additional attributes such as `#[base_dir = "…"]` can specify +a base directory for the glob, and `#[exclude("regex")]` can filter out paths +matching a regular expression. ```rust use rstest::*; @@ -1147,16 +1146,16 @@ potential trade-offs helps in deciding when and how to best utilize it. ### A. `rstest` vs. Standard Rust `#[test]` and Manual Setup -Standard Rust testing using just the `#[test]` attribute is functional but can -become verbose for scenarios involving shared setup or parameterization. +Standard Rust testing using just the `#[test]` attribute is functional but +can become verbose for scenarios involving shared setup or parameterization. `rstest` offers significant improvements in these areas: - **Fixture Management:** With standard `#[test]`, shared setup typically involves calling helper functions manually at the beginning of each test. `rstest` automates this via declarative fixture injection. - **Parameterization:** Achieving table-driven tests with standard `#[test]` - often requires writing loops inside a single test function (which has poor - failure reporting for individual cases) or creating multiple distinct + often requires writing loops inside a single test function (which has + poor failure reporting for individual cases) or creating multiple distinct `#[test]` functions with slight variations. `rstest`'s `#[case]` and `#[values]` attributes provide a much cleaner and more powerful solution. - **Readability and Boilerplate:** `rstest` generally leads to less boilerplate @@ -1168,13 +1167,13 @@ The following table summarizes key differences: **Table 1:** `rstest` **vs. 
Standard Rust** `#[test]` **for Fixture Management and Parameterization** -| Feature | Standard #[test] Approach | rstest Approach | -| ------------------------------------------------------------- | ------------------------------------------------------------- | -------------------------------------------------------------------------------- | -| Fixture Injection | Manual calls to setup functions within each test. | Fixture name as argument in #[rstest] function; fixture defined with #[fixture]. | -| Parameterized Tests (Specific Cases) | Loop inside one test, or multiple distinct #[test] functions. | #[case(...)] attributes on #[rstest] function. | -| Parameterized Tests (Value Combinations) | Nested loops inside one test, or complex manual generation. | #[values(...)] attributes on arguments of #[rstest] function. | -| Async Fixture Setup | Manual async block and .await calls inside test. | async fn fixtures, with #[future] and #[awt] for ergonomic `.await`ing. | -| Reusing Parameter Sets | Manual duplication of cases or custom helper macros. | rstest_reuse crate with #[template] and #[apply] attributes. | +| Feature | Standard #[test] Approach | rstest Approach | +| ---------------------------------------- | ------------------------------------------------------------- | -------------------------------------------------------------------------------- | +| Fixture Injection | Manual calls to setup functions within each test. | Fixture name as argument in #[rstest] function; fixture defined with #[fixture]. | +| Parameterized Tests (Specific Cases) | Loop inside one test, or multiple distinct #[test] functions. | #[case(…)] attributes on #[rstest] function. | +| Parameterized Tests (Value Combinations) | Nested loops inside one test, or complex manual generation. | #[values(…)] attributes on arguments of #[rstest] function. | +| Async Fixture Setup | Manual async block and .await calls inside test. | async fn fixtures, with #[future] and #[awt] for ergonomic `.await`ing. | +| Reusing Parameter Sets | Manual duplication of cases or custom helper macros. | rstest_reuse crate with #[template] and #[apply] attributes. | This comparison highlights how `rstest`'s attribute-based, declarative approach streamlines common testing patterns, reducing manual effort and improving the @@ -1211,25 +1210,24 @@ mind: macros expand. - **Debugging Parameterized Tests:** `rstest` generates individual test functions for parameterized cases, often named like - `test_function_name::case_N`. Understanding this naming convention is helpful - for identifying and running specific failing cases with - `cargo test test_function_name::case_N`. Some IDEs or debuggers might require - specific configurations or might not fully support stepping through the - macro-generated code as seamlessly as handwritten code, though support is - improving. + `test_function_name::case_N`. Understanding this naming convention is + helpful for identifying and running specific failing cases with `cargo test + test_function_name::case_N`. Some IDEs or debuggers might require specific + configurations or might not fully support stepping through the macro-generated + code as seamlessly as handwritten code, though support is improving. - **Static Nature of Test Cases:** Test cases (e.g., from `#[case]` or `#[files]`) are defined and discovered at compile time. 
This means the structure of
  the tests is validated by the Rust compiler, which can catch structural
  errors (like type mismatches in `#[case]` arguments or references
-  to non-existent fixtures) earlier than runtime test discovery mechanisms. This
-  compile-time validation is a strength, offering a degree of static
+  to non-existent fixtures) earlier than runtime test discovery mechanisms.
+  This compile-time validation is a strength, offering a degree of static
   verification for the test suite itself. However, it also means that
   dynamically generating test cases at runtime based on external factors (not
   known at compile time) is not directly supported by `rstest`'s core model.
- `no_std` **Support:** `rstest` generally relies on the standard library
  (`std`) being available, as test runners and many common testing utilities
-  depend on `std`. Therefore, it is typically not suitable for testing
-  `#![no_std]` libraries in a truly `no_std` test environment where the test
+  depend on `std`. Therefore, it is typically not suitable for testing
+  `#![no_std]` libraries in a truly `no_std` test environment where the test
   harness itself cannot link `std`.
- **Learning Curve:** While designed for simplicity in basic use cases, the full
  range of attributes and advanced features (e.g., fixture composition, partial
@@ -1275,15 +1273,15 @@ are logged under specific conditions.

### C. `test-with`: Conditional Testing with `rstest`

-The `test-with` crate allows for conditional execution of tests based on various
-runtime conditions, such as the presence of environment variables, the existence
-of specific files or folders, or the availability of network services. It can be
-used with `rstest`. For example, an `rstest` test could be further annotated
-with `test-with` attributes to ensure it only runs if a particular database
-configuration file exists or if a dependent web service is reachable. The order
-of macros is important: `rstest` should typically generate the test cases first,
-and then `test-with` can apply its conditional execution logic to these
-generated tests. This allows `rstest` to focus on test structure and data
+The `test-with` crate allows for conditional execution of tests based on
+various runtime conditions, such as the presence of environment variables, the
+existence of specific files or folders, or the availability of network services.
+It can be used with `rstest`. For example, an `rstest` test could be further
+annotated with `test-with` attributes to ensure it only runs if a particular
+database configuration file exists or if a dependent web service is reachable.
+The order of macros is important: `rstest` should typically generate the test
+cases first, and then `test-with` can apply its conditional execution logic to
+these generated tests. This allows `rstest` to focus on test structure and data
 provision, while `test-with` provides an orthogonal layer of control over test
 execution conditions.
@@ -1299,8 +1297,8 @@ equips developers with the tools to build comprehensive and maintainable test
 suites.

While considerations such as compile-time impact and the learning curve for
-advanced features exist, the benefits in terms of cleaner, more robust, and more
-expressive tests often outweigh these for projects with non-trivial testing
+advanced features exist, the benefits in terms of cleaner, more robust, and
+more expressive tests often outweigh these for projects with non-trivial testing
 requirements.

### A. 
Recap of `rstest`'s Power for Fixture-Based Testing @@ -1339,16 +1337,16 @@ provided by `rstest`: | ---------------------------- | -------------------------------------------------------------------------------------------- | | #[rstest] | Marks a function as an rstest test; enables fixture injection and parameterization. | | #[fixture] | Defines a function that provides a test fixture (setup data or services). | -| #[case(...)] | Defines a single parameterized test case with specific input values. | -| #[values(...)] | Defines a list of values for an argument, generating tests for each value or combination. | +| #[case(…)] | Defines a single parameterized test case with specific input values. | +| #[values(…)] | Defines a list of values for an argument, generating tests for each value or combination. | | #[once] | Marks a fixture to be initialized only once and shared (as a static reference) across tests. | | #[future] | Simplifies async argument types by removing impl Future boilerplate. | | #[awt] | (Function or argument level) Automatically .awaits future arguments in async tests. | | #[from(original_name)] | Allows renaming an injected fixture argument in the test function. | -| #[with(...)] | Overrides default arguments of a fixture for a specific test. | -| #[default(...)] | Provides default values for arguments within a fixture function. | -| #[timeout(...)] | Sets a timeout for an asynchronous test. | -| #[files("glob_pattern",...)] | Injects file paths (or contents, with mode=) matching a glob pattern as test arguments. | +| #[with(…)] | Overrides default arguments of a fixture for a specific test. | +| #[default(…)] | Provides default values for arguments within a fixture function. | +| #[timeout(…)] | Sets a timeout for an asynchronous test. | +| #[files("glob_pattern",…)] | Injects file paths (or contents, with mode=) matching a glob pattern as test arguments. | By mastering `rstest`, Rust developers can significantly elevate the quality and efficiency of their testing practices, leading to more reliable and maintainable diff --git a/docs/snapshot-testing-in-netsuke-using-insta.md b/docs/snapshot-testing-in-netsuke-using-insta.md index 30b9ab05..e42f8e65 100644 --- a/docs/snapshot-testing-in-netsuke-using-insta.md +++ b/docs/snapshot-testing-in-netsuke-using-insta.md @@ -1,6 +1,14 @@ # Snapshot Testing IR and Ninja Outputs in Netsuke -Snapshot testing with the `insta` crate provides a powerful way to ensure Netsuke’s intermediate representations and generated Ninja build files remain correct over time. According to the Netsuke design, the Intermediate Representation (IR) is a backend-agnostic build graph, and the Ninja file generation is a separate stage built on that IR. We will leverage this separation by writing **separate snapshot tests** for IR and for Ninja output. This guide covers setting up `insta`, organizing test modules and snapshot files, ensuring deterministic outputs, running the tests, and integrating them into a GitHub Actions CI workflow. +Snapshot testing with the `insta` crate provides a powerful way to ensure +Netsuke’s intermediate representations and generated Ninja build files +remain correct over time. According to the Netsuke design, the Intermediate +Representation (IR) is a backend-agnostic build graph, and the Ninja file +generation is a separate stage built on that IR. Leverage this separation by +writing **separate snapshot tests** for IR and for Ninja output. 
This guide
+covers setting up `insta`, organizing test modules and snapshot files, ensuring
+deterministic outputs, running the tests, and integrating them into a GitHub
+Actions CI workflow.

## Setting Up Insta

First, add `insta` as a development dependency in your **Cargo.toml**:

```toml
insta = "1"
```

-The `insta` crate provides macros like `assert_snapshot!` (for plain text or `Debug` snapshots) and `assert_yaml_snapshot!`/`assert_json_snapshot!` (for structured snapshots). We will use these macros in our tests. We also install the companion CLI tool `cargo-insta` for reviewing/updating snapshots (useful in CI and local development).
+The `insta` crate provides macros like `assert_snapshot!` (for plain text or
+`Debug` snapshots) and `assert_yaml_snapshot!`/`assert_json_snapshot!` (for
+structured snapshots). Use these macros in tests, and install the companion CLI
+tool `cargo-insta` for reviewing or updating snapshots (useful in CI and local
+development).

-**Project Structure:** We organize the tests in the `tests/` directory, using one module for IR snapshots and another for Ninja snapshots. Each will have its own snapshot output directory for clarity. A possible layout:
+**Project Structure:** Organize the tests in the `tests/` directory, using one
+module for IR snapshots and another for Ninja snapshots. Each module has its own
+snapshot output directory for clarity. A possible layout:

-```
+```text
netsuke/
├─ Cargo.toml
├─ src/
@@ -31,11 +45,19 @@
└─ ninja/ (snapshot files for Ninja tests)
```

-By default, `insta` will create a `tests/snapshots` directory and store snapshot data in files named after the test modules. We override this to separate IR and Ninja snapshots into subfolders. This keeps the IR and Ninja expected outputs organized and aligns with Netsuke’s design separation of IR vs. code generation.
+By default, `insta` creates a `tests/snapshots` directory and stores snapshot
+data in files named after the test modules. This configuration separates IR
+and Ninja snapshots into subfolders, keeping the expected outputs organized and
+aligning with Netsuke’s design separation of IR from code generation.

## Writing Snapshot Tests for IR Outputs

-We create a test module (e.g. **tests/ir_snapshot_tests.rs**) dedicated to IR snapshot tests. Each test will feed a Netsuke manifest (the input build specification) into the compiler’s IR generation stage, then capture the resulting IR in a stable, human-readable form. According to the design, the IR (BuildGraph) is intended to be independent of any particular backend, so we verify it in isolation here.
+A dedicated test module (e.g. **tests/ir_snapshot_tests.rs**) contains
+IR snapshot tests. Each test feeds a Netsuke manifest (the input build
+specification) into the compiler’s IR generation stage and captures the
+resulting IR in a stable, human-readable form. According to the design, the IR
+(BuildGraph) is intended to be independent of any particular backend, so it is
+verified in isolation here.

**Example IR Snapshot Test:**

@@ -80,29 +102,53 @@ fn simple_manifest_ir_snapshot() {
}
```

-In this test, we:
-
-- Construct a **deterministic** input (a small manifest with a known rule and
-  target).
-
-- Run the IR generation (`BuildGraph::from_manifest`). This function should
-  produce the intermediate build graph.
-
-- Format the IR in a consistent way for comparison. Here we use pretty-printed
-  debug output (`{:#?}`), but for more complex structures you might implement
-  `Display` or use `assert_yaml_snapshot!` to serialize the IR to YAML/JSON for
-  clarity.
-
-- Use `Settings::new().set_snapshot_path("tests/snapshots/ir")` to direct the
-  snapshot file to our IR snapshot directory. We then call `assert_snapshot!`
-  with a snapshot name (`"simple_manifest_ir"`) and the IR output string. On
-  first run, `insta` will record this output as the reference snapshot.
+This test will:
+
+- Construct a **deterministic** input (a small manifest with a known rule and
+  target).
+
+- Run the IR generation (`BuildGraph::from_manifest`). This function should
+  produce the intermediate build graph.
+
+- Format the IR consistently for comparison. Pretty-printed debug output
+  (`{:#?}`) can be used, but for more complex structures implement `Display` or
+  use `assert_yaml_snapshot!` to serialize the IR to YAML/JSON for clarity.
+
+- Use `Settings::new().set_snapshot_path("tests/snapshots/ir")` to direct the
+  snapshot file to the IR snapshot directory. Call `assert_snapshot!` with a
+  snapshot name (`"simple_manifest_ir"`) and the IR output string. On first
+  run, `insta` will record this output as the reference snapshot.
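For the structured-snapshot route mentioned in the bullets above, a minimal
sketch might look like the following. It assumes `insta` is compiled with its
YAML feature and that the IR (or a view of it) can derive `Serialize`; the
`BuildGraphView` type here is purely illustrative, not Netsuke's real IR.

```rust
use insta::assert_yaml_snapshot;
use serde::Serialize;

// Hypothetical serializable view of the IR, for illustration only.
#[derive(Serialize)]
struct BuildGraphView {
    rules: Vec<String>,
    targets: Vec<String>,
}

#[test]
fn simple_manifest_ir_yaml_snapshot() {
    let view = BuildGraphView {
        rules: vec!["cc".into()],
        targets: vec!["hello.o".into()],
    };
    // Structured snapshots diff field by field, which reads better than Debug.
    assert_yaml_snapshot!("simple_manifest_ir_yaml", view);
}
```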
+- Running the IR generation (`BuildGraph::from_manifest`). This function
+  should produce the intermediate build graph.
 
-- Use `Settings::new().set_snapshot_path("tests/snapshots/ir")` to direct the snapshot file to our IR snapshot directory. We then call `assert_snapshot!` with a snapshot name (`"simple_manifest_ir"`) and the IR output string. On first run, `insta` will record this output as the reference snapshot.
+- Formatting the IR consistently for comparison. Pretty-printed debug output
+  (`{:#?}`) can be used, but for more complex structures implement `Display` or
+  use `assert_yaml_snapshot!` to serialize the IR to YAML/JSON for clarity.
 
-**Determinism in IR Output:** To ensure consistent snapshots, the IR output must be **deterministic**. This means that given the same manifest input, the IR’s printed form should not vary between test runs or across machines. Pay attention to ordering and ephemeral data:
+- Using `Settings::new().set_snapshot_path("tests/snapshots/ir")` to direct the
+  snapshot file to the IR snapshot directory. Call `assert_snapshot!` with a
+  snapshot name (`"simple_manifest_ir"`) and the IR output string. On first
+  run, `insta` will record this output as the reference snapshot.
+
+**Determinism in IR Output:** To ensure consistent snapshots, the IR output
+must be **deterministic**. This means that given the same manifest input, the
+IR’s printed form should not vary between test runs or across machines. Pay
+attention to ordering and ephemeral data:
 
-- **Ordering:** If `BuildGraph` contains collections (e.g. sets of targets or rules), iterate or sort them in a fixed order before printing. Using `BTreeMap` or sorting vectors of targets by name can help. This avoids nondeterministic ordering from hash maps.
+- **Ordering:** If `BuildGraph` contains collections (e.g. sets of targets
+  or rules), iterate or sort them in a fixed order before printing. Using
+  `BTreeMap` or sorting vectors of targets by name can help. This avoids
+  nondeterministic ordering from hash maps.
 
-- **Stable Identifiers:** If IR includes IDs or memory addresses, prefer stable identifiers. For example, if you generate rule IDs, assign them in insertion order so they’re consistent, or omit details that can change.
+- **Stable Identifiers:** If IR includes IDs or memory addresses, prefer stable
+  identifiers. For example, when generating rule IDs, assign them in insertion
+  order so they are consistent, or omit details that can change.
 
-- **No timestamps or environment-specific data:** The IR should not include timestamps, random values, or absolute file system paths. If such data is unavoidable, use `insta` redactions or post-process the output to replace them with placeholders (e.g., ``).
+- **No timestamps or environment-specific data:** The IR should not include
+  timestamps, random values, or absolute file system paths. If such data is
+  unavoidable, use `insta` redactions or post-process the output to replace
+  them with placeholders (e.g., ``).
 
-By making the IR snapshot output stable, the snapshot tests will reliably catch regressions. If the IR generation logic changes intentionally (e.g., new fields added), the snapshot will change in a predictable way, prompting a review.
+By making the IR snapshot output stable, the snapshot tests will reliably
+catch regressions; the sketch below shows the ordering fix in miniature.
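+
+The types here are hypothetical stand-ins (Netsuke’s real `BuildGraph` will
+differ); the point is the pattern of pinning iteration order before the
+snapshot assertion:
+
+```rust
+use std::collections::HashMap;
+
+// Hypothetical IR shape, for illustration only.
+struct Target {
+    rule: String,
+    inputs: Vec<String>,
+}
+
+struct BuildGraph {
+    targets: HashMap<String, Target>,
+}
+
+/// Render the graph in a fixed order so the snapshot does not depend on
+/// `HashMap` iteration order.
+fn render_ir(graph: &BuildGraph) -> String {
+    // Sort target names to pin the iteration order.
+    let mut names: Vec<&String> = graph.targets.keys().collect();
+    names.sort();
+
+    let mut out = String::new();
+    for name in names {
+        let target = &graph.targets[name];
+        // Sort inputs too, in case they were collected nondeterministically.
+        let mut inputs = target.inputs.clone();
+        inputs.sort();
+        out.push_str(&format!(
+            "target {name}: rule={} inputs={inputs:?}\n",
+            target.rule
+        ));
+    }
+    out
+}
+```
+
+A test can then pass `render_ir(&graph)` to `assert_snapshot!`; any
+environment-specific fragments that survive rendering can still be scrubbed
+with `insta` redactions, as noted above.
+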
+If the IR generation logic changes intentionally (e.g., new fields added), the
+snapshot will change predictably, prompting a review.
 
 ## Writing Snapshot Tests for Ninja File Output
 
-Next, we create **tests/ninja_snapshot_tests.rs** to verify Ninja build file generation separately. This stage takes the IR (BuildGraph) and produces a Ninja build script (usually the contents of a `build.ninja` file). Because Netsuke’s design cleanly separates IR building from code generation, we can use the same manifest (or multiple manifest scenarios) to test the Ninja output specifically.
+Next, create **tests/ninja_snapshot_tests.rs** to verify Ninja build file
+generation separately. This stage takes the IR (BuildGraph) and produces a Ninja
+build script (usually the contents of a `build.ninja` file). Because Netsuke’s
+design cleanly separates IR building from code generation, it is possible to
+use the same manifest (or multiple manifest scenarios) to test the Ninja output
+specifically.
 
 **Example Ninja Snapshot Test:**
 
@@ -134,7 +180,8 @@ fn simple_manifest_ninja_snapshot() {
     .expect("Ninja file generation succeeded");
 
     // The output is a multi-line Ninja build script (as a String)
-    // Ensure it's deterministic (e.g., consistent ordering of rules/targets)
+    // Ensure the output is deterministic
+    // (e.g., consistent ordering of rules/targets)
     Settings::new()
         .set_snapshot_path("tests/snapshots/ninja")
        .bind(|| {
@@ -145,19 +192,38 @@
 
 Key points for Ninja snapshot tests:
 
-- We still use a known manifest input and first derive the IR (you could also construct an IR directly for tests, but using the manifest->IR pipeline ensures we’re testing realistic usage).
-
-- Call the Ninja generation function (e.g. `ninja_gen::generate_ninja`) which produces the Ninja file contents as a `String`. This function should traverse the IR and output rules and build statements in Ninja syntax.
-
-- As with IR, **determinism is crucial**. The Ninja output should list rules, targets, and dependencies in a consistent order. For example, if the IR doesn’t already preserve order, you may need to sort targets by name or ensure that hashing/deduplication doesn’t cause random order. The design’s approach of consolidating rules by a hash of their properties should still produce the same ordering given the same input, as long as iteration over hashmaps is avoided or stabilized.
-
-- We again use `Settings::set_snapshot_path` to store these snapshots in a separate `tests/snapshots/ninja` directory. The snapshot name `"simple_manifest_ninja"` identifies this particular scenario.
-
-With this setup, IR tests and Ninja tests have distinct snapshot files. For example, after running tests the first time (see next section), you might have `tests/snapshots/ir/simple_manifest_ir.snap` and `tests/snapshots/ninja/simple_manifest_ninja.snap` (or combined snapshot files per test module). These snapshot files contain the expected IR debug output and Ninja file text respectively.
+- Use a known manifest input and first derive the IR. An IR can also be
+  constructed directly for tests, but using the manifest→IR pipeline ensures
+  realistic coverage.
+
+- Call the Ninja generation function (e.g. `ninja_gen::generate_ninja`), which
+  produces the Ninja file contents as a `String`. This function traverses the IR
+  and outputs rules and build statements in Ninja syntax.
+
+- As with IR, **determinism is crucial**. The Ninja output should list rules,
+  targets, and dependencies in a consistent order.
+  For example, if the IR does not preserve order, sort targets by name before
+  emitting them, and make sure that hashing or deduplication does not introduce
+  random ordering. The design’s approach of consolidating rules by a hash of
+  their properties should still produce the same ordering given the same input,
+  as long as iteration over hashmaps is avoided or stabilized.
+
+- Use `Settings::set_snapshot_path` to store these snapshots in a separate
+  `tests/snapshots/ninja` directory. The snapshot name `"simple_manifest_ninja"`
+  identifies this particular scenario.
+
+With this setup, IR tests and Ninja tests have distinct snapshot files. For
+example, after the first test run (see next section), expected snapshot files
+include `tests/snapshots/ir/simple_manifest_ir.snap` and
+`tests/snapshots/ninja/simple_manifest_ninja.snap` (or combined snapshot files
+per test module). These snapshot files contain the expected IR debug output and
+the Ninja file text, respectively.
 
 ## Running and Updating Snapshot Tests
 
-To execute the snapshot tests, run `cargo test`. All tests (including our new snapshot tests) will run. On the first run (or whenever a snapshot differs from expectations), you will see test failures indicating snapshot changes.
+To execute the snapshot tests, run `cargo test`. All tests (including the new
+snapshot tests) will run. On the first run (or whenever a snapshot differs from
+expectations), test failures will indicate snapshot changes.
 
 **Example:**
 
@@ -179,29 +245,50 @@ snapshot assertion for `simple_manifest_ninja` failed in "tests/ninja_snapshot_t
…
 ```
 
-On first run, `insta` (with default `INSTA_UPDATE=auto`) will write new snapshot files for you and mark the tests as failed so you can review them. You’ll find `.snap` files (or `.snap.new` if not auto-approved) in the `tests/snapshots/` subdirectories.
+On the first run, `insta` (with default `INSTA_UPDATE=auto`) writes new snapshot
+files and marks the tests as failed for review. `.snap` files (or `.snap.new` if
+not auto-approved) appear in the `tests/snapshots/` subdirectories.
 
-**Reviewing and Accepting Snapshots:** Use the `cargo-insta` CLI to review and accept these new snapshots:
+**Reviewing and Accepting Snapshots:** Use the `cargo-insta` CLI to review and
+accept these new snapshots:
 
-- Run `cargo insta review` to interactively inspect differences. This will show a diff between old and new snapshot contents for each test. Since this is the first run, it will just show the new content.
+- Run `cargo insta review` to interactively inspect differences. This displays
+  a diff between old and new snapshot contents for each test. Since this is the
+  first run, it only shows the new content.
 
-- In the review UI, you can accept the new snapshots (press `A` for all or accept individually). `cargo-insta` will then move the `.snap.new` files to replace the old snapshots (or create the `.snap` files if they didn’t exist).
+- Accept the new snapshots using the review interface. `cargo-insta` then moves
+  the `.snap.new` files to replace the old snapshots, or creates the `.snap`
+  files if they did not exist.
 
-- As an alternative, if you are confident in the outputs, you can run `cargo insta accept --all` to accept all changes in one go.
+- As an alternative, when confident in the outputs, run `cargo insta accept
+  --all` to accept all changes in one go.
 
-Once accepted, re-run `cargo test` – it should pass, because the recorded snapshots now match the output. Commit the new/updated `.snap` files to version control. **Always include the snapshot files** so that CI can validate against them.
+Once accepted, re-run `cargo test` – it should pass because the recorded
+snapshots now match the output. Commit the new/updated `.snap` files to version
+control. **Always include the snapshot files** so that CI can validate against
+them.
 
-**Deterministic Failures:** If a snapshot test fails unexpectedly in the future, it means the IR or Ninja output changed. This could reveal a regression or a legitimate update:
+**Deterministic Failures:** If a snapshot test fails unexpectedly in the future,
+it means the IR or Ninja output changed. This could reveal a regression or a
+legitimate update:
 
-- If it’s an intended change (e.g., you modified the IR structure or Ninja output format as part of a feature), update the snapshots by reviewing and accepting the changes, and include the updated `.snap` files in your commit.
+- For an intended change (e.g., the IR structure or Ninja output format was
+  updated as part of a feature), review and accept the new snapshots, then
+  include the updated `.snap` files in the commit.
 
-- If it’s unintended, investigate the differences. Snapshot diffs make it clear what changed (e.g., a rule name, dependency order, etc.), helping pinpoint the issue.
+- If it is unintended, investigate the differences. Snapshot diffs make the
+  change clear (e.g., a rule name, dependency order, etc.) and help pinpoint
+  the issue.
 
 ## Integrating Snapshot Tests into GitHub Actions CI
 
-Automating snapshot tests in CI ensures that changes to Netsuke don’t introduce regressions without notice. We can use GitHub Actions to run `cargo test` (which includes our snapshot tests) on every push or pull request. Here’s how to set it up:
+Automating snapshot tests in CI ensures that changes to Netsuke do not introduce
+regressions without notice. Use GitHub Actions to run `cargo test` (which
+includes the snapshot tests) on every push or pull request. Here’s how to set
+it up:
 
-**1. CI Workflow Setup:** In your repository (e.g., `.github/workflows/test.yml`), use a Rust toolchain action and run tests. For example:
+**1. CI Workflow Setup:** In the repository (e.g.,
+`.github/workflows/test.yml`), use a Rust toolchain action and run tests. For
+example:
 
```yaml
name: Rust CI
@@ -242,36 +329,67 @@ jobs:
 
 **Notes:**
 
-- We set `INSTA_UPDATE: no` in CI to disable automatic snapshot creation or updating. This means if a snapshot is missing or differs, the tests will **fail** (as they should in CI). The default `auto` mode already treats CI specially (it won’t auto-accept in CI), but setting `no` is an explicit safeguard.
+- Setting `INSTA_UPDATE: no` in CI disables automatic snapshot creation or
+  updating. If a snapshot is missing or differs, the tests **fail**. The default
+  `auto` mode already treats CI specially (it will not auto-accept in CI), but
+  setting `no` is an explicit safeguard.
 
-- We install `cargo-insta` mainly for completeness – running `cargo test` does not strictly require the CLI tool, but having it available can allow using `cargo insta` subcommands in CI if needed (for example, to print a summary or ensure no unused snapshots with `cargo insta test --unreferenced=reject`).
+- Install `cargo-insta` mainly for completeness – running `cargo test` does not
+  strictly require the CLI tool, but its presence enables `cargo insta`
+  subcommands in CI if needed (for example, to print a summary or ensure no
+  unused snapshots with `cargo insta test --unreferenced=reject`).
-- The caches for Cargo help speed up CI. Ensure you include the snapshot files in the repository so that tests can find the expected outputs.
+- The caches for Cargo help speed up CI. Ensure you include the snapshot files
+  in the repository so that tests can find the expected outputs.
 
-**2. Handling Snapshot Changes in CI:** In a typical workflow, CI will run tests and either pass or fail:
+**2. Handling Snapshot Changes in CI:** In a typical workflow, CI will run tests
+and either pass or fail:
 
 - If all snapshots match, CI passes. No action needed.
 
-- If a snapshot test fails (meaning the IR or Ninja output changed), the CI job fails. Developers should then pull those changes locally, run `cargo insta review`, accept the new snapshot if it’s intended, and commit the updated snapshot file. **Do not automatically accept snapshots in CI** – it’s important to review changes to catch unintended alterations.
+- If a snapshot test fails (indicating that the IR or Ninja output changed),
+  the CI job fails. Developers should pull the changes locally, run
+  `cargo insta review`, accept the new snapshot if it is intended, and commit
+  the updated snapshot file. **Do not automatically accept snapshots in CI** –
+  reviewing changes is essential to catch unintended alterations.
 
-You can enhance the CI process by making snapshot reviews easier:
+The CI process can be enhanced to make snapshot reviews easier:
 
-- Use `actions/upload-artifact` to upload the `.snap.new` files or diff results when tests fail, so they can be downloaded from the CI logs for inspection.
+- Use `actions/upload-artifact` to upload the `.snap.new` files or diff results
+  when tests fail so they can be downloaded from the workflow run for
+  inspection.
 
-- Or run `cargo insta test --diff` in CI to print diffs to the log for quick viewing of what changed (the `INSTA_OUTPUT` env var can control diff vs summary output).
+- Or run `cargo insta test --diff` in CI to print diffs to the log for quick
+  viewing of what changed (the `INSTA_OUTPUT` env var can control diff vs
+  summary output).
 
-However, the simplest approach is to let `cargo test` report failures and use those as a signal to update snapshots locally.
+However, the simplest approach is to let `cargo test` report failures and use
+those as a signal to update snapshots locally.
 
 ## Conclusion
 
-By introducing snapshot tests for both the IR and the Ninja output, we adhere to Netsuke’s design principles and gain confidence in each stage of the build process. The IR tests verify that the manifest-to-IR transformation produces the expected build graph structure independently of any output format. The Ninja snapshot tests then verify that the IR-to-Ninja translation is correct. Both sets of tests use deterministic outputs to ensure consistent, meaningful snapshots.
-
-With the `insta` crate, adding new test cases is straightforward – simply create a manifest (or multiple variants) and assert that the IR or Ninja output matches the snapshot. The snapshot files serve as living documentation of the expected build graph and build script for given scenarios. Integrated into GitHub Actions, this testing framework will catch regressions early: any change in Netsuke’s IR logic or code generation will surface as a snapshot diff, prompting careful review.
-
-Using this structured snapshot testing approach, you can confidently evolve the Netsuke project while preserving the correctness of its core compilation pipeline. Happy testing!
+Introducing snapshot tests for both the IR and the Ninja output adheres to +Netsuke’s design principles and increases confidence in each stage of the build +process. The IR tests verify that the manifest-to-IR transformation produces +the expected build graph structure independently of any output format. The +Ninja snapshot tests then verify that the IR-to-Ninja translation is correct. +Both sets of tests use deterministic outputs to ensure consistent, meaningful +snapshots. + +With the `insta` crate, adding new test cases is straightforward – simply create +a manifest (or multiple variants) and assert that the IR or Ninja output matches +the snapshot. The snapshot files serve as living documentation of the expected +build graph and build script for given scenarios. Integrated into GitHub +Actions, this testing framework will catch regressions early: any change in +Netsuke’s IR logic or code generation will surface as a snapshot diff, prompting +careful review. + +This structured snapshot testing approach enables confident evolution of +the Netsuke project while preserving the correctness of its core compilation +pipeline. **Sources:** - Netsuke Design/Roadmap – separation of IR and Ninja generation -- Insta crate documentation – usage of snapshot assertions and CI integration guidelines +- Insta crate documentation – usage of snapshot assertions and CI integration + guidelines