From 18d98745f8a68a27c54aa0ddc6c3e5b22f38f6e0 Mon Sep 17 00:00:00 2001
From: Gokul-social
Date: Tue, 24 Feb 2026 14:56:22 +0530
Subject: [PATCH 1/2] docs: restore original nav order, avoid unintended reordering

---
 mkdocs.yml | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mkdocs.yml b/mkdocs.yml
index d174e474..8587ea7e 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -13,14 +13,16 @@ theme:
     - content.code.copy
 
 nav:
-  - Intro: index.md
+  - OpenML Server: index.md
   - Getting Started: installation.md
+
   - Contributing:
     - contributing/index.md
     - Development: contributing/contributing.md
     - Tests: contributing/tests.md
     - Documentation: contributing/documentation.md
     - Project Overview: contributing/project_overview.md
+
   - Changes: migration.md
 
 markdown_extensions:

From 4de5d384aeac014f561044444d7c06c4aac9935f Mon Sep 17 00:00:00 2001
From: Gokul-social
Date: Wed, 25 Feb 2026 23:19:56 +0530
Subject: [PATCH 2/2] docs: rewrite tests.md with structured developer documentation

Addresses #169. Replaces informal benchmark notes with a structured guide
covering the four existing test categories (migration, integration, direct
database, direct function), performance tradeoffs, mocking philosophy, and
running instructions.
---
 docs/contributing/tests.md | 212 +++++++++++++++++++++----------------
 1 file changed, 119 insertions(+), 93 deletions(-)

diff --git a/docs/contributing/tests.md b/docs/contributing/tests.md
index 5886d9e1..c061215d 100644
--- a/docs/contributing/tests.md
+++ b/docs/contributing/tests.md
@@ -1,98 +1,124 @@
 # Writing Tests
 
-tl;dr:
- - Setting up the `py_api` fixture to test directly against a REST API endpoint is really slow, only use it for migration/integration tests.
- - Getting a database fixture and doing a database call is slow, consider mocking if appropriate.
-
-## Overhead from Fixtures
-Sometimes, you want to interact with the REST API through the `py_api` fixture,
-or want access to a database with `user_test` or `expdb_test` fixtures.
-Be warned that these come with considerable relative overhead, which adds up when running thousands of tests.
-
-```python
-@pytest.mark.parametrize('execution_number', range(5000))
-def test_private_dataset_owner_access(
-        execution_number,
-        expdb_test: Connection,
-        user_test: Connection,
-        py_api: TestClient,
-) -> None:
-    fetch_user(ApiKey.REGULAR_USER, user_test)  # accesses only the user db
-    get_estimation_procedures(expdb_test)  # accesses only the experiment db
-    py_api.get("/does/not/exist")  # only queries the api
-    pass
+This page documents the current testing strategy in this repository.
+It is intentionally descriptive: it explains which test layers exist today and when each layer is used.
+
+## Quick summary
+
+- Use the lightest test layer that verifies the behavior you are changing.
+- `py_api` (`fastapi.testclient.TestClient`) is intentionally used for integration and migration checks.
+- Direct database tests verify SQL/database behavior.
+- Direct function tests verify application logic with minimal fixture overhead.
+- Mocking is used selectively to keep tests fast, while still validating real database behavior in dedicated tests.
+
+## Test infrastructure in this repository
+
+The core fixtures are defined in `tests/conftest.py`:
+
+- `expdb_test` and `user_test` provide transactional database connections.
+- `py_api` creates a FastAPI `TestClient` and overrides dependencies to use those transactional connections.
+- `php_api` provides an HTTP client to the legacy PHP API for migration comparisons.
+
+The transactional fixtures use rollback semantics, so most tests can mutate data without persisting changes.
+
+## Test categories
+
+### 1) Migration tests
+
+Migration tests compare Python API responses against the legacy PHP API for equivalent endpoints.
+These tests live under `tests/routers/openml/migration/`.
+
+Characteristics:
+
+- Use both `py_api` and `php_api` fixtures.
+- Compare response status and response body (with explicit normalization where old/new formats differ).
+- Focus on compatibility guarantees during migration.
+
+Typical examples include dataset, flow, task, study, and evaluation migration checks.
+
+### 2) Integration tests (FastAPI TestClient)
+
+Integration tests call Python API endpoints through `py_api` and assert end-to-end behavior from routing to serialization.
+Most endpoint-focused tests under `tests/routers/openml/` use this style.
+
+Characteristics:
+
+- Exercise request/response handling via HTTP calls to the in-process FastAPI app.
+- Use real dependency wiring (with test database connections injected via fixture overrides).
+- Validate returned status codes and payloads as clients see them.
+
+This layer is broader than direct function/database tests, but also has higher execution cost.
+
+### 3) Direct database tests
+
+Direct database tests call functions in `src/database/*` with `expdb_test`/`user_test` connections.
+Examples are in `tests/database/`.
+
+Characteristics:
+
+- Focus on query behavior and returned records.
+- Avoid HTTP/TestClient overhead.
+- Validate persistence-layer behavior directly against the test database.
+
+Use this layer when the change is primarily in SQL access or data retrieval logic.
+
+### 4) Direct function tests
+
+Direct function tests call router or dependency functions directly (without HTTP requests), often with lightweight fixtures and selective mocks.
+Examples include tests that call functions such as `flow_exists(...)` or `get_dataset(...)` directly.
+
+Characteristics:
+
+- Validate function-level control flow and error handling.
+- Can mock lower-level calls where appropriate.
+- Keep runtime low compared with full TestClient tests.
+
+These tests are useful for fast feedback on logic that does not require full HTTP-level verification.
+
+## Performance tradeoffs
+
+Fixture setup has measurable cost.
+In existing measurements, creating `py_api` is significantly more expensive than direct function/database-level testing, and database fixtures also add overhead.
+
+Practical implications:
+
+- Prefer direct function or direct database tests when they can validate the behavior sufficiently.
+- Reserve `py_api` usage for cases where endpoint-level integration behavior is the target.
+- Keep migration tests focused, because they combine multiple expensive dependencies.
+
+This keeps local feedback cycles fast while preserving endpoint and compatibility coverage where required.
+
+## Design philosophy: limited mocking
+
+Mocking is used to reduce runtime and isolate logic when full database interaction is not required.
+At the same time, this repository keeps mocking limited by pairing it with real database coverage for the same entities/paths.
+
+Why this balance is used:
+
+- Mock-based tests are fast and targeted.
+- Database-backed tests verify actual query/schema behavior.
+- Together they reduce risk that mocked behavior diverges from real database behavior.
+
+In short: mock for speed and focus, but keep real database tests for behavioral truth.
+
+## Running tests
+
+Run all tests (from the Python API container):
+
+```bash
+python -m pytest tests
 ```
 
-When individually adding/removing components, we measure (for 5000 repeats, n=1):
-
-| expdb | user | api | exp call | user call | api get | time (s) |
-|-------|------|-----|----------|-----------|---------|----------:|
-| ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1.78 |
-| ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | 3.45 |
-| ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 3.22 |
-| ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | 298.48 |
-| ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | 4.44 |
-| ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | 285.69 |
-| ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | 4.91 |
-| ❌ | ✅ | ❌ | ❌ | ✅ | ❌ | 5.81 |
-| ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 307.91 |
-
-Adding a fixture that just returns some value adds only minimal overhead (1.91s),
-so the burden comes from establishing the database connection itself.
-
-We make the following observations:
-
-- Adding a database fixture adds the same overhead as instantiating an entirely new test.
-- Overhead of adding multiple database fixtures is not free
-- The `py_api` fixture adds two orders of magnitude more overhead
-
-We want our tests to be fast, so we want to avoid using these fixtures when we reasonably can.
-We restrict usage of `py_api` fixtures to integration/migration tests, since it is very slow.
-These only run on CI before merges.
-For database fixtures
-
-We will write some fixtures that can be used to e.g., get a `User` without accessing the database.
-The validity of these users will be tested against the database in only a single test.
-
-### Mocking
-Mocking can help us reduce the reliance on database connections in tests.
-A mocked function can prevent accessing the database, and instead return a predefined value instead.
-
-It has a few upsides:
- - It's faster than using a database fixture (see below).
- - The test is not dependent on the database: you can run the test without a database.
-
-But it also has downsides:
- - Behavior changes in the database, such as schema changes, are not automatically reflected in the tests.
- - The database layer (e.g., queries) are not actually tested.
-
-Basically, the mocked behavior may not match real behavior when executed on a database.
-For this reason, for each mocked entity, we should add a test that verifies that if the database layer
-is invoked with the database, it returns the expected output that matches the mock.
-This is additional overhead in development, but hopefully it pays back in more granular test feedback and faster tests.
-
-On the speed of mocks, consider these two tests:
-
-```diff
-@pytest.mark.parametrize('execution_number', range(5000))
-def test_private_dataset_owner_access(
-        execution_number,
-        admin,
-+       mocker,
--       expdb_test: Connection,
-) -> None:
-+   mock = mocker.patch('database.datasets.get')
-+   class Dataset(NamedTuple):
-+       uploader: int
-+       visibility: Visibility
-+   mock.return_value = Dataset(uploader=1, visibility=Visibility.PRIVATE)
-
-    _get_dataset_raise_otherwise(
-        dataset_id=1,
-        user=admin,
--       expdb=expdb_test,
-+       expdb=None,
-    )
+Run a focused test module:
+
+```bash
+python -m pytest tests/routers/openml/datasets_test.py
 ```
-There is only a single database call in the test. It fetches a record on an indexed field and does not require any joins.
-Despite the database call being very light, the database-included test is ~50% slower than the mocked version (3.50s vs 5.04s).
+
+Run by marker expression (example):
+
+```bash
+python -m pytest -m "not slow"
+```
+
+See `pyproject.toml` for current marker definitions (including `slow` and `mut`).