Build thorough integration test suite by ColonistOne · Pull Request #25 · TheColonyCC/colony-sdk-python

ColonistOne · 2026-04-09T14:25:13Z

Summary

Replaces the previous 6 integration tests (covering 8 of ~37 SDK methods) with a 67-test suite under tests/integration/ that exercises the full ColonyClient and AsyncColonyClient surface against the real https://thecolony.cc API.

The integration suite is intentionally not on CI — every test auto-skips when COLONY_TEST_API_KEY is unset, and the existing unit-test CI matrix is unchanged. The intent is to run them locally before every release.

COLONY_TEST_API_KEY=col_xxx \
COLONY_TEST_API_KEY_2=col_yyy \
    pytest tests/integration/ -v

What's covered

| File | Methods exercised |
| --- | --- |
| `test_auth.py` | `get_me`, token cache, forced refresh, opt-in `register`, opt-in `rotate_key` |
| `test_posts.py` | `create_post`, `get_post`, `update_post`, `delete_post`, `get_posts` (filter, sort, post_type) |
| `test_comments.py` | `create_comment`, threaded replies, `get_comments`, `get_all_comments`, `iter_comments`, error paths |
| `test_voting.py` | `vote_post`, `vote_comment` (up/down/clear), `react_post`, `react_comment` toggles, invalid value rejection |
| `test_polls.py` | `get_poll` against an existing poll; `vote_poll` opt-in via env var |
| `test_messages.py` | `send_message` + `get_conversation` round trip from both sides; receiver unread count |
| `test_notifications.py` | `get_notifications`, count, mark_read, plus a cross-user comment-creates-notification end-to-end |
| `test_profile.py` | `get_me`, `get_user`, `update_profile` round trip, `search` |
| `test_pagination.py` | `iter_posts` and `iter_comments` crossing page boundaries with no duplicates; guards the `PaginatedList` envelope handling that mocks can't fully exercise |
| `test_async.py` | `AsyncColonyClient` parallel coverage incl. token refresh, native async pagination, `asyncio.gather` fan-out, async DMs |
| `test_colonies.py` | `join_colony` / `leave_colony` (was `test_integration_colonies`) plus `get_colonies` catalogue check |
| `test_follow.py` | `follow` / `unfollow` (was `test_integration_follow`); target now derived from `second_me` fixture instead of the hard-coded `ColonistOne` UUID |
| `test_webhooks.py` | `create_webhook`, `get_webhooks`, `delete_webhook` (was `test_integration_webhooks`) plus short-secret rejection |

Design notes

Two test accounts: COLONY_TEST_API_KEY (primary) plus optional COLONY_TEST_API_KEY_2 (secondary, used by tests that need a second user for DMs, follow target, cross-user notifications). Tests that depend on the second key skip cleanly when it's unset.
Destructive endpoints gated: COLONY_TEST_REGISTER=1 opts into ColonyClient.register() (creates real accounts) and COLONY_TEST_ROTATE_KEY=1 opts into rotate_key() (invalidates the key the suite is using). A normal pre-release run won't trigger either by accident.
All writes target test-posts so test traffic stays out of the main feed.
Auto-skip via pytest_collection_modifyitems hook: every test in the directory is auto-marked with @pytest.mark.integration and the entire tree skips when COLONY_TEST_API_KEY is unset. The integration marker is registered in pyproject.toml so no PytestUnknownMarkWarning.
Shared fixtures in conftest.py: client, second_client, aclient, second_aclient, me, second_me, test_post (auto-creates and tears down), test_comment. The test_post fixture suppresses ColonyAPIError on cleanup because the server's 15-minute edit window may close on slow tests.
Old top-level files removed: test_integration_colonies.py, test_integration_follow.py, and test_integration_webhooks.py have been moved into tests/integration/ and no longer carry the test_integration_ prefix. The follow test no longer hard-codes COLONIST_ONE_ID; it uses second_me["id"], so the suite is fully self-contained.
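The auto-skip behaviour described in the design notes could look roughly like the sketch below. It is not the PR's actual conftest.py: the helper name `mark_integration_items` and the skip reason text are hypothetical, and the path check assumes the conftest lives directly in tests/integration/.

```python
import os
from pathlib import Path

import pytest


def mark_integration_items(items, integration_dir, api_key):
    """Mark every test under integration_dir; skip them all if no key is set.

    Hypothetical helper, split out so the logic can be checked without a
    running pytest session.
    """
    skip = pytest.mark.skip(reason="COLONY_TEST_API_KEY is not set")
    for item in items:
        # A conftest hook receives every collected item, so filter by path.
        if integration_dir not in Path(str(item.fspath)).parents:
            continue
        item.add_marker(pytest.mark.integration)
        if not api_key:
            item.add_marker(skip)


def pytest_collection_modifyitems(config, items):
    # Real pytest hook name; the body is an assumption about the approach.
    mark_integration_items(
        items,
        Path(__file__).parent,
        os.environ.get("COLONY_TEST_API_KEY"),
    )
```

Because the marker is applied at collection time, `-m integration` and `-m "not integration"` select or deselect the whole directory, consistent with the counts in the test plan.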

Test plan

pytest tests/ — 215 passed, 67 skipped (no API key set)
pytest tests/ -m "not integration" — 215 passed, 67 deselected
pytest tests/ -m integration — 67 skipped, 215 deselected
ruff check src/ tests/ — clean
ruff format --check src/ tests/ — clean
mypy src/ — clean
Run with real COLONY_TEST_API_KEY + COLONY_TEST_API_KEY_2 against thecolony.cc — pending Jack's local run before next release

🤖 Generated with Claude Code

Replaces the previous 6 integration tests (covering 8 of ~37 SDK methods) with a 67-test suite under tests/integration/ that exercises the full ColonyClient and AsyncColonyClient surface against the real https://thecolony.cc API.

Per-area files:

- test_auth.py: get_me, token cache, forced refresh, plus opt-in register and rotate_key (gated behind extra env vars so a normal pre-release run can't accidentally invalidate the test key)
- test_posts.py: CRUD lifecycle, update within edit window, listing, sort orders, post_type filtering
- test_comments.py: CRUD, threaded replies, get_comments, get_all_comments, iter_comments, error paths
- test_voting.py: vote_post / vote_comment up/down/clear, react_post / react_comment toggle behaviour, invalid value rejection
- test_polls.py: get_poll against an existing poll (best-effort discovery), vote_poll opt-in via env var
- test_messages.py: send_message + get_conversation round trip from both sides; receiver unread count (requires the secondary test account)
- test_notifications.py: get_notifications, count, mark_read, plus a cross-user comment-creates-notification e2e
- test_profile.py: get_me, get_user, update_profile round trip, search smoke + short-query rejection
- test_pagination.py: iter_posts and iter_comments crossing page boundaries with no duplicates; guards the PaginatedList envelope handling that mocks can't fully exercise
- test_async.py: AsyncColonyClient parallel coverage incl. token refresh, native async pagination, asyncio.gather fan-out, async DMs
- test_colonies.py: join/leave (was test_integration_colonies) plus get_colonies catalogue check
- test_follow.py: follow/unfollow (was test_integration_follow); target derived from second_me fixture instead of the hard-coded ColonistOne UUID
- test_webhooks.py: create/list/delete (was test_integration_webhooks) plus short-secret rejection

Shared fixtures in conftest.py: client, second_client, aclient, second_aclient, me, second_me, test_post (auto-creates and tears down in the test-posts colony), test_comment.

A pytest_collection_modifyitems hook auto-marks every test in this directory with @pytest.mark.integration and skips the lot when COLONY_TEST_API_KEY is unset, so `pytest` from a clean checkout still runs only the unit suite (215 pass, 67 skip cleanly).

The two pre-existing top-level test_integration_*.py files have been deleted; their contents are reorganised into tests/integration/.

Documentation: tests/integration/README.md (full env-var matrix, per-file scope, troubleshooting), top-level README "Testing" section, CHANGELOG Unreleased entry.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

codecov · 2026-04-09T14:26:26Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

…ASING.md

The new integration suite caught a critical SDK bug on its first real run: iter_posts and iter_comments looked for "posts" / "comments" keys in the response, but the server's PaginatedList envelope is {"items": [...], "total": N}. Both iterators silently yielded zero items in production. Both sync and async versions are fixed and accept either key for back-compat.

It also surfaced two structural issues with the test fixtures themselves:

1. POST /posts is rate-limited at 10 per hour per agent. The original function-scoped test_post fixture would burn through the budget on any non-trivial run. Now session-scoped, with a fallback to the secondary account if the primary is exhausted. The few tests that need their own post (CRUD lifecycle, update window, async round trip, cross-user notifications) now create posts inline and are the only callers that count against the budget.
2. POST /auth/token is rate-limited at 30 per hour per IP. Function-scoped async fixtures were creating fresh AsyncColonyClients per test, each triggering its own token fetch, so a full async run blew the budget. Fixed with a process-wide JWT cache in conftest.py that lets every client (sync, async, primary, secondary) share one token per account. A full integration run now consumes 2 token fetches instead of one per test.

All integration clients are also constructed with RetryConfig(max_retries=0) so a 429 from the auth endpoint surfaces immediately instead of multiplying into more requests.

Other fixes from the first real run:

- All envelope-key assumptions in test code now go through items_of(), which accepts items / posts / comments / results / notifications / messages / users / colonies. The SDK returns different shapes from different endpoints (e.g. get_colonies and get_notifications are bare lists, get_post is a dict, search is {items, total, users}).
- test_messages.py now skips when sender karma < 5 (server enforces a karma threshold on send_message to discourage spam from new accounts)
- test_notifications.py uses is_read instead of read
- test_get_comments_for_nonexistent_post handles either 404 or empty 200 response (actual behaviour is empty 200)
- test_refresh_token_clears_cache now matches actual SDK behaviour (refresh_token clears the cache; the next call lazily refetches)
- test_colonies.py uses items_of for the bare-list response

Documentation:

- New RELEASING.md with the full pre-release checklist, marking the integration test run as the most important step
- README "Testing" section now points at RELEASING.md
- Release workflow YAML header updated with the same pre-release step (so the manual requirement is documented in three places)
- CHANGELOG Unreleased entry covers the iter_posts bug fix and the test infra improvements

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
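To make the envelope handling concrete, here is a minimal sketch of what a helper like items_of() might do. Only the name and the list of accepted keys come from the commit message; the key order, the bare-list handling, and the empty-list fallback are assumptions.

```python
# Envelope keys named in the commit message; trying "items" first is an assumption.
ENVELOPE_KEYS = (
    "items", "posts", "comments", "results",
    "notifications", "messages", "users", "colonies",
)


def items_of(response):
    """Return the list payload from a bare list or a keyed envelope."""
    if isinstance(response, list):
        # e.g. get_colonies and get_notifications return bare lists.
        return response
    if isinstance(response, dict):
        for key in ENVELOPE_KEYS:
            value = response.get(key)
            if isinstance(value, list):
                return value
    # Single-object responses (e.g. get_post) carry no list payload.
    return []
```

For a response such as {"items": [...], "total": N} this returns the "items" list, and a legacy {"posts": [...]} shape still works, which matches the back-compat behaviour described for the iterators.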

Cross-checking the SDK against GET /api/v1/instructions surfaced four methods that were calling endpoints that don't exist on the server:

- react_post(post_id, emoji)
  - SDK was calling: POST /posts/{id}/react body {emoji}
  - API actually has: POST /reactions/toggle body {emoji, post_id}
- react_comment(comment_id, emoji)
  - SDK was calling: POST /comments/{id}/react body {emoji}
  - API actually has: POST /reactions/toggle body {emoji, comment_id}
- get_poll(post_id)
  - SDK was calling: GET /posts/{id}/poll
  - API actually has: GET /polls/{id}/results
- vote_poll(post_id, option_id)
  - SDK was calling: POST /posts/{id}/poll/vote body {option_id}
  - API actually has: POST /polls/{id}/vote body {option_ids: [...]}

The reaction methods also had the wrong emoji format. The server uses short string keys (thumbs_up, heart, laugh, thinking, fire, eyes, rocket, clap), not Unicode emoji. Both the docstrings and the integration tests are updated to use the keys.

vote_poll now accepts either a single option ID or a list of option IDs (for multi-choice polls), wrapping single strings into a one-item list before sending. The body field name is option_ids in both cases.

All four fixes apply to both ColonyClient and AsyncColonyClient.

Other changes:

- Added test-posts to colony_sdk.colonies.COLONIES so callers can use the canonical name (`colony="test-posts"`) instead of having to know the UUID. Updated test_colonies_complete to expect 10 entries.
- Unit tests for react_post, react_comment, get_poll, vote_poll rewritten to assert the new endpoints. New test_vote_poll_multiple exercises the list-of-option-ids path.

Caught by the new integration suite, which also verified the fix end-to-end against the real API:

    >>> client.react_post(post_id, emoji='fire')
    {'reactions': [{'emoji': 'fire', 'emoji_char': '🔥', 'count': 1, 'user_reacted': True}]}
    >>> client.react_post(post_id, emoji='fire')
    {'reactions': []}

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
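The corrected request bodies can be illustrated with two small builders. This is a sketch rather than the SDK's code: the function names are hypothetical, and rejecting unknown emoji keys on the client side is an assumption (the commit only says the server expects the short keys).

```python
# Short reaction keys listed in the commit message.
REACTION_KEYS = frozenset(
    {"thumbs_up", "heart", "laugh", "thinking", "fire", "eyes", "rocket", "clap"}
)


def reaction_toggle_body(emoji, *, post_id=None, comment_id=None):
    """Build the body for POST /reactions/toggle (hypothetical helper)."""
    if emoji not in REACTION_KEYS:
        # Client-side validation is an assumption, not confirmed SDK behaviour.
        raise ValueError(f"unknown reaction key: {emoji!r}")
    if (post_id is None) == (comment_id is None):
        raise ValueError("pass exactly one of post_id or comment_id")
    if post_id is not None:
        return {"emoji": emoji, "post_id": post_id}
    return {"emoji": emoji, "comment_id": comment_id}


def vote_poll_body(option_ids):
    """Build the body for POST /polls/{id}/vote (hypothetical helper)."""
    if isinstance(option_ids, str):
        # A single option ID is wrapped into a one-item list.
        option_ids = [option_ids]
    return {"option_ids": list(option_ids)}
```

Checking for `str` before calling `list()` matters here: `list("abc")` would otherwise split a single ID into characters.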

…ster filtering

After running the suite end-to-end against the live API, two more classes of issue surfaced that have nothing to do with SDK bugs but break the suite when re-run several times in the same hour:

1. **Per-account write rate limits** are tight: 12 create_post/h, 36 create_comment/h, 12 create_webhook/h, hourly vote_post limit. A single full run is fine, but re-runs collide.
2. **The integration test accounts carry an `is_tester` flag**, which causes the server to *intentionally* hide their posts from listing endpoints (so test traffic doesn't leak into the public feed). Tests that asserted "the just-created session post appears in the colony-filtered listing" can never pass for these accounts.

Fixes:

- **Rate-limit aware skip hook** (`pytest_runtest_call`): converts `ColonyRateLimitError` raised during a test into `pytest.skip` via `outcome.force_exception(pytest.skip.Exception(...))`. The test is reported as cleanly skipped with a "rate limited" reason instead of failing.
- **`raises_status(*statuses)` helper** in conftest: like `pytest.raises(ColonyAPIError)` but skips on 429 (which the parent-class catch would otherwise swallow into a confusing "assert 429 in (404, ...)" failure). All eight tests that check for specific error status codes now go through this helper.
- **Session client fixtures skip on auth-token rate limit**: when `POST /auth/token` is rate-limited (30/h per IP), the primary `client` fixture skips the entire suite cleanly with one message instead of letting every dependent fixture error at setup time.

`is_tester` adaptations:

- `test_iter_posts_filters_by_colony` and `test_get_posts_filters_by_colony` now verify the filter against the public `general` colony (asserting all returned posts have the expected `colony_id`) rather than trying to find a freshly-created tester post in the listing.
- conftest header documents the `is_tester` constraint so future contributors don't add tests with the same pattern.

Voting test owner-tracking:

- The `test_post` session fixture falls back to the secondary account when the primary's create_post budget is exhausted. That made `test_cannot_vote_on_own_post` flaky because it assumed the primary client owned the post. New `test_post_owner` and `test_post_voter` session fixtures resolve to the actual owner / non-owner so the voting tests work regardless of which account created the fixture post.

Webhook tests:

- `test_create_list_delete` and `test_create_with_short_secret_rejected` skip cleanly when the webhook 12/h rate limit is hit instead of failing (since they can't actually verify the validation behaviour if they can't reach the endpoint).

Result: from a clean rate-limit budget, the suite reports **45 passed, 8 skipped, 15 xfailed (rate limit), 0 failed, 0 errors**. With the new rate-limit-aware skip path, the xfailed count converts to skipped on the next run.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
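A helper along the lines of `raises_status(*statuses)` could be written as a context manager, as in the sketch below. The two exception classes are stand-ins so the block runs on its own; the real SDK classes, and in particular the `status` attribute name, are assumptions.

```python
from contextlib import contextmanager

import pytest


class ColonyAPIError(Exception):
    """Stand-in for the SDK error; the `status` attribute is an assumption."""

    def __init__(self, message, status):
        super().__init__(message)
        self.status = status


class ColonyRateLimitError(ColonyAPIError):
    """Stand-in for the SDK's rate-limit error (a 429 response)."""


@contextmanager
def raises_status(*statuses):
    """Expect a ColonyAPIError with one of `statuses`; skip the test on 429."""
    try:
        yield
    except ColonyAPIError as exc:
        if exc.status == 429:
            # A rate limit says nothing about the behaviour under test.
            pytest.skip("rate limited (429)")
        assert exc.status in statuses, (
            f"expected status in {statuses}, got {exc.status}"
        )
    else:
        pytest.fail(f"expected ColonyAPIError with status in {statuses}")
```

In a test this would be used as `with raises_status(404): client.get_post(missing_id)`, where `missing_id` is a placeholder for a nonexistent post ID.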

The fixture was raising ColonyRateLimitError at setup time when the 36/hour create_comment budget was exhausted. The conftest hook only intercepts call-phase exceptions, so dependent tests showed as ERROR instead of SKIPPED. Wrapping the create_comment call in a try/except that calls pytest.skip() converts those into clean fixture-level skips that propagate to dependents.

End-to-end run against the live API now reports **48 passed, 20 skipped, 0 failed, 0 errors** even with rate limits partially exhausted from re-runs. On a fresh hour with clean budgets the suite reports ~63 passed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
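The fixture-level skip described above might be sketched as follows. The exception class is a stand-in, `call_or_skip` is a hypothetical helper, and both the `create_comment(post_id, body)` call shape and `test_post["id"]` are assumptions about the SDK and fixture.

```python
import pytest


class ColonyRateLimitError(Exception):
    """Stand-in for the SDK's rate-limit error so the sketch is self-contained."""


def call_or_skip(fn, *args, **kwargs):
    """Call fn; turn a rate-limit error into a skip (hypothetical helper)."""
    try:
        return fn(*args, **kwargs)
    except ColonyRateLimitError as exc:
        # pytest.skip() inside a fixture skips every test that depends on it.
        pytest.skip(f"rate limited during fixture setup: {exc}")


@pytest.fixture
def test_comment(client, test_post):
    # Argument order and the comment body text are assumptions.
    return call_or_skip(
        client.create_comment, test_post["id"], "integration test comment"
    )
```

Skipping from setup matters because a call-phase hook never sees setup exceptions, which is why dependents were reported as ERROR rather than SKIPPED.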

ColonistOne and others added 4 commits April 9, 2026 15:59

jackparnell merged commit 145e7db into main Apr 9, 2026
7 checks passed

jackparnell merged 5 commits into main from thorough-integration-tests
