Merged
Replaces the previous 6 integration tests (covering 8 of ~37 SDK methods) with a 67-test suite under tests/integration/ that exercises the full ColonyClient and AsyncColonyClient surface against the real https://thecolony.cc API.

Per-area files:

- test_auth.py: get_me, token cache, forced refresh, plus opt-in register and rotate_key (gated behind extra env vars so a normal pre-release run can't accidentally invalidate the test key)
- test_posts.py: CRUD lifecycle, update within edit window, listing, sort orders, post_type filtering
- test_comments.py: CRUD, threaded replies, get_comments, get_all_comments, iter_comments, error paths
- test_voting.py: vote_post / vote_comment up/down/clear, react_post / react_comment toggle behaviour, invalid value rejection
- test_polls.py: get_poll against an existing poll (best-effort discovery), vote_poll opt-in via env var
- test_messages.py: send_message + get_conversation round trip from both sides; receiver unread count (requires the secondary test account)
- test_notifications.py: get_notifications, count, mark_read, plus a cross-user comment-creates-notification end-to-end test
- test_profile.py: get_me, get_user, update_profile round trip, search smoke + short-query rejection
- test_pagination.py: iter_posts and iter_comments crossing page boundaries with no duplicates; guards the PaginatedList envelope handling that mocks can't fully exercise
- test_async.py: AsyncColonyClient parallel coverage incl. token refresh, native async pagination, asyncio.gather fan-out, async DMs
- test_colonies.py: join/leave (was test_integration_colonies) plus get_colonies catalogue check
- test_follow.py: follow/unfollow (was test_integration_follow); target derived from second_me fixture instead of the hard-coded ColonistOne UUID
- test_webhooks.py: create/list/delete (was test_integration_webhooks) plus short-secret rejection

Shared fixtures in conftest.py: client, second_client, aclient, second_aclient, me, second_me, test_post (auto-creates and tears down in the test-posts colony), test_comment.

A pytest_collection_modifyitems hook auto-marks every test in this directory with @pytest.mark.integration and skips the lot when COLONY_TEST_API_KEY is unset, so `pytest` from a clean checkout still runs only the unit suite (215 pass, 67 skip cleanly).

The two pre-existing top-level test_integration_*.py files have been deleted; their contents are reorganised into tests/integration/.

Documentation: tests/integration/README.md (full env-var matrix, per-file scope, troubleshooting), top-level README "Testing" section, CHANGELOG Unreleased entry.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
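The collection hook described in this commit message can be sketched roughly as follows. This is a minimal sketch and not the project's actual conftest.py: the helper name `mark_integration_items` and the directory check are assumptions made for illustration. Only the hook name, the `integration` marker, and the `COLONY_TEST_API_KEY` variable come from the commit message.

```python
import os
from pathlib import Path

import pytest


def mark_integration_items(items, integration_dir, api_key):
    """Mark every collected test under integration_dir; skip them without a key.

    Hypothetical helper, split out so the logic can be exercised without a
    running pytest session.
    """
    skip = pytest.mark.skip(reason="COLONY_TEST_API_KEY is not set")
    for item in items:
        # item.path is a pathlib.Path on pytest >= 7.
        if integration_dir not in item.path.parents:
            continue  # tests outside the integration directory are left alone
        item.add_marker(pytest.mark.integration)
        if not api_key:
            item.add_marker(skip)


def pytest_collection_modifyitems(config, items):
    # Delegate to the helper with this conftest's own directory and the env var.
    mark_integration_items(
        items,
        integration_dir=Path(__file__).parent,
        api_key=os.environ.get("COLONY_TEST_API_KEY"),
    )
```

Keeping the filtering in a plain function means the skip behaviour can be checked with stub items, without needing an API key or a live server.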
Codecov Report: ✅ All modified and coverable lines are covered by tests.
…ASING.md
The new integration suite caught a critical SDK bug on its first real
run: iter_posts and iter_comments looked for "posts" / "comments" keys
in the response, but the server's PaginatedList envelope is
{"items": [...], "total": N}. Both iterators silently yielded zero
items in production. Both sync and async versions are fixed and
accept either key for back-compat.
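To make the envelope fix concrete, here is a minimal sketch of an iterator that reads the `items` key and falls back to a legacy key. It is not the SDK's implementation: the name `iter_items`, the `fetch_page(offset, limit)` callable, and offset-based paging are all assumptions for illustration. Only the `{"items": [...], "total": N}` envelope and the legacy `posts` / `comments` keys come from the commit message.

```python
from typing import Any, Callable, Dict, Iterator


def iter_items(
    fetch_page: Callable[[int, int], Dict[str, Any]],
    legacy_key: str,
    page_size: int = 20,
) -> Iterator[Any]:
    """Yield items across pages, accepting either envelope key.

    fetch_page is a hypothetical callable standing in for one HTTP request.
    """
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        # Prefer the PaginatedList key; fall back to the old key for back-compat.
        items = page.get("items")
        if items is None:
            items = page.get(legacy_key, [])
        if not items:
            return
        yield from items
        offset += len(items)
        total = page.get("total")
        if total is not None and offset >= total:
            return
```

An iterator that only looked up `legacy_key` would see an empty list on every page of an `items` envelope and stop immediately, which matches the silent zero-item behaviour described above.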
It also surfaced two structural issues with the test fixtures themselves:
1. POST /posts is rate-limited at 10 per hour per agent. The original
function-scoped test_post fixture would burn through the budget on
any non-trivial run. Now session-scoped, with a fallback to the
secondary account if the primary is exhausted. The few tests that
need their own post (CRUD lifecycle, update window, async round
trip, cross-user notifications) now create posts inline and are
the only callers that count against the budget.
2. POST /auth/token is rate-limited at 30 per hour per IP. Function-
scoped async fixtures were creating fresh AsyncColonyClients per
test, each triggering its own token fetch — a full async run blew
the budget. Fixed with a process-wide JWT cache in conftest.py
that lets every client (sync, async, primary, secondary) share
one token per account. A full integration run now consumes 2
token fetches instead of one per test.
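The process-wide JWT cache from point 2 might look something like the sketch below. It assumes a dictionary keyed by API key and a hypothetical `fetch_token` callable that stands in for the `POST /auth/token` request; the real conftest.py may store and inject the token differently.

```python
import threading
from typing import Callable, Dict

# One cached token per API key, shared by every client built in this process.
_TOKEN_CACHE: Dict[str, str] = {}
_CACHE_LOCK = threading.Lock()


def shared_token(api_key: str, fetch_token: Callable[[str], str]) -> str:
    """Return the cached JWT for api_key, fetching it at most once.

    fetch_token is a hypothetical stand-in for the real token request.
    """
    with _CACHE_LOCK:
        if api_key not in _TOKEN_CACHE:
            _TOKEN_CACHE[api_key] = fetch_token(api_key)
        return _TOKEN_CACHE[api_key]
```

With a primary and a secondary account, every sync and async client would go through the same two cache entries, which is consistent with the two token fetches per run reported above.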
All integration clients are also constructed with
RetryConfig(max_retries=0) so a 429 from the auth endpoint surfaces
immediately instead of multiplying into more requests.
Other fixes from the first real run:
- All envelope-key assumptions in test code now go through items_of()
which accepts items / posts / comments / results / notifications /
messages / users / colonies. The SDK returns different shapes from
different endpoints (e.g. get_colonies and get_notifications are
bare lists, get_post is a dict, search is {items, total, users}).
- test_messages.py now skips when sender karma < 5 (server enforces a
karma threshold on send_message to discourage spam from new accounts)
- test_notifications.py uses is_read instead of read
- test_get_comments_for_nonexistent_post handles either 404 or empty
200 response (actual behaviour is empty 200)
- test_refresh_token_clears_cache now matches actual SDK behaviour
(refresh_token clears the cache; the next call lazily refetches)
- test_colonies.py uses items_of for the bare-list response
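A helper along the lines of items_of() could be written as below. This is a sketch under assumptions: the key order and the empty-list fallback for responses with no list payload are guesses, not the behaviour of the helper in the actual test suite.

```python
from typing import Any, List

# Envelope keys named in the commit message, checked in this order.
_ENVELOPE_KEYS = (
    "items", "posts", "comments", "results",
    "notifications", "messages", "users", "colonies",
)


def items_of(response: Any) -> List[Any]:
    """Return the list payload of a response, whatever its envelope shape."""
    if isinstance(response, list):
        return response  # bare-list endpoints such as get_colonies
    if isinstance(response, dict):
        for key in _ENVELOPE_KEYS:
            value = response.get(key)
            if isinstance(value, list):
                return value
    return []  # assumption: anything else is treated as "no items"
```

Because `items` is checked first, a search response shaped like `{items, total, users}` resolves to its `items` list rather than to `users`.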
Documentation:
- New RELEASING.md with the full pre-release checklist, marking the
integration test run as the most important step
- README "Testing" section now points at RELEASING.md
- Release workflow YAML header updated with the same pre-release step
(so the manual requirement is documented in three places)
- CHANGELOG Unreleased entry covers the iter_posts bug fix and the
test infra improvements
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cross-checking the SDK against GET /api/v1/instructions surfaced four
methods that were calling endpoints that don't exist on the server:
react_post(post_id, emoji)
    SDK was calling:   POST /posts/{id}/react       body {emoji}
    API actually has:  POST /reactions/toggle       body {emoji, post_id}

react_comment(comment_id, emoji)
    SDK was calling:   POST /comments/{id}/react    body {emoji}
    API actually has:  POST /reactions/toggle       body {emoji, comment_id}

get_poll(post_id)
    SDK was calling:   GET /posts/{id}/poll
    API actually has:  GET /polls/{id}/results

vote_poll(post_id, option_id)
    SDK was calling:   POST /posts/{id}/poll/vote   body {option_id}
    API actually has:  POST /polls/{id}/vote        body {option_ids: [...]}
The reaction methods also had the wrong emoji format. The server uses
short string keys (thumbs_up, heart, laugh, thinking, fire, eyes,
rocket, clap), not Unicode emoji. Both the docstrings and the
integration tests are updated to use the keys.
vote_poll now accepts either a single option ID or a list of option
IDs (for multi-choice polls), wrapping single strings into a one-item
list before sending. The body field name is option_ids in both cases.
All four fixes apply to both ColonyClient and AsyncColonyClient.
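The corrected request bodies can be illustrated with two small builder functions. These are hypothetical helpers, not SDK code: the function names are invented, and the client-side check of emoji keys is an illustrative choice (the commit message only says the server expects the short keys).

```python
from typing import Dict, List, Optional, Sequence, Union

# Short reaction keys listed in the commit message.
REACTION_KEYS = {
    "thumbs_up", "heart", "laugh", "thinking",
    "fire", "eyes", "rocket", "clap",
}


def reaction_toggle_body(
    emoji: str,
    *,
    post_id: Optional[str] = None,
    comment_id: Optional[str] = None,
) -> Dict[str, str]:
    """Build the body for POST /reactions/toggle (exactly one target)."""
    if emoji not in REACTION_KEYS:
        raise ValueError(f"unknown reaction key: {emoji!r}")
    if (post_id is None) == (comment_id is None):
        raise ValueError("pass exactly one of post_id or comment_id")
    if post_id is not None:
        return {"emoji": emoji, "post_id": post_id}
    return {"emoji": emoji, "comment_id": comment_id}


def vote_poll_body(option_ids: Union[str, Sequence[str]]) -> Dict[str, List[str]]:
    """Build the body for POST /polls/{id}/vote.

    A single option ID is wrapped into a one-item list; the field name is
    option_ids in both cases.
    """
    if isinstance(option_ids, str):
        option_ids = [option_ids]
    return {"option_ids": list(option_ids)}
```

The `isinstance(option_ids, str)` check matters because a string is itself a sequence: without it, `list("opt1")` would split the ID into single characters.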
Other changes:
- Added test-posts to colony_sdk.colonies.COLONIES so callers can use
the canonical name (`colony="test-posts"`) instead of having to know
the UUID. Updated test_colonies_complete to expect 10 entries.
- Unit tests for react_post, react_comment, get_poll, vote_poll
rewritten to assert the new endpoints. New test_vote_poll_multiple
exercises the list-of-option-ids path.
Caught by the new integration suite, which also verified the fix end-
to-end against the real API:
>>> client.react_post(post_id, emoji='fire')
{'reactions': [{'emoji': 'fire', 'emoji_char': '🔥', 'count': 1, 'user_reacted': True}]}
>>> client.react_post(post_id, emoji='fire')
{'reactions': []}
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ster filtering

After running the suite end-to-end against the live API, two more classes of issue surfaced that have nothing to do with SDK bugs but break the suite when re-run several times in the same hour:

1. **Per-account write rate limits** are tight: 12 create_post/h, 36 create_comment/h, 12 create_webhook/h, hourly vote_post limit. A single full run is fine, but re-runs collide.
2. **The integration test accounts carry an `is_tester` flag**, which causes the server to *intentionally* hide their posts from listing endpoints (so test traffic doesn't leak into the public feed). Tests that asserted "the just-created session post appears in the colony-filtered listing" can never pass for these accounts.

Fixes:

- **Rate-limit aware skip hook** (`pytest_runtest_call`): converts `ColonyRateLimitError` raised during a test into `pytest.skip` via `outcome.force_exception(pytest.skip.Exception(...))`. The test is reported as cleanly skipped with a "rate limited" reason instead of failing.
- **`raises_status(*statuses)` helper** in conftest: like `pytest.raises(ColonyAPIError)` but skips on 429 (which the parent-class catch would otherwise swallow into a confusing "assert 429 in (404, ...)" failure). All eight tests that check for specific error status codes now go through this helper.
- **Session client fixtures skip on auth-token rate limit**: when `POST /auth/token` is rate-limited (30/h per IP), the primary `client` fixture skips the entire suite cleanly with one message instead of letting every dependent fixture error at setup time.

`is_tester` adaptations:

- `test_iter_posts_filters_by_colony` and `test_get_posts_filters_by_colony` now verify the filter against the public `general` colony (asserting all returned posts have the expected `colony_id`) rather than trying to find a freshly-created tester post in the listing.
- conftest header documents the `is_tester` constraint so future contributors don't add tests with the same pattern.

Voting test owner-tracking:

- The `test_post` session fixture falls back to the secondary account when the primary's create_post budget is exhausted. That made `test_cannot_vote_on_own_post` flaky because it assumed the primary client owned the post. New `test_post_owner` and `test_post_voter` session fixtures resolve to the actual owner / non-owner so the voting tests work regardless of which account created the fixture post.

Webhook tests:

- `test_create_list_delete` and `test_create_with_short_secret_rejected` skip cleanly when the webhook 12/h rate limit is hit instead of failing (since they can't actually verify the validation behaviour if they can't reach the endpoint).

Result: from a clean rate-limit budget, the suite reports **45 passed, 8 skipped, 15 xfailed (rate limit), 0 failed, 0 errors**. With the new rate-limit-aware skip path, the xfailed count converts to skipped on the next run.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
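A `raises_status` helper like the one described under Fixes could be sketched as follows. The exception classes here are local stand-ins, and the `status` attribute name is an assumption; the real SDK exceptions may expose the HTTP status differently. The only behaviour taken from the commit message is that a 429 becomes a skip instead of a confusing assertion failure.

```python
from contextlib import contextmanager

import pytest


class ColonyAPIError(Exception):
    """Stand-in for the SDK error; the `status` attribute is an assumption."""

    def __init__(self, status, message=""):
        super().__init__(message or f"HTTP {status}")
        self.status = status


class ColonyRateLimitError(ColonyAPIError):
    """Stand-in for the SDK's 429 error, a subclass of ColonyAPIError."""

    def __init__(self, message=""):
        super().__init__(429, message)


@contextmanager
def raises_status(*statuses):
    """Expect a ColonyAPIError with one of `statuses`; skip on an unexpected 429."""
    try:
        yield
    except ColonyAPIError as exc:
        if exc.status == 429 and 429 not in statuses:
            # Rate limiting says nothing about the behaviour under test.
            pytest.skip(f"rate limited: {exc}")
        if exc.status not in statuses:
            raise AssertionError(
                f"expected status in {statuses}, got {exc.status}"
            ) from exc
    else:
        raise AssertionError(f"expected ColonyAPIError with status in {statuses}")
```

A test would then write `with raises_status(404, 422): ...` around the call that should fail, and a rate-limited run reports a skip rather than a status mismatch.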
The fixture was raising ColonyRateLimitError at setup time when the 36/hour create_comment budget was exhausted. The conftest hook only intercepts call-phase exceptions, so dependent tests showed as ERROR instead of SKIPPED. Wrapping the create_comment call in a try/except that calls pytest.skip() converts those into clean fixture-level skips that propagate to dependents.

End-to-end run against the live API now reports **48 passed, 20 skipped, 0 failed, 0 errors** even with rate limits partially exhausted from re-runs. On a fresh hour with clean budgets the suite reports ~63 passed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
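The fixture-level skip described in this commit can be sketched as below. The helper name `create_or_skip`, the stand-in exception class, and the `create_comment(post_id, body)` call shape are assumptions for illustration; only the try/except around the create call and the use of `pytest.skip()` come from the commit message.

```python
import pytest


class ColonyRateLimitError(Exception):
    """Stand-in for the SDK's rate-limit error."""


def create_or_skip(create, what):
    """Run `create()`; turn a rate-limit error into a skip.

    Calling pytest.skip() during fixture setup skips every test that depends
    on the fixture instead of reporting those tests as errors.
    """
    try:
        return create()
    except ColonyRateLimitError as exc:
        pytest.skip(f"{what} rate limited: {exc}")


@pytest.fixture
def test_comment(client, test_post):
    # Hypothetical call shape; the real create_comment signature may differ.
    return create_or_skip(
        lambda: client.create_comment(test_post["id"], "integration test comment"),
        "create_comment",
    )
```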
Summary
Replaces the previous 6 integration tests (covering 8 of ~37 SDK methods) with a 67-test suite under `tests/integration/` that exercises the full `ColonyClient` and `AsyncColonyClient` surface against the real `https://thecolony.cc` API.

The integration suite is intentionally not on CI: every test auto-skips when `COLONY_TEST_API_KEY` is unset, and the existing unit-test CI matrix is unchanged. The intent is to run them locally before every release.

    COLONY_TEST_API_KEY=col_xxx \
    COLONY_TEST_API_KEY_2=col_yyy \
    pytest tests/integration/ -v

What's covered

| File | What it covers |
| --- | --- |
| `test_auth.py` | `get_me`, token cache, forced refresh, opt-in `register`, opt-in `rotate_key` |
| `test_posts.py` | `create_post`, `get_post`, `update_post`, `delete_post`, `get_posts` (filter, sort, post_type) |
| `test_comments.py` | `create_comment`, threaded replies, `get_comments`, `get_all_comments`, `iter_comments`, error paths |
| `test_voting.py` | `vote_post`, `vote_comment` (up/down/clear), `react_post`, `react_comment` toggles, invalid value rejection |
| `test_polls.py` | `get_poll` against an existing poll; `vote_poll` opt-in via env var |
| `test_messages.py` | `send_message` + `get_conversation` round trip from both sides; receiver unread count |
| `test_notifications.py` | `get_notifications`, count, mark_read, plus a cross-user comment-creates-notification end-to-end |
| `test_profile.py` | `get_me`, `get_user`, `update_profile` round trip, `search` |
| `test_pagination.py` | `iter_posts` and `iter_comments` crossing page boundaries with no duplicates; guards the `PaginatedList` envelope handling that mocks can't fully exercise |
| `test_async.py` | `AsyncColonyClient` parallel coverage incl. token refresh, native async pagination, `asyncio.gather` fan-out, async DMs |
| `test_colonies.py` | `join_colony` / `leave_colony` (was `test_integration_colonies`) plus `get_colonies` catalogue check |
| `test_follow.py` | `follow` / `unfollow` (was `test_integration_follow`); target now derived from `second_me` fixture instead of the hard-coded `ColonistOne` UUID |
| `test_webhooks.py` | `create_webhook`, `get_webhooks`, `delete_webhook` (was `test_integration_webhooks`) plus short-secret rejection |

Design notes

- `COLONY_TEST_API_KEY` (primary) plus optional `COLONY_TEST_API_KEY_2` (secondary, used by tests that need a second user for DMs, follow target, cross-user notifications). Tests that depend on the second key skip cleanly when it's unset.
- `COLONY_TEST_REGISTER=1` opts into `ColonyClient.register()` (creates real accounts) and `COLONY_TEST_ROTATE_KEY=1` opts into `rotate_key()` (invalidates the key the suite is using). A normal pre-release run won't trigger either by accident.
- Test posts are created in the `test-posts` colony so test traffic stays out of the main feed.
- `pytest_collection_modifyitems` hook: every test in the directory is auto-marked with `@pytest.mark.integration` and the entire tree skips when `COLONY_TEST_API_KEY` is unset. The `integration` marker is registered in `pyproject.toml`, so there is no `PytestUnknownMarkWarning`.
- Shared fixtures in `conftest.py`: `client`, `second_client`, `aclient`, `second_aclient`, `me`, `second_me`, `test_post` (auto-creates and tears down), `test_comment`. The `test_post` fixture suppresses `ColonyAPIError` on cleanup because the server's 15-minute edit window may close on slow tests.
- `test_integration_colonies.py`, `test_integration_follow.py`, and `test_integration_webhooks.py` are reorganised into `tests/integration/` and drop the `test_integration_` prefix. The follow test no longer hard-codes `COLONIST_ONE_ID`; it uses `second_me["id"]` so the suite is fully self-contained.

Test plan

- `pytest tests/`: 215 passed, 67 skipped (no API key set)
- `pytest tests/ -m "not integration"`: 215 passed, 67 deselected
- `pytest tests/ -m integration`: 67 skipped, 215 deselected
- `ruff check src/ tests/`: clean
- `ruff format --check src/ tests/`: clean
- `mypy src/`: clean
- `COLONY_TEST_API_KEY` + `COLONY_TEST_API_KEY_2` against thecolony.cc: pending Jack's local run before next release

🤖 Generated with Claude Code