feat: add list_docs/update_docs/copy_docs methods on NexsetsResource by tariqalagha-nexla · Pull Request #17 · nexla-opensource/nexla-sdk-python

tariqalagha-nexla · 2026-05-09T04:36:28Z

Summary

Wraps GET/POST /data_sets/{id}/docs so callers can read, replace, and copy the rich documentation (DocContainer markdown bodies) attached to nexsets. POST has replace-all semantics.

This unblocks the Express MCP server in veda-ai, which previously couldn't read or write the structured text doc bodies — only the single-line description on the nexset itself. A companion PR in veda-ai exposes these as MCP tools (list_nexla_nexset_docs, update_nexla_nexset_docs, copy_nexla_nexset_docs).

Changes

New methods on NexsetsResource (nexla_sdk/resources/nexsets.py):
- list_docs(set_id, expand=True) — GET /data_sets/{id}/docs?expand=1
- update_docs(set_id, docs) — POST with body {"docs": [...]}, replace-all semantics. Accepts List[DocContainerInput] or List[Dict] to support typed callers and pass-through callers (e.g. MCP).
- copy_docs(src_id, dst_id) — convenience: list from src, strip server-owned fields via round-trip through DocContainerInput, post to dst. No-op if source is empty.
Expanded DocContainer response model from a {id, name} stub to the full shape (id, owner, org, doc_type, public, repo_type, repo_config, text, access_roles, tags, copied_from_id, created_at, updated_at).
New DocContainerInput request model (writable subset).
6 new unit tests covering happy paths, dict-input pass-through, multi-call copy_docs with field-strip assertions, and the empty-source no-op.
Coverage doc updated (docs-site/docs/guides/coverage.mdx).

Notable choices caught during testing

DocContainerInput sets extra="ignore". The SDK BaseModel default is extra="allow", which would have caused copy_docs's response→request round-trip to silently retain server-owned fields (id, owner, org, etc.) and send them back in the POST. The unit test caught it before it leaked to a real POST.
DocContainerInput excludes public and tags. A live POST with public returns 400 "Input cannot include public attribute". tags is on the response but unconfirmed as writable, kept out conservatively. Easy to add later.

Test plan

pytest tests/unit/test_nexsets.py -v — 23/23 pass (6 new + 17 pre-existing)
pytest tests/unit -q — 288/288 pass, no regressions
Live REPL against dataops.nexla.io: c.nexsets.copy_docs(src_id=..., dst_id=...) — succeeded; destination doc verified via subsequent list_docs
End-to-end via Claude Desktop + Express MCP server: prompted "Copy the docs from nexset A to nexset B with this find/replace", which used list_nexla_nexset_docs + update_nexla_nexset_docs and wrote the transformed docs

🤖 Generated with Claude Code

Wraps GET/POST `/data_sets/{id}/docs` so callers can read, replace, and copy the rich documentation (DocContainer markdown bodies) attached to nexsets. POST has replace-all semantics for the nexset's docs. Notable choices caught during testing: - DocContainerInput sets `extra="ignore"` so a full DocContainer response can be round-tripped through it; the SDK BaseModel default is `extra="allow"` which would otherwise leak server-owned fields back to the API in copy_docs. - DocContainerInput excludes `public` and `tags`. The server rejects `public` ("Input cannot include public attribute"); `tags` is on the response but unconfirmed as writable, so kept out conservatively. Can be added later if/when confirmed. Also expands the previously stub DocContainer response model to its real shape (id, owner, org, doc_type, public, repo_type, repo_config, text, access_roles, tags, copied_from_id, created_at, updated_at). Tests: 6 new unit tests in test_nexsets.py covering list/update/copy including dict-input pass-through (for MCP callers) and copy_docs's empty-source no-op + field-stripping behavior. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

saksham-nexla

Review Summary

This is a clean, well-scoped addition. The core design choices are sound and the implementation is consistent with existing SDK patterns.

What's done well:

DocContainerInput with extra="ignore" is exactly the right mechanism for the copy_docs round-trip strip — the PR description even calls out the unit test that caught the issue before it could reach a live POST, which is the right way to develop this kind of model.
The update_docs dual-input type (Union[DocContainerInput, Dict]) is a practical concession for MCP/pass-through callers and is tested explicitly.
Test coverage is thorough: happy paths, dict pass-through, field-strip assertions in copy_docs, and the empty-source no-op are all covered.
DocContainer response model expansion is complete and well-typed (datetime, Optional, Field(default_factory=list) for list fields).
The _serialize_data reuse in update_docs keeps serialisation consistent with the rest of BaseResource.

Concerns (details in inline comments):

[minor] copy_docs can raise an unhandled ValidationError when a source doc has text=None — model_dump(exclude_none=True) silently drops text, which then fails DocContainerInput's required text: str field. Dropping exclude_none=True (field stripping still works via extra="ignore") or pre-filtering None-text docs would fix this.
[nit] Returns: in the update_docs docstring is mis-indented inside the Args: block — Sphinx will render it as part of the docs parameter description.
[nit] DocContainerInput docstring doesn't mention that repo_type/repo_config are also excluded (only public/tags have documented rationale). Minor maintenance debt.
[nit] The silent return [] on non-list API responses in list_docs/update_docs is consistent with the rest of the SDK, but worth tracking as a broader improvement area.

None of these are blockers — the text=None case is an uncommon API edge case, and the rest are documentation nits. Approving; the minor can be addressed in a follow-up or as part of this PR if the author prefers.

Generated by Claude Code

saksham-nexla · 2026-05-09T05:07:22Z

+            Returns:
+                The new list of ``DocContainer`` entries as parsed from the
+                server response.


[nit] Returns: section is mis-indented inside Args:

Issue: The Returns: block (lines 217-219) is indented at 12 spaces — the same level as individual parameter descriptions — rather than 8 spaces (same level as Args:). Sphinx/autodoc and most docstring parsers treat it as a continuation of the docs parameter description rather than a top-level section, so rendered API docs will look broken.

Suggestion: Dedent Returns: and its body by one level:

Args: set_id: Nexset ID. docs: ... Returns: The new list of ``DocContainer`` entries as parsed from the server response.

Generated by Claude Code

saksham-nexla · 2026-05-09T05:07:28Z

+            return []
+        payload = [
+            DocContainerInput.model_validate(
+                doc.model_dump(exclude_none=True)


[minor] copy_docs raises ValidationError when a source doc has text=None

Issue: DocContainer.text is Optional[str], so the API can return null for text. When that happens, model_dump(exclude_none=True) omits text from the dict entirely. DocContainerInput.model_validate(...) then raises a ValidationError ("field required") because DocContainerInput.text: str has no default. The caller gets an unhandled exception mid-copy with no indication of which doc caused it.

Suggestion: Drop exclude_none=True from the model_dump call here. With DocContainerInput's extra="ignore" policy, server-owned fields are still stripped regardless. text=None will then cause a type-validation failure (str doesn't accept None), which is arguably correct — but at least the error message will be clearer. Alternatively, pre-filter docs with None text and document the behaviour:

valid_docs = [doc for doc in source_docs if doc.text is not None] if not valid_docs: return [] payload = [ DocContainerInput.model_validate(doc.model_dump()) for doc in valid_docs ]

Generated by Claude Code

saksham-nexla · 2026-05-09T05:07:32Z

+    ``public``, ``tags``, ``created_at`` etc.) are silently ignored on
+    construction so a full ``DocContainer`` response can be round-tripped
+    through this model to drop fields the API does not accept on input.
+    """


[nit] Docstring doesn't explain why repo_type / repo_config are excluded

Issue: The class docstring lists id, owner, org, public, tags, and created_at as intentionally excluded, but doesn't mention repo_type or repo_config. The PR description documents the reasoning for public and tags specifically; a future maintainer seeing a 400 from a POST with repo_type included will have no breadcrumb here.

Suggestion: Add a brief note (matching the existing PR description rationale) about repo_type / repo_config:

Server-owned and read-only fields (``id``, ``owner``, ``org``, ``public``, ``tags``, ``created_at`` etc.) are silently ignored on construction so a full ``DocContainer`` response can be round-tripped through this model to drop fields the API does not accept on input. ``repo_type`` and ``repo_config`` are also excluded pending confirmation that the API accepts them on write.

Generated by Claude Code

saksham-nexla · 2026-05-09T05:07:37Z

+        response = self._make_request("GET", path, params=params)
+        if isinstance(response, list):
+            return [DocContainer.model_validate(item) for item in response]
+        return []


[nit] Silent return [] on non-list response (same applies to update_docs at line 242)

Issue: If the API ever returns a non-list shape (e.g. a dict or null) for these endpoints — say, during a server-side change or transient error that slips past HTTP-level checking — both list_docs and update_docs silently return []. The caller receives an empty result and has no way to distinguish "no docs" from "unexpected response".

This pattern is consistent with the rest of the SDK (get_accessors, etc.), so it's not blocking. Worth tracking as a broader SDK concern: consider adding a warning log or raising NexlaError when isinstance(response, list) is unexpectedly False.

Generated by Claude Code

tariqalagha-nexla requested a review from saksham-nexla May 9, 2026 05:02

tariqalagha-nexla merged commit 89fdc84 into nexla-opensource:main May 9, 2026
7 checks passed

saksham-nexla approved these changes May 9, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add list_docs/update_docs/copy_docs methods on NexsetsResource#17

feat: add list_docs/update_docs/copy_docs methods on NexsetsResource#17
tariqalagha-nexla merged 1 commit intonexla-opensource:mainfrom
tariqalagha-nexla:feat/nexset-doc-methods

tariqalagha-nexla commented May 9, 2026

Uh oh!

Uh oh!

saksham-nexla left a comment

Uh oh!

saksham-nexla May 9, 2026

Uh oh!

saksham-nexla May 9, 2026

Uh oh!

saksham-nexla May 9, 2026

Uh oh!

saksham-nexla May 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

tariqalagha-nexla commented May 9, 2026

Summary

Changes

Notable choices caught during testing

Test plan

Uh oh!

Uh oh!

saksham-nexla left a comment

Choose a reason for hiding this comment

Review Summary

Uh oh!

saksham-nexla May 9, 2026

Choose a reason for hiding this comment

Uh oh!

saksham-nexla May 9, 2026

Choose a reason for hiding this comment

Uh oh!

saksham-nexla May 9, 2026

Choose a reason for hiding this comment

Uh oh!

saksham-nexla May 9, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants