
Conversation

@dodeja (Contributor) commented Oct 22, 2025

Summary

  • add a docs/openapi/README.md that documents the modular authoring workflow, linting, and bundling commands
  • introduce a GitHub Actions workflow that lints the OpenAPI tree with Spectral and runs the bundling regression test on changes

Testing

  • python -m unittest discover tests

https://chatgpt.com/codex/tasks/task_b_68f863d34bfc8322a40d3a48c68b8962

macroscopeapp bot commented Oct 22, 2025

Document OpenAPI workflow and add CI linting for modular API specs and examples

📍Where to Start

Start with the OpenAPI root document at index.yaml to see paths and components, then review the bundling and validation logic in openapi_bundle.py and the CI workflow in openapi-validation.yml.

Changes since #145 opened

  • Added ignore patterns to version control configuration [277f7fb]
  • Migrated OpenAPI validation pipeline from Spectral and Redocly to Mintlify CLI [a180701]
  • Enhanced Python bundler robustness and pre-commit hook resilience [a180701]
  • Standardized development environment toolchain management [a180701]

📊 Macroscope summarized a180701. 4 files reviewed, 21 issues evaluated, 18 issues filtered, 1 comment posted

🗂️ Filtered Issues

scripts/pre-commit.sh — 0 comments posted, 3 evaluated, 3 filtered
  • line 28: The detection of modified OpenAPI sources only matches .yaml files using grep -q "^docs/openapi/.*\.yaml$" (lines 28–31). If the repo contains .yml files (a common alternative extension), changes to those will not trigger a bundle regeneration, potentially leaving docs/openapi.json out of sync. Consider matching both .yaml and .yml or explicitly rejecting .yml with a clear error to maintain consistency. [ Code style ]
  • line 36: Detecting whether the regenerated bundle changed uses git diff --name-only docs/openapi.json | grep -q "openapi.json" (lines 36–42), which only reports changes for tracked files. If docs/openapi.json is currently untracked (e.g., first creation or removed from the index), this check will not detect the change, and git add docs/openapi.json will not be performed, causing the commit to miss the newly generated bundle. Fix by ensuring the file is added when it exists and differs, even if untracked: for example, check file existence and always git add docs/openapi.json after regeneration, or use git status --porcelain -- docs/openapi.json to detect untracked/modified states. [ Already posted ]
  • line 64: The sync verification compares the working tree file docs/openapi.json instead of the staged version. In a pre-commit hook, validation must check the staged content to ensure what is being committed is correct. Using diff -q docs/openapi.json "$TEMP_BUNDLE" (lines 63–69) can produce false positives/negatives: if the user staged an earlier version and then edited the file without re-staging, the hook may reject or accept incorrectly. Fix by comparing against the staged blob (e.g., git show :docs/openapi.json or git cat-file -p :docs/openapi.json) or by using git diff --cached --quiet -- docs/openapi.json against the freshly generated bundle contents. [ Already posted ]
scripts/split_openapi.py — 0 comments posted, 5 evaluated, 5 filtered
  • line 49: Unvalidated JSON read can crash the script: json.loads(OPENAPI_JSON.read_text()) assumes docs/openapi.json exists and contains valid JSON. If the file is missing or unreadable, Path.read_text() raises an exception; if the content is not valid JSON, json.loads(...) raises JSONDecodeError. No error handling is present, so the process terminates mid-run after directories may have been created, leaving partial state. Add explicit existence checks and catch read/parse errors to fail with a clear message and avoid partial writes. [ Low confidence ]
  • line 58: Type assumptions on paths, components.schemas, and components.securitySchemes: The loops call .items() on data.get("paths", {}), data.get("components", {}).get("schemas", {}), and data.get("components", {}).get("securitySchemes", {}). If any of these values exist but are not dicts (e.g., null or lists due to malformed input), .items() will raise AttributeError. Add explicit type checks (e.g., isinstance(..., dict)) or normalize to {} when values are not dicts to ensure the script fails gracefully. [ Low confidence ]
  • line 60: Silent overwriting of existing output files: The script writes YAML files with deterministic names (e.g., root.yaml, {base}.yaml) without checking for existing files in docs/openapi/.... Re-running the script against a changed input can overwrite prior outputs without warning, potentially losing manual edits. Consider warning or failing when overwriting, or writing to a clean output directory. [ Code style ]
  • line 60: Potential runtime failure in write_yaml due to dependency on undefined dump_yaml: The imported write_yaml calls dump_yaml(data) according to tools/openapi_yaml.py. If dump_yaml is not defined in that module at runtime, any call to write_yaml(PATHS_DIR / filename, definition) will raise NameError. Ensure dump_yaml is defined/imported in tools/openapi_yaml.py, or replace with a safe serializer. [ Low confidence ]
  • line 78: Index may emit invalid OpenAPI when openapi or info are missing: The index dict sets "openapi": data.get("openapi") and "info": data.get("info"). If those keys are absent or falsy, the values None will be serialized into YAML as null, yielding an invalid OpenAPI document (both fields are required and must be specific types). Add validation to ensure these fields exist and have correct types before writing, and fail with a clear error otherwise. [ Low confidence ]
tools/openapi_bundle.py — 1 comment posted, 7 evaluated, 5 filtered
  • line 71: JSON Pointer fragments in $ref are not decoded per the spec. In _resolve, when a fragment is present it is split on '/' and each part is matched literally as a dict key, but JSON Pointer requires unescaping of ~1 to / and ~0 to ~. Without unescaping, valid references like #/components/schemas/Foo~1Bar or keys containing tildes will fail to resolve and incorrectly raise BundleError or leave unresolved references. [ Already posted ]
  • line 73: JSON Pointer array indices are not supported during fragment resolution. In _resolve, when descending into a fragment, only dictionaries are handled. If a fragment points into a list (e.g., #/paths/~1users/get/parameters/0), it should interpret numeric parts as list indices. Currently, any such reference will incorrectly raise BundleError as 'Fragment ... not found'. [ Already posted ]
  • line 104: Schema validation rejects valid JSON Schema boolean schemas. In _validate_schema_file, the function raises if data is not a dict. JSON Schema (and OpenAPI 3.1 which adopts it) allows a schema to be the boolean true or false. Such files would be incorrectly rejected with ValidationError. [ Low confidence ]
  • line 116: Schema validation rejects otherwise valid schema files that only contain $defs (or similar definition containers) at the top level. The check if not any(field in data for field in expected_fields) will fail when a file is intended solely as a container of reusable definitions and relies on fragment references like #/$defs/.... This is a legitimate pattern in JSON Schema/OpenAPI 3.1, and falsely rejecting these files causes bundling to fail. [ Low confidence ]
  • line 121: Error reporting in _validate_schema_file may raise an unexpected TypeError when constructing the 'Found keys' message. It does ', '.join(sorted(data.keys())), but if the YAML produced non-string keys or a mix of incomparable types, sorted(...) will raise, masking the intended ValidationError and altering control flow to a generic wrapper error. [ Already posted ]
tools/openapi_yaml.py — 0 comments posted, 6 evaluated, 5 filtered
  • line 87: _tokenize does not handle tab characters in indentation. It only strips spaces via raw.lstrip(" ") and computes indent based on spaces, so leading tabs remain in content and indent is calculated as 0. This causes lines like "\t- item" or "\t# comment" to be treated as top-level content rather than indented lines or comments, leading to misparsing or YamlError elsewhere. The function should either reject tabs explicitly or normalize tabs to spaces. [ Low confidence ]
  • line 131: _split_mapping_line only tracks double quotes and escape sequences, ignoring single-quoted strings. As a result, a colon inside a single-quoted key or value (e.g., 'a:b': 1 or key: 'a:b') will be treated as a separator because in_quotes toggling happens only for " characters. This can cause incorrect splitting or YamlError. Given the dumper emits only double-quoted strings, this is tolerable for round-trip of self-generated YAML, but it is a runtime bug when parsing valid external YAML. [ Already posted ]
  • line 158: A list item line consisting of only - with no indented child block is parsed as an empty dict {}. In _parse_block, for content == '-', it calls _parse_block on the next lines with indent + 1; if the next line is not more indented, the child _parse_block returns {} (via mode is None), and {} is appended to items. A dangling - should likely be None (null) or cause a parse error, not {}. [ Already posted ]
  • line 160: Inline mappings in list items are not supported: a line like - key: value is parsed as a scalar string (e.g., 'key: value') rather than as a mapping. For content.startswith("- "), the code does value = _parse_value(content[2:].strip()) and does not attempt to parse a mapping on the same line. This silently produces incorrect data for valid YAML, rather than raising an error or supporting the construct. [ Low confidence ]
  • line 191: load_yaml treats a str argument as raw YAML content instead of a filesystem path. Passing a string path (e.g., './file.yaml') will not read the file; it will parse the path string as YAML, usually causing YamlError. This is a footgun given the signature Path | str. The function should disambiguate, e.g., by accepting PathLike for paths and a separate API to load from a string, or by attempting to read when the string looks like a path. [ Code style ]
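
On that last point, a minimal sketch of the suggested disambiguation, assuming the module keeps its current parser as an internal function (here called _parse_yaml_text, a hypothetical name standing in for the existing parse entry point):

from __future__ import annotations

from os import PathLike
from pathlib import Path


def load_yaml(path: str | PathLike[str]) -> object:
    """Load YAML from a filesystem path; the argument is always treated as a path."""
    return loads_yaml(Path(path).read_text())


def loads_yaml(text: str) -> object:
    """Parse YAML from an in-memory string (mirrors the json.load/json.loads split)."""
    return _parse_yaml_text(text)  # hypothetical: the module's existing parser

Each entry point then has a single obvious behavior instead of guessing whether a str is a path or a document.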

cursor[bot]

This comment was marked as outdated.

fragment = fragment.lstrip("/")
for part in fragment.split("/"):
    if part:
        if isinstance(resolved, dict) and part in resolved:


Fragment resolution in _resolve doesn’t follow JSON Pointer rules: tokens aren’t unescaped (~1 → /, ~0 → ~), numeric path segments aren’t supported as list indexes, and empty tokens are skipped. Update _resolve to decode tokens per RFC 6901, treat numeric segments as array indices with bounds checking, and preserve empty keys so valid $ref fragments don’t fail.

-                        if isinstance(resolved, dict) and part in resolved:
-                            resolved = resolved[part]
-                        else:
-                            raise BundleError(f"Fragment '{fragment}' not found in {ref}")
+                        if isinstance(resolved, dict) and part in resolved:
+                            resolved = resolved[part]
+                        elif isinstance(resolved, list):
+                            try:
+                                idx = int(part)
+                            except ValueError:
+                                raise BundleError(f"Fragment '{fragment}' not found in {ref}")
+                            if 0 <= idx < len(resolved):
+                                resolved = resolved[idx]
+                            else:
+                                raise BundleError(f"Fragment '{fragment}' not found in {ref}")
+                        else:
+                            raise BundleError(f"Fragment '{fragment}' not found in {ref}")
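
For completeness, the token decoding the comment calls for is small, and the replacement order matters: decoding ~0 first would turn the token ~01 into /. A minimal sketch:

def _unescape_pointer_token(token: str) -> str:
    # RFC 6901: decode "~1" to "/" first, then "~0" to "~".
    # Reversing the order would wrongly decode "~01" as "/".
    return token.replace("~1", "/").replace("~0", "~")

Each part produced by fragment.split("/") would be passed through this helper before the dict lookup or list-index step above.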


    return True
if value.lower() in {"null", "true", "false"}:
    return True
if value and value[0] in "-?:[]{}#&,*!|>'\"%@`":


Numeric-like strings (e.g., "200", "0", "-1", "1.0", "1e3") are not quoted by _needs_quotes. In YAML 1.2, these plain scalars are parsed as numbers, so emitting them unquoted as mapping keys (like OpenAPI status codes) can change their type or break JSON-compatibility expectations.

Consider detecting number-like strings and forcing quotes to preserve string semantics, especially for mapping keys that must remain strings.

+    try:
+        float(value)
+        return True
+    except ValueError:
+        pass
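
One caveat: Python's float() also accepts "inf", "nan", "infinity", and surrounding whitespace, so those plain scalars would be quoted as well. That is conservative rather than wrong (quoting never changes a string's meaning), and it keeps mapping keys such as "200" unambiguously strings.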


if token == "{}":
    return {}
if token.startswith('"') and token.endswith('"'):
    return json.loads(token)


Quoted strings are parsed via json.loads without error handling, so invalid escapes raise JSONDecodeError instead of the parser’s YamlError. This breaks the parser’s error contract in both _parse_value and quoted map keys in _parse_block.

Consider catching json.JSONDecodeError at both sites and re-raising as YamlError with a concise message so invalid quoted strings consistently surface as YAML parse errors.

In _parse_value:

-        return json.loads(token)
+        try:
+            return json.loads(token)
+        except json.JSONDecodeError as e:
+            raise YamlError(f"Invalid quoted string: {e}") from e

In _parse_block (quoted map keys):

-        key = json.loads(key)
+        try:
+            key = json.loads(key)
+        except json.JSONDecodeError as e:
+            raise YamlError(f"Invalid quoted key: {e}") from e


if ch == '\\':
    escape = True
    continue
if ch == '"':


_split_mapping_line treats colons inside single-quoted scalars as separators because in_quotes only toggles on double quotes. This breaks cases like "'user:name': value" and values like "key: 'http://example.com:8080'".

Consider toggling in_quotes for both single and double quotes so colons within either quoted scalar are ignored.

-        if ch == '"':
-            in_quotes = not in_quotes
-            continue
+        if ch == '"' or ch == "'":
+            in_quotes = not in_quotes
+            continue
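
A design note on the suggestion: a single in_quotes flag toggled by both characters will also flip mid-string when a double-quoted scalar contains an apostrophe (e.g. "it's"). Tracking which character opened the scalar (a quote_char variable cleared only by its matching closer) avoids that edge case; since the dumper emits only double-quoted strings, the simpler toggle may be acceptable for round-tripping self-generated YAML.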


if mode == "map":
    raise YamlError("Cannot mix list and map entries at the same level")
mode = "list"
if content == "-":


A bare list item ('-') without an inline value or nested block is parsed as {} instead of None. In _parse_block, the recursive call returns an empty mapping when there are no indented lines (mode is None), so the list item becomes {}. YAML typically treats a bare item as null, leading to incorrect data types.

Consider detecting the absence of a nested block before recursing: if the next line is not more indented, set the list item value to None instead of recursing into _parse_block.

-            if content == "-":
-                value, index = _parse_block(lines, index, indent + 1)
-            else:
-                value = _parse_value(content[2:].strip())
-            items.append(value)
+            if content == "-":
+                if index < len(lines) and lines[index].indent > indent:
+                    value, index = _parse_block(lines, index, indent + 1)
+                else:
+                    value = None
+            else:
+                value = _parse_value(content[2:].strip())
+            items.append(value)


dodeja and others added 2 commits October 22, 2025 00:41
Fixed all 13 Spectral validation errors in OpenAPI specification:
- Added missing schema types to webhook_notifications included array (shipment, container, port, terminal, vessel, metro_area, rail_terminal)
- Converted relative URIs to absolute format (https://api.terminal49.com/v2/...)
- URL-encoded pagination parameters (page[size] → page%5Bsize%5D)
- Added nullable: true to optional relationship fields (shipment.data, location_name)
- Fixed ref_numbers array example (removed null value)
- Added "Shipping Lines" to global tags

Setup automated development workflow:
- Added justfile task runner with comprehensive commands (bundle, lint, watch, preview)
- Configured bun/npm scripts for bundling, linting, and watch mode
- Updated Python bundler to use standard yaml.safe_load (custom parser was too strict)
- Added enhanced validation with detailed error messages
- Setup chokidar file watcher for auto-bundling on YAML changes
- Configured Mintlify with explicit openapi.json reference
- Fixed pre-commit hook line endings (CRLF → LF)

Added comprehensive documentation:
- CLAUDE.md with complete project overview and workflows
- docs/openapi/DEVELOPER_GUIDE.md with detailed development instructions
- Enhanced README.md with modular OpenAPI workflow documentation
- Added inline validation error context and helpful tips

Tooling improvements:
- Added .spectral.yaml linter config (replaced .spectral.mjs)
- Added .redocly.yaml config for Redocly CLI
- Added .watchmanconfig for file watching
- Added bun.lock and package.json with dev dependencies
- Fixed containers-id.yaml indentation for custom parser compatibility

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
vercel bot commented Oct 22, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Preview | Comments | Updated (UTC)
api | Ready | Preview | Comment | Oct 23, 2025 5:36pm

"":
type: "string"
x-stoplight:
id: "kwcjunrtu3r5o"


Bug: Empty String Property Name Causes Validation Failures

The OpenAPI schema defines a property using an empty string ("") as its name. This is invalid for OpenAPI and JSON Schema, causing validation errors and potentially breaking tooling that processes the schema.


echo -e "${GREEN}✅ Bundle regenerated successfully${NC}"

# Check if the bundle changed
if git diff --name-only docs/openapi.json | grep -q "openapi.json"; then


docs/openapi.json won’t be staged if it’s newly created or previously deleted, because git diff --name-only ignores untracked files. On first bundle generation, the file is created but not added, so the commit misses the artifact and the repo falls out of sync.

Consider updating the staging condition to also detect untracked files (e.g., via git ls-files --error-unmatch), or simply always git add the bundled file after a successful generation so new files are captured.

-        if git diff --name-only docs/openapi.json | grep -q "openapi.json"; then
+        if git diff --name-only -- docs/openapi.json | grep -q "openapi.json" || ! git ls-files --error-unmatch docs/openapi.json >/dev/null 2>&1; then
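
Alternatively, git status --porcelain -- docs/openapi.json covers modified and untracked states in one check (untracked files show a ?? status), which keeps the staging condition to a single command.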


exit 1
}

# Compare the staged version with the freshly bundled version


The verification step checks the working tree docs/openapi.json instead of the staged blob. This can produce false positives/negatives when the index differs from the working copy, even though the hook is triggered by staged changes.

Consider comparing the staged blob by reading :docs/openapi.json into a temp file and diffing that against the freshly bundled output. Also ensure both temp files are cleaned up in both success and failure paths.

+    STAGED_FILE=$(mktemp)
+    git show :docs/openapi.json > "$STAGED_FILE" || {
+        echo -e "${RED}❌ Failed to read staged openapi.json${NC}"
+        rm -f "$TEMP_BUNDLE" "$STAGED_FILE"
+        exit 1
+    }
-    if ! diff -q docs/openapi.json "$TEMP_BUNDLE" > /dev/null 2>&1; then
+    if ! diff -q "$STAGED_FILE" "$TEMP_BUNDLE" > /dev/null 2>&1; then
         echo -e "${RED}❌ openapi.json is out of sync with YAML sources!${NC}"
         echo -e "${YELLOW}Run: python -m tools.openapi_bundle docs/openapi/index.yaml docs/openapi.json${NC}"
-        rm -f "$TEMP_BUNDLE"
+        rm -f "$TEMP_BUNDLE" "$STAGED_FILE"
         exit 1
     fi
 
-    rm -f "$TEMP_BUNDLE"
+    rm -f "$TEMP_BUNDLE" "$STAGED_FILE"


echo -e "${YELLOW}🔍 Verifying openapi.json is in sync...${NC}"

# Create a temp file with the freshly bundled version
TEMP_BUNDLE=$(mktemp)


mktemp failure isn’t handled. If mktemp fails, TEMP_BUNDLE is empty and later commands operate on an empty path. This can lead to confusing errors and an unsafe rm -f "$TEMP_BUNDLE".

Consider handling mktemp failure immediately and aborting with a clear message before using TEMP_BUNDLE anywhere.

-    TEMP_BUNDLE=$(mktemp)
+    TEMP_BUNDLE=$(mktemp) || {
+        echo -e "${RED}❌ Failed to create temporary file for verification${NC}"
+        exit 1
+    }


f"Schema file appears to be missing expected fields: {path}",
f"",
f"Expected at least one of: {', '.join(sorted(expected_fields))}",
f"Found keys: {', '.join(sorted(data.keys()))}",


', '.join(sorted(data.keys())) can raise TypeError when schema files have mixed-type or non-string keys, masking the intended ValidationError.

Consider converting keys to strings before sorting and joining so the error message is robust (e.g., use sorted(map(str, data.keys()))).

-            f"Found keys: {', '.join(sorted(data.keys()))}",
+            f"Found keys: {', '.join(sorted(map(str, data.keys())))}",


@dodeja changed the title from "Document OpenAPI workflow and add CI linting" to "Split OpenAPI.json into digest-able and editable YML files. Including OpenAPI lint and build workflow" on Oct 22, 2025
@mattyturner (Member) left a comment


just setup assumes I already have bun installed.

Ignore Vercel deployment artifacts and Claude Code configuration files.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@maurycy (Contributor) left a comment


import json

bundled = bundle_openapi(source, validate_schemas=validate_schemas)
destination.write_text(json.dumps(bundled, indent=2, sort_keys=False) + "\n")


yaml.safe_load can parse timestamp-like scalars (e.g., ISO dates) into datetime/date objects, which will cause json.dumps to fail with TypeError: Object of type date is not JSON serializable during write_bundle.

Consider either preventing implicit timestamp parsing in the YAML loader (e.g., a custom loader that treats timestamps as strings) or making JSON serialization tolerant of such objects by providing a default handler (e.g., default=str) so the bundle always serializes successfully.

-    destination.write_text(json.dumps(bundled, indent=2, sort_keys=False) + "\n")
+    destination.write_text(json.dumps(bundled, indent=2, sort_keys=False, default=str) + "\n")
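
If the loader-side option is preferred instead, one way (assuming PyYAML, which provides yaml.safe_load; the names here are illustrative) is a SafeLoader subclass with the implicit timestamp resolver removed, so ISO dates load as plain strings:

import yaml


class NoTimestampLoader(yaml.SafeLoader):
    """SafeLoader variant that leaves timestamp-like scalars as plain strings."""


# Rebuild the implicit resolver table without the timestamp tag so values
# like 2025-10-22 load as str instead of datetime.date.
NoTimestampLoader.yaml_implicit_resolvers = {
    first_char: [(tag, regexp) for tag, regexp in resolvers
                 if tag != "tag:yaml.org,2002:timestamp"]
    for first_char, resolvers in yaml.SafeLoader.yaml_implicit_resolvers.items()
}


def safe_load_no_timestamps(text: str) -> object:
    return yaml.load(text, Loader=NoTimestampLoader)

The loader-side fix preserves the source text exactly, while default=str changes serialization for every non-JSON type, not just dates.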

