Jtd test suite inclusion #140

Merged
simbo1905 merged 1 commit into jtd-gen from cursor/jtd-test-suite-inclusion-fe61
Feb 8, 2026
@simbo1905 (Owner)

What changed

  • Removed two large jtd-spec-validation.json files (totaling 9,390 lines) from the repository.
  • Introduced a shared JtdTestDataExtractor utility in both json-java21-jtd and json-java21-jtd-codegen modules.
  • Updated JtdSpecConformanceTest, CodegenSpecConformanceTest, JtdSpecIT, and CompilerSpecIT to use the JtdTestDataExtractor to extract test data from jtd-test-suite.zip at runtime.
  • The extraction is cached in target/test-data/ to prevent redundant operations within a single build.
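
The extract-once-then-reuse behaviour described above can be sketched roughly as follows. This is not the merged JtdTestDataExtractor: the class name ZipTestDataCache, the method ensureExtracted, and its parameters are hypothetical, and the sketch assumes the ZIP is read from a file path and that one known entry can serve as the "already extracted" marker.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/// Hypothetical sketch of a cached ZIP extraction; not the class from this PR.
final class ZipTestDataCache {
  private ZipTestDataCache() {}

  /// Extracts zipFile into targetDir unless expectedEntry is already present there.
  /// Returns the path of the extracted expectedEntry.
  static synchronized Path ensureExtracted(Path zipFile, Path targetDir, String expectedEntry)
      throws IOException {
    Path root = targetDir.toAbsolutePath().normalize();
    Path expected = root.resolve(expectedEntry).normalize();
    if (Files.exists(expected)) {
      return expected; // cache hit: an earlier call already extracted the archive
    }
    if (!Files.exists(zipFile)) {
      throw new IOException("Test suite ZIP not found: " + zipFile.toAbsolutePath());
    }
    try (ZipInputStream zis = new ZipInputStream(Files.newInputStream(zipFile))) {
      ZipEntry entry;
      while ((entry = zis.getNextEntry()) != null) {
        Path out = root.resolve(entry.getName()).normalize();
        // Reject entries such as "../x" that would land outside the target directory.
        if (!out.startsWith(root)) {
          throw new IOException("ZIP entry escapes target directory: " + entry.getName());
        }
        if (entry.isDirectory()) {
          Files.createDirectories(out);
        } else {
          Files.createDirectories(out.getParent());
          Files.copy(zis, out, StandardCopyOption.REPLACE_EXISTING);
        }
      }
    }
    if (!Files.exists(expected)) {
      throw new IOException("Entry missing after extraction: " + expectedEntry);
    }
    return expected;
  }
}
```

A test would call ensureExtracted once per class; later calls in the same build return immediately because the marker file already exists under the target directory.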

Why this change is needed

  • The original PR had a significantly inflated line count (over 9,000 lines) due to the direct commitment of the JTD specification test suite JSON files.
  • Committing large test data files directly is undesirable as it bloats the repository size and makes PRs difficult to review.
  • This change ensures that the test data is pulled and extracted on demand, keeping the repository clean and focused on code and documentation.

How were these changes tested

  • All existing unit and integration tests in the json-java21-jtd module were run (452 tests).
  • All tests passed, confirming that the new test data extraction mechanism functions correctly and does not introduce regressions.

Checklist

  • Code builds / passes tests
  • New tests added if needed (Existing tests were adapted to the new data loading mechanism)
  • Update to use CODING_STYLE_LLM.md conventions
  • Documentation updated if needed
  • AGENTS.md updated if appropriate

Augmented Intelligence Review:
This PR was prepared by an AI assistant based on the user's initial prompt to address the large file size in the original PR.



Previously, two copies of the 78KB jtd-spec-validation.json file were
committed to the repository (156KB total), bloating the PR and git history.

Changes:
- Created JtdTestDataExtractor utility class to extract test data from
  existing jtd-test-suite.zip at test runtime
- Updated JtdSpecConformanceTest and CodegenSpecConformanceTest to use
  extraction instead of classpath resources
- Updated JtdSpecIT and CompilerSpecIT to use shared extractor
- Deleted committed JSON files from both modules
- Codegen module references parent module's ZIP file

Testing:
- Run: ./mvnw -pl json-java21-jtd test
- All 452 tests pass (136 unit + 316 spec conformance)
- Test data is automatically extracted from ZIP on first run
- Reduces PR size by ~156KB (9,390 lines)

Co-authored-by: Simon Massey <simbo1905@users.noreply.github.com>
@gemini-code-assist

Summary of Changes

Hello @simbo1905, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors how the JTD specification test suite is managed within the project. By removing large, directly committed JSON test files and introducing a shared utility to extract them on demand from a ZIP archive, the change aims to significantly reduce repository bloat and streamline the review process for future contributions. This approach ensures that test data is always available when needed without being a permanent part of the source control.

Highlights

  • Repository Size Reduction: Removed two large jtd-spec-validation.json files (totaling 9,390 lines) from the repository, significantly reducing its size and improving PR reviewability.
  • Test Data Management: Introduced a new shared utility, JtdTestDataExtractor, in both json-java21-jtd and json-java21-jtd-codegen modules to handle on-demand extraction of the JTD specification test suite from a ZIP archive.
  • Test Suite Integration: Updated JtdSpecConformanceTest, CodegenSpecConformanceTest, JtdSpecIT, and CompilerSpecIT to utilize the JtdTestDataExtractor for dynamic loading of test data from jtd-test-suite.zip at runtime.
  • Performance Optimization: Implemented a caching mechanism for the extracted test data in target/test-data/ to prevent redundant extraction operations within a single build cycle.

Changelog

  • json-java21-jtd-codegen/src/test/java/json/java21/jtd/codegen/CodegenSpecConformanceTest.java
    • Modified the cases() method to load test data using JtdTestDataExtractor.getValidationTestDataStream() instead of directly reading jtd-spec-validation.json from the classpath.
  • json-java21-jtd-codegen/src/test/java/json/java21/jtd/codegen/JtdTestDataExtractor.java
    • Added a new utility class responsible for extracting the jtd-test-suite.zip into target/test-data if not already present. It provides methods to ensure extraction and get an InputStream for validation.json.
  • json-java21-jtd/src/test/java/json/java21/jtd/CompilerSpecIT.java
    • Refactored the extractTestData() method to delegate the extraction logic to JtdTestDataExtractor.ensureValidationTestData(), removing duplicated extraction code.
  • json-java21-jtd/src/test/java/json/java21/jtd/JtdSpecConformanceTest.java
    • Modified the cases() method to load test data using JtdTestDataExtractor.getValidationTestDataStream() instead of directly reading jtd-spec-validation.json from the classpath.
  • json-java21-jtd/src/test/java/json/java21/jtd/JtdSpecIT.java
    • Refactored the extractTestData() method to delegate the extraction logic to JtdTestDataExtractor.ensureValidationTestData(), removing duplicated extraction code.
  • json-java21-jtd/src/test/java/json/java21/jtd/JtdTestDataExtractor.java
    • Added a new utility class (similar to the codegen module's extractor) responsible for extracting the jtd-test-suite.zip into target/test-data if not already present. It provides methods to ensure extraction and get an InputStream for validation.json.

Activity

  • The pull request was initiated to address repository bloat caused by large committed test data files.
  • All 452 existing unit and integration tests in the json-java21-jtd module were executed and passed, confirming no regressions were introduced.
  • This pull request was prepared by an AI assistant based on the user's initial prompt.

@gemini-code-assist gemini-code-assist bot left a comment
Code Review

The pull request successfully removes large test data files from the repository by extracting them from a zip archive at test time. This is a great improvement for repository health. The introduction of a JtdTestDataExtractor utility centralizes the extraction logic. However, this utility class has been duplicated across two modules, which should be addressed to improve maintainability. Additionally, the implementation of the extractor relies on a hardcoded, versioned path to the test data file, which is brittle and may cause issues when the test suite is updated. My review includes suggestions to resolve these issues by sharing the test utility code between modules and making the file path resolution more dynamic.


/// Shared utility for extracting the JTD test suite from the embedded ZIP file.
/// Used by both unit tests and integration tests to avoid committing large JSON files.
final class JtdTestDataExtractor {

high

There is significant code duplication. This class is nearly identical to json-java21-jtd/src/test/java/json/java21/jtd/JtdTestDataExtractor.java. Duplicating utility code across modules makes maintenance harder.

Consider creating a shared test utility module or using Maven's test-jar feature to share this code from one module to the other. This would ensure that any future changes to the extraction logic only need to be made in one place.

This would also solve the problem of the fragile relative path used to locate the ZIP file (line 19), as the test resource could be loaded from the classpath via getResourceAsStream.
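
As a rough illustration of the classpath approach suggested here, the sketch below opens a resource with getResourceAsStream and reads the archive straight from the stream, so no relative file path is needed. The class ClasspathZip and its methods are hypothetical names, and it assumes the ZIP has been made visible on the test classpath (for example through a shared test-jar), which this PR does not do.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/// Hypothetical helper: reads a ZIP from the classpath instead of a relative path.
final class ClasspathZip {
  private ClasspathZip() {}

  /// Opens a classpath resource, failing with a clear message when it is absent.
  static InputStream open(Class<?> anchor, String resourceName) throws IOException {
    InputStream in = anchor.getResourceAsStream(resourceName);
    if (in == null) {
      throw new FileNotFoundException("Classpath resource not found: " + resourceName);
    }
    return in;
  }

  /// Lists the entry names of a ZIP supplied as a stream; closes the stream when done.
  static List<String> entryNames(InputStream zipStream) throws IOException {
    List<String> names = new ArrayList<>();
    try (ZipInputStream zis = new ZipInputStream(zipStream)) {
      ZipEntry entry;
      while ((entry = zis.getNextEntry()) != null) {
        names.add(entry.getName());
      }
    }
    return names;
  }
}
```

With the archive on the classpath, a call such as ClasspathZip.entryNames(ClasspathZip.open(SomeTest.class, "/jtd-test-suite.zip")) would work from either module regardless of the working directory; SomeTest is a placeholder for any test class.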

// Reference the ZIP file from the sibling json-java21-jtd module
private static final Path ZIP_FILE = Paths.get("../json-java21-jtd/src/test/resources/jtd-test-suite.zip");
private static final Path TARGET_DIR = Paths.get("target/test-data");
private static final Path VALIDATION_FILE = TARGET_DIR.resolve("json-typedef-spec-2025-09-27/tests/validation.json");

medium

The path to validation.json includes a hardcoded version string (json-typedef-spec-2025-09-27). This is brittle and will break if the test suite is updated to a new version. The extraction logic should dynamically locate the validation.json file within the extracted contents instead of relying on a hardcoded path. For example, you could scan the extracted directory for tests/validation.json after extraction.
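
One way to do the scan suggested above is to walk the extracted directory and match on the trailing path elements, which avoids naming the versioned top-level folder. This is a minimal sketch; ValidationFileLocator and find are hypothetical names, and it assumes exactly one tests/validation.json exists under the extraction root.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.stream.Stream;

/// Hypothetical helper: finds tests/validation.json without a hardcoded version folder.
final class ValidationFileLocator {
  private ValidationFileLocator() {}

  static Path find(Path extractedRoot) throws IOException {
    // Path.endsWith compares whole name elements, so "mytests/validation.json" does not match.
    Path suffix = Path.of("tests", "validation.json");
    try (Stream<Path> paths = Files.walk(extractedRoot)) {
      return paths
          .filter(Files::isRegularFile)
          .filter(p -> p.endsWith(suffix))
          .findFirst()
          .orElseThrow(() -> new NoSuchFileException(
              extractedRoot.toString(), null, "no tests/validation.json found"));
    }
  }
}
```

The constant VALIDATION_FILE could then be replaced by a lookup such as ValidationFileLocator.find(TARGET_DIR) after extraction, at the cost of one directory walk per run.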

private static final Logger LOG = Logger.getLogger("json.java21.jtd");
private static final Path ZIP_FILE = Paths.get("src/test/resources/jtd-test-suite.zip");
private static final Path TARGET_DIR = Paths.get("target/test-data");
private static final Path VALIDATION_FILE = TARGET_DIR.resolve("json-typedef-spec-2025-09-27/tests/validation.json");

medium

The path to validation.json includes a hardcoded version string (json-typedef-spec-2025-09-27). This is brittle and will break if the test suite is updated to a new version. The extraction logic should dynamically locate the validation.json file within the extracted contents instead of relying on a hardcoded path. For example, you could scan the extracted directory for tests/validation.json after extraction.

@simbo1905 simbo1905 marked this pull request as ready for review February 8, 2026 21:04
@simbo1905 simbo1905 merged commit 407129c into jtd-gen Feb 8, 2026
0 of 2 checks passed
@simbo1905 simbo1905 deleted the cursor/jtd-test-suite-inclusion-fe61 branch February 8, 2026 21:04
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b5dc08e42d

Comment on lines +18 to +20
// Reference the ZIP file from the sibling json-java21-jtd module
private static final Path ZIP_FILE = Paths.get("../json-java21-jtd/src/test/resources/jtd-test-suite.zip");
private static final Path TARGET_DIR = Paths.get("target/test-data");

P2: Avoid hard dependency on sibling module test data

The codegen test suite now hard-codes the ZIP path to ../json-java21-jtd/src/test/resources/jtd-test-suite.zip. If json-java21-jtd-codegen is built or tested in isolation (for example, from a source release or checkout that does not include the sibling module), this will throw JTD test suite ZIP not found and the conformance tests will fail immediately. Previously the module had its own jtd-spec-validation.json, so this is a regression in test portability. Consider keeping the ZIP in this module’s test resources or loading it from the classpath to keep tests self-contained.
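
If a module-local copy of the ZIP were kept, as suggested above, the lookup could prefer it and only fall back to the sibling module. The sketch below shows that ordering; ZipLocator and firstExisting are hypothetical names, and the idea of a fallback chain is an assumption for illustration rather than something the PR implements.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.stream.Stream;

/// Hypothetical helper: returns the first candidate path that is an existing regular file.
final class ZipLocator {
  private ZipLocator() {}

  static Optional<Path> firstExisting(Path... candidates) {
    return Stream.of(candidates)
        .filter(Files::isRegularFile)
        .findFirst();
  }
}
```

A caller might pass a module-local path first (for instance src/test/resources/jtd-test-suite.zip, which does not exist in the codegen module today) followed by the ../json-java21-jtd/... path from the PR, and fail with a descriptive message when the result is empty.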


2 participants