json: support integer minimum, maximum, exclusiveMinimum, exclusiveMaximum#7797
json: support integer minimum, maximum, exclusiveMinimum, exclusiveMaximum#7797ochafik merged 30 commits intoggml-org:masterfrom
json: support integer minimum, maximum, exclusiveMinimum, exclusiveMaximum#7797Conversation
{"type": "integer", "minimum": 3}
root ::= ([0-2] [0-9]{1,15} | [3-9] [0-9]{0,15}) spaceThis rule matches strings like |
@p-e-w Oops good catch, thanks! That was one of the "heading zeros allowed" I had on my radar, hadn't realized it was also letting through bad numbers. Fixed. |
* Adding simple bare-bones test for end-to-end integration test for json validation against auto-generated JSON-schema grammars. * Adding additional examples as documented in #7789 . Also adding the ability to automatically output improperly failing grammars to debug output files so they can more easily be examined in the gbnf-validator program. * Uncommenting formerly commented tests so that they fail for others who are attempting to reproduce the bugs. * Merging improved schema test methods added by @ochafik in #7797 * Adding #define to temporarily remove failing tests so that this PR can pass CI, but still be useful for other PRs that want to leverage the framework. * Fixing nits from ochafik. Removing escape slashes, adding additional failing cases, fixing some other strings. * Fixing grammar indentation to be consistent throughout file.
json: support integer minimum, maximum, exclusiveMinimum, exclusiveMaximumjson: support integer minimum, maximum, exclusiveMinimum, exclusiveMaximum
tests/test-grammar-integration.cpp
Outdated
| ); | ||
|
|
||
| test_schema( | ||
| "min 0", |
There was a problem hiding this comment.
This schema looks identical with the "simple min 0" test from line 154. Can they be merged together to shorten this file?
tests/test-grammar-integration.cpp
Outdated
| ); | ||
|
|
||
| test_schema( | ||
| "min -123 max 42", |
There was a problem hiding this comment.
This schema looks identical to the one from the "simple min -123 max 42" test on line 249. Can they be merged together?
There was a problem hiding this comment.
Oops, reshuffled / merged them all, thanks!
common/json-schema-to-grammar.cpp
Outdated
| static std::string repeat(const std::string & str, size_t n); | ||
|
|
||
| static std::string build_repetition(const std::string & item_rule, int min_items, int max_items, const std::string & separator_rule = "") { | ||
| static std::string build_repetition(const std::string & item_rule, const int min_items, int max_items, const std::string & separator_rule = "") { |
There was a problem hiding this comment.
If you're changing min_items to const, is it worth adding const to max_items as well?
There was a problem hiding this comment.
Dropped unintended change, thanks!
tests/test-grammar-integration.cpp
Outdated
| #include <vector> | ||
| #include <json-schema-to-grammar.h> | ||
|
|
||
| using json = nlohmann::ordered_json; |
There was a problem hiding this comment.
This might be a fault of the way I merged the code, but we don't need these 3 lines because json-schema-to-grammar.h is already included, and this using line is already here.
There was a problem hiding this comment.
Oops sorry bad merge, fixed
…aviour of libstdc++ vs. clang's libc++, can't read final NULL char w/ former)
HanClinto
left a comment
There was a problem hiding this comment.
Nice defensive programming!
|
Very excited to see these changes getting merged in!! :) 🚀 🚀 🚀 |
…Maximum (ggml-org#7797) * json: support minimum for positive integer values * json: fix min 0 * json: min + max integer constraints * json: handle negative min / max integer bounds * json: fix missing paren min/max bug * json: proper paren fix * json: integration test for schemas * json: fix bounds tests * Update json-schema-to-grammar.cpp * json: fix negative max * json: fix negative min (w/ more than 1 digit) * Update test-grammar-integration.cpp * json: nit: move string rules together * json: port min/max integer support to Python & JS * nit: move + rename _build_min_max_int * fix min in [1, 9] * Update test-grammar-integration.cpp * add C++11-compatible replacement for std::string_view * add min/max constrained int field to pydantic json schema example * fix merge * json: add integration tests for min/max bounds * reshuffle/merge min/max integ test cases * nits / cleanups * defensive code against string out of bounds (apparently different behaviour of libstdc++ vs. clang's libc++, can't read final NULL char w/ former)
Building upon #6640, I went on a wild goose chase wondering whether one could support numeric bounds with grammars, and I'm relatively happy with the provisional outcome.
This is still a WIP, here are some TODOs:
Possible follow ups:
/cc @HanClinto this since you mentioned it in #7789 :-D