stdlib: json, expose some internals for jsonStringify users #20962
Arwalk wants to merge 1 commit into ziglang:master
Force-pushed from dff2674 to d0e8171.
Are you able to use |
Sorry, I wasn't explicit enough about the goal: encoding directly into the JSON stream without unnecessary allocations. Print would have worked, but it would still have required me to encode to base64 first and then use print to insert the result into the stream. Since the whole point of base64 is to encode binary data that can be quite large, this would have effectively doubled the memory footprint here.
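To make the memory argument concrete, here is a minimal sketch of chunked base64 streaming into a single JSON string value. It is written in Python as a language-neutral model, not in Zig and not against the std.json API; `write_base64_json_string` and its parameters are hypothetical names chosen for illustration. Because base64 works on 3-byte groups, at most 2 bytes need to be carried over between chunks, so the fully encoded string is never held in memory.

```python
import base64
import io


def write_base64_json_string(out, chunks):
    """Stream `chunks` (an iterable of bytes) into `out` as one JSON string.

    Hypothetical helper: models the idea only, not any real library API.
    """
    out.write('"')
    pending = b""  # at most 2 leftover bytes carried between chunks
    for chunk in chunks:
        data = pending + chunk
        usable = len(data) - (len(data) % 3)  # base64 encodes 3-byte groups
        out.write(base64.b64encode(data[:usable]).decode("ascii"))
        pending = data[usable:]
    # Only the final group may need '=' padding.
    out.write(base64.b64encode(pending).decode("ascii"))
    # The base64 alphabet contains no characters that need JSON escaping.
    out.write('"')


# Usage: feed the payload in 7-byte chunks (deliberately not a multiple of 3).
payload = bytes(range(256)) * 3
buf = io.StringIO()
write_base64_json_string(buf, (payload[i:i + 7] for i in range(0, len(payload), 7)))
```

The streamed output is identical to encoding the whole payload at once and wrapping it in quotes; the only difference is the peak memory needed to produce it.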
Ah, I see. You want to stream large amounts of data into the write stream as a single JSON value. So exposing the begin and end methods seems to make sense. Exposing these functions is more effort than just putting
If you make a rough plan, I'll be happy to participate a bit. Thanks for everything.
I thought about it for a little while, and there are not a lot of cases where you need this: only if you're encoding a very long string. Instead of opening up the internals like this, we could try a few other things (that are not mutually exclusive). Providing a new
Thanks for the offer! However, I'm not sure that collaboration makes sense for this. I don't want to turn away willing volunteers, but I hope the following explains why I'm not optimistic about the usefulness of collaboration.

The rough plan is a bit hard to explain, but it's something like this. Expose either the existing functions, as in this PR, or make dedicated public endpoints that also track a debug-mode-only boolean for whether the stream is in raw streaming write mode. Then either create a method that accepts a byte slice and writes directly to the output stream, or just explain to the caller that they can write directly to the internal field while in the special mode. Then consider whether the method version would be better than the field version, since a method could add an assertion using that debug-only boolean. Then consider how many other methods should have assertions on that debug-only boolean, to make sure we're entering and exiting the raw write mode correctly before using any other methods. Then document everything and write tests for it all, all the while reconsidering the whole approach and thinking of other potential solutions.

One more thing to work out is who's responsible for writing the beginning and ending quotes of the string. I'm leaning toward the caller needing to include them in what they write, because that also allows this mechanism to be used to write arbitrarily long numbers (which are allowed in JSON). Then there's the consideration that this mechanism could be used to write arbitrary preformatted JSON, which seems like a good feature rather than a red flag, I think. Whatever the outcome of this paragraph is should also be documented.

So articulating the process suggests that this is a lot of ... design work? It's not just going through a checklist of tasks, which makes this less suitable for collaboration.
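The plan above can be modelled with a toy write stream. This is a sketch in Python under stated assumptions, not the std.json implementation: the class and every method name (`begin_write_raw`, `write_raw`, `end_write_raw`, and so on) are hypothetical, the stream only handles arrays and integers, and Python's `assert` (which is stripped under `python -O`) stands in for the debug-mode-only boolean check. It follows the variant where the caller writes the quotes.

```python
import io


class ToyWriteStream:
    """Toy JSON writer: models only comma handling and the raw write mode."""

    def __init__(self, out):
        self.out = out
        self._need_comma = False
        self._in_raw = False  # stands in for the debug-mode-only boolean

    def _value_start(self):
        if self._need_comma:
            self.out.write(",")

    def begin_array(self):
        assert not self._in_raw, "begin_array during a raw write"
        self._value_start()
        self.out.write("[")
        self._need_comma = False

    def end_array(self):
        assert not self._in_raw, "end_array during a raw write"
        self.out.write("]")
        self._need_comma = True

    def write_int(self, n):
        assert not self._in_raw, "write_int during a raw write"
        self._value_start()
        self.out.write(str(n))
        self._need_comma = True

    def begin_write_raw(self):
        assert not self._in_raw, "raw write already in progress"
        self._value_start()
        self._in_raw = True

    def write_raw(self, text):
        assert self._in_raw, "write_raw outside a raw write"
        self.out.write(text)  # no quoting or escaping: the caller owns that

    def end_write_raw(self):
        assert self._in_raw, "end_write_raw without begin_write_raw"
        self._in_raw = False
        self._need_comma = True


buf = io.StringIO()
ws = ToyWriteStream(buf)
ws.begin_array()
ws.write_int(1)
ws.begin_write_raw()
ws.write_raw('"')      # the caller supplies the opening quote
ws.write_raw("aGVs")   # base64 of "hello", delivered in two pieces
ws.write_raw("bG8=")
ws.write_raw('"')      # ...and the closing quote
ws.end_write_raw()
ws.write_int(2)
ws.end_array()
# buf.getvalue() == '[1,"aGVsbG8=",2]'
```

Since the caller owns the quotes, the same three calls could emit an arbitrarily long number or preformatted JSON, which is the trade-off the comment describes.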
I've been working on finally cleaning up my json-diagnostics branch before working on this PR, so it's taking me more than a few days to get to it, but it is still on my queue to get to eventually.
I'd like to keep the call graph pointing from application code into stdlib code as much as possible, rather than having the library call an application function. I know this already isn't the case for the methods that customize serialization and deserialization, but I'd still prefer to keep that to a minimum (there are already problems with the existing custom functions, such as #16891). One advantage of keeping the application in control of initiating the write of the next chunk of the stream, rather than having a loop in the stdlib code, is that it's trivial to delay or cancel for whatever reason. I don't think we need to define any objects or interfaces to make this feature work, and I'd prefer not to. I know that everyone has a different opinion about what seems "simpler" and "easier to maintain", and I certainly have mine. Adding a bunch of assertions to every public method seems simpler and easier to maintain to me than creating a loop that calls an application function. I can try to explain why, but I'm not going to try to convince anyone to change their opinion. I do believe that my approach results in smaller generated code (in ReleaseFast) and a less tangled call graph (which I don't know how to define rigorously, but I expect that there's some way).
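The two call-graph shapes being compared can be sketched side by side. This is an illustrative Python model only; `library_driven`, `application_driven`, and their parameters are hypothetical names, and neither function corresponds to an existing or proposed std.json API.

```python
import io


def library_driven(out, next_chunk):
    """The library owns the loop and calls back into the application."""
    while (chunk := next_chunk()) is not None:
        out.write(chunk)


def application_driven(out, chunks, should_cancel):
    """The application owns the loop; the only library call is out.write."""
    for chunk in chunks:
        if should_cancel():
            return False  # stop between chunks, no callback protocol needed
        out.write(chunk)
    return True


# Library-driven: cancelling requires agreeing on a sentinel (here, None).
out1 = io.StringIO()
source = iter(["a", "b"])
library_driven(out1, lambda: next(source, None))

# Application-driven: the caller simply stops once two characters are written.
out2 = io.StringIO()
finished = application_driven(out2, ["a", "b", "c"], lambda: len(out2.getvalue()) >= 2)
```

In the second shape, delaying or cancelling is ordinary application control flow, which is the advantage the comment points to.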
I see your points. Of course, it's not an easy subject. #16891 is interesting, as we fall into this exact case in zig-protobuf too: the specification says that an implementation should offer some options in its encoding methods (such as writing enums with their int value instead of their name) and decoding methods (such as ignoring unknown keys/fields). Right now, we're not sure how to handle this; probably by encapsulating our initial structure inside another structure that holds the original data plus context, and putting jsonParse/jsonStringify there.

I hear you on the collaboration aspect, and agree that this is certainly a case where a single person working on the subject will go faster, but my offer still stands, at least to give you feedback on your work once you have a branch ready. Don't hesitate. Should we close this PR? And of course, thank you for your time.
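The "original data plus context" wrapper mentioned above might look roughly like this. It is a Python sketch of the pattern only, not zig-protobuf code: `EncodeOptions`, `enums_as_ints`, `WithContext`, and `to_json` are all hypothetical names, and Python's `json.dumps` with a `default` hook stands in for a custom stringify method.

```python
import enum
import json
from dataclasses import dataclass


class Color(enum.Enum):
    RED = 1
    GREEN = 2


@dataclass
class EncodeOptions:
    enums_as_ints: bool = False  # hypothetical option name


@dataclass
class WithContext:
    """Pairs the original data with the options the encoder needs."""

    value: dict
    options: EncodeOptions

    def to_json(self):
        def convert(obj):
            # Called by json.dumps for objects it cannot encode itself.
            if isinstance(obj, enum.Enum):
                return obj.value if self.options.enums_as_ints else obj.name
            raise TypeError(f"cannot encode {type(obj).__name__}")

        return json.dumps(self.value, default=convert)


message = {"color": Color.RED}
by_name = WithContext(message, EncodeOptions()).to_json()
by_int = WithContext(message, EncodeOptions(enums_as_ints=True)).to_json()
# by_name == '{"color": "RED"}' and by_int == '{"color": 1}'
```

The original structure stays untouched; only the wrapper knows about the options, which is why the custom encode and decode hooks would live there.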
No problem! Of course, some actual productive time would be better than just sketching ideas in a discussion, but hopefully that will come soon.
See #21155 |
The json library allows custom representations if an object implements the jsonStringify interface.

Some of the internals of the json library are exposed for users of this functionality, such as beginObject, beginArray... But nothing is available when the goal is to push a raw value (such as a string) directly into the stream.

This commit makes valueStart and valueDone public, so that people don't have to reimplement parts of the json lib themselves (and keep up with its development).

I ran into exactly this case in https://github.com/Arwalk/zig-protobuf when trying to implement JSON parsing/encoding: bytes fields are supposed to be base64 encoded, meaning I had to reimplement valueStart and valueDone just to be able to do my encoding.