-
Notifications
You must be signed in to change notification settings - Fork 3.5k
[JS] FlexBuffers Support #5973
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[JS] FlexBuffers Support #5973
Conversation
…lder Dart version
[] operator throws a very descriptive exception in case of a bad key.
aardappel
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for creating this! Have a few comments though :)
| }; | ||
|
|
||
| flexbuffers.builder = (size = 2048) => { | ||
| let buffer = new ArrayBuffer(size > 0 ? size : 2048); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just let the ArrayBuffer constructor throw something if someone passes a negative size?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh there is another purpose in this expression. The grow buffer algorithm will not work properly if the ArrayBuffer size is 0. So I would rather keep it here then checking each time I need to grow the buffer. It is also called just once per buffer creation so IMHO no big performance penalty.
| if (finished) { | ||
| throw "Adding values after finish is prohibited"; | ||
| } | ||
| if (stackPointers.length !== 0 && stackPointers[stackPointers.length - 1].isVector === false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not do this check as a map is being serialized, which is much cheaper. See C++ implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because IMHO it is better to keep cause and effect close together. This way the exception is thrown when I add the value so I will know how to fix it instantly. If the exception is thrown when the map is being serialized, I know I did something wrong, but I have no idea what exactly so I will have to manually go through my builder code and search for one or multiple bugs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for errors that are not likely to be frequent, I'd rather have a focus on performance.
js/flexbuffers.js
Outdated
| return; | ||
| } | ||
| if (cache && indirectIntCache.hasOwnProperty(value)) { | ||
| stack.push(indirectIntCache[value]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's a cache for indirect int values? why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's based on following API:
addInt: function(value, indirect = false, cache = false)
If I want to add an int indirectly, I can also cache it so if I add another int with the same value it will be reused. I think there is similar strategy in C++ but there user need to store the offset and provide it. It is faster but not as safe as users might provide a wrong offset. As JS is in general a language where it is easier to make mistakes ;) I chose to provide an API which provides smaller surface for errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but these lookups are expensive and potentially use a lot of memory. I'd say they should be optional, but in the case of indirect ints I think not having them at all is better.
The FlatBuffers JS API also doesn't have any check for wrong offsets, it is a very high cost to pay to try and make this foolproof.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah agree, this is also way the deduplication strategy is set do false per default. To be honest I don't thin that many users will use a builder in JS, I assume >90% users will just call encode with a dict or array.
|
@krojew do you mind giving this a quick look? |
|
I would very much like to see it implemented in TS, rather than JS. This way we would have safer code, TS implementation covered and can compile to JS. |
| const byteWidth = 1 << (packedType & 3); | ||
| const valueType = packedType >> 2; | ||
| let length = -1; | ||
| return { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be better done as a proper class to avoid constructing new objects and closures on each construction.
| return offsetStackValue(vecOffset, flexbuffers.ValueType.VECTOR, bitWidth); | ||
| } | ||
|
|
||
| function StackValue(type, width, value, _offset) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as in other places, prefer real classes instead.
js/flexbuffers.js
Outdated
| flexbuffers.BitWidth.width = (value) => { | ||
| if (Number.isInteger(value)) { | ||
| const v = value < 0 ? value * -1 : value; | ||
| if (v >> 7 === 0) return flexbuffers.BitWidth.WIDTH8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments from others in my team (not me, I have no idea):
"That's just incorrect, e.g. for v = 0x100000001 which is not an 8-bit integer but would take the return here."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
C++ version:
flatbuffers/include/flatbuffers/flexbuffers.h
Lines 183 to 193 in 9abb2ec
| inline BitWidth WidthU(uint64_t u) { | |
| #define FLATBUFFERS_GET_FIELD_BIT_WIDTH(value, width) \ | |
| { \ | |
| if (!((u) & ~((1ULL << (width)) - 1ULL))) return BIT_WIDTH_##width; \ | |
| } | |
| FLATBUFFERS_GET_FIELD_BIT_WIDTH(u, 8); | |
| FLATBUFFERS_GET_FIELD_BIT_WIDTH(u, 16); | |
| FLATBUFFERS_GET_FIELD_BIT_WIDTH(u, 32); | |
| #undef FLATBUFFERS_GET_FIELD_BIT_WIDTH | |
| return BIT_WIDTH_64; | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure why I felt a macro was necessary.. apologies :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well Number type is really broken in JavaScript. The C++ way you described, does not work either ;).
Good news, there is BigInt type, which works in a sane way. Bad news it is currently not supported by all browsers. Most notable Safari, on desktop and on iOS. But the version of macOS and iOS which are introduced yesterday and will be rolled out this fall include Safari which supports BigInt. Generally we need BigInt to to build a buffer, reading will work fine until we need to access a value which is stored as an 8 byte int / uint. If some one have a great idea for fixing it. You have my attention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bigint is also going to be slow, as it will be implemented as a dynamically allocated object in many cases.
I have no idea how to solve these kinds of problems in JS. From what I hear, to do bit-twiddling, you can basically rely on that working for 32-bit integers.. any bigger integers and you'd have to split it up manually. That's what FlatBuffers does for 64-bit numbers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed BigInt / BigUInt conversions when checking for int / unit width and when reading int / uint it is used only if supported by the platform.
I was thinking about TypeScript as well. But the request in #5949 was explicitly for JavaScript. And the infrastructure code for FlatBuffers is also written in JavaScript, even though there is an option in flatc to generate TypeScript explicitly. So bottom line, I guess I could port the JS code to TS. But only if people agree that it is really worth it. |
Given you get better maintainability with safer code and cover two implementations at once - I would say it's worth it :) |
…and ValueTypeUtil accordingly. Removing defensive checks. Introducing fix for large int numbers by converting them to BigInt type. Introducing handling for BigInt type in `add` method. Using TextEncoder and Decoder to handle string / utf8 conversion.
…of keys off while building FlexBuffer. Implements quick sort and choses quick sort if the number of keys is bigger then 20. Removes unnecessary dict lookups in BitWidthUtil helper functions
|
Sorry for the radio silence. I did a bit of refactoring to address some of the feedback. I don't think that I will take the time and write a TypeScript port, given that the user facing API is very slim. There was also a suggestion to use JS classes and not just objects and functions. Also for a real performance critical scenarios like creation of buffer on Node.js I would rather recommend to have a WebAssembly (Rust, C++ bridge). I see this implementation as a simple single file drop in solution. |
|
@aardappel could you please point me to issues which need more attention from me. |
|
@mzaks well, I don't really know what "good JS" looks like, so that is why I am hoping you've taken into account what the above reviewers say, who know much better than me. As far as the actual code structure is concerned, I would want it to match the C++ implementation as closely as possible, as I know that has a) gone through a lot of testing, and b) encodes the intended semantics of the serialization format and API as closely as possible. But I cannot go line by line and compare this PR with the C++ code, that is something you'll have to do. If you're confident you have done that, we can merge, if there's not any other reviewers. Maybe @CasperN @paulovap @dmitriykovalev who all worked on FlexBuffers implementations recently want to have a quick look. |
@aardappel I am confident that this port is functional and is interoperable with C++ produced binary, based on the unit tests I wrote. One of them is also based on the golden FlexBuffer, which was produced during Rust FlexBuffer development. In regards to "good JS" I have to be honest and say that JS was never my "focus" language. I use it for over a decade but very sporadically. So there are probably thing which can be improved upon and I am happy to review the changes, but IMHO current implementation and unit tests are a good functioning starting point. As I mentioned in a previous comment, for a heavy use case I would anyways recommend to have a WebAssembly version, which can be derived from C++, Rust, or I could write a Lobster implementation just for fun of it 😀. |
I've definitely thought of that, but since Lobster already incorporates some of the C++ FlatBuffers code, it be most attractive to make it use the C++ FlexBuffer code as well. |
|
@aardappel Seems to me you don't have any issues with the code but still blocking it? Why not invite some of the others as reviewers that you mentioned to look into it so they can put block instead? |
|
@normano I am not blocking anything. I already invited them, and was simply waiting to see if any would comment. I guess they don't, so lets merge this as-is. |
* Adding FlexBuffers support for Dart language * Introduce snapshot method. * Fix docu * Replacing extension methods with static methods in order to support older Dart version * Improving code based on PR feedback. Mainly rename refactoring. * Addressing all PR feedback which does not need clarification * exchange dynamic type with Object * Adds better API documentation. [] operator throws a very descriptive exception in case of a bad key. * Implementation of JavaScript FlexBuffers decoder * implements JS FlexBuffers builder * replacing _toF32 with Math.fround * Introducing test for BigInt number * Moving functions from BitWitdth & ValueType object into BitWidthUtil and ValueTypeUtil accordingly. Removing defensive checks. Introducing fix for large int numbers by converting them to BigInt type. Introducing handling for BigInt type in `add` method. Using TextEncoder and Decoder to handle string / utf8 conversion. * rename variable * Lets user turn deduplication strategies for strings, keys and vector of keys off while building FlexBuffer. Implements quick sort and choses quick sort if the number of keys is bigger then 20. Removes unnecessary dict lookups in BitWidthUtil helper functions * make iwidth and uwidth computation simpler and faster * Making redInt and readUint a bit faster and shim the BigInt / BigUint usage
This PR introduces FlexBuffers support for JavaScript.