Initial impl of BytecodeIonReader#1136
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## ion11 #1136 +/- ##
========================================
Coverage ? 69.20%
Complexity ? 6298
========================================
Files ? 202
Lines ? 25486
Branches ? 4418
========================================
Hits ? 17637
Misses ? 6474
Partials ? 1375 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
austnwil
left a comment
There was a problem hiding this comment.
This all looks pretty solid to me. Nice!
| private set | ||
|
|
||
| private val context = EncodingContextManager() | ||
| private var symbolTable: Array<String?> = EncodingContextManager.SYSTEM_SYMBOLS |
There was a problem hiding this comment.
Why have a symbol table and constant pool both here and in EncodingContextManager? I assume the symbol table here is planned to be removed and the one in the context manager used instead?
There was a problem hiding this comment.
I'm still figuring some of that out, but I think that the symbol table and constant pool in the EncodingContextManager will be a class that wraps an array and manages resizing the array, etc.
Whereas the BytecodeIonReader will be granted "unsafe" access to the underlying array in order to reduce overhead in the hot code paths.
| private var isInStruct = false | ||
| private var isNextAlreadyLoaded = false | ||
|
|
||
| private val containerStack = _Private_RecyclingStack(10) { ContainerInfo() } |
There was a problem hiding this comment.
Just noting here that the recycling data structures were removed on main in favor of managing ContainerInfo arrays directly to improve performance for deeply nested containers.
There was a problem hiding this comment.
I'm aware. Once we're further along, we can run tests to see if we get similar results here. I'm not surprised that it made a difference in #1101, but another consideration is the object size. I'm trying to keep this class small (64 bytes) so that it can fit in a single cache line. Once it's complete, we can experiment with adding more members to this class to see if the performance improvement from reducing overhead outweighs any potential performance penalty from having a larger object size.
| private var instruction = 0 | ||
| private var fieldNameIndex = -1 | ||
| private var annotationsIndex = -1 | ||
| private var annotationCount: Byte = 0 |
There was a problem hiding this comment.
Curious why is this a byte? toByte() is called when assigning it and toInt() is called every time it is used.
There was a problem hiding this comment.
I'm hoping to keep the memory footprint of the class no more than 64 bytes so that it can fit in a single cache line. This is an easy way to help keep it smaller.
| * Generators may be configured to expose some or all macro invocations. | ||
| * Generators should expose macro invocations using the `INVOKE` instruction | ||
| * The `INVOKE` instruction is followed by one argument for each parameter in the signature (making any "no argument" expressions explicit). | ||
| * The arguments are followed by the `END_CONTAINER` instruction. (TODO: Should we have a distinct `END_INVOKE` instruction?) |
There was a problem hiding this comment.
I think it makes sense to have a separate END_INVOKE if you already have a separate END_TEMPLATE, but I'm pretty indifferent.
There was a problem hiding this comment.
I agree, but I've left a TODO because I think we can defer this until we are exposing un-evaluated macro invocations.
| } | ||
|
|
||
| @Test | ||
| fun `after reading I_END_OF_INPUT, it should be possible to check for more input`() { |
There was a problem hiding this comment.
This is not related to this PR, but I was pretty surprised to see that this is supported functionality. This TODO exists in the bytecode reference doc, and if it is possible to read more data after END_OF_INPUT then I would be inclined to rename it to this eventually, since I don't think END_OF_INPUT communicates this well.
* OR rename `END_OF_INPUT` to `NEEDS_DATA`?
There was a problem hiding this comment.
Sure, we can rename this in another PR.
| override fun bigIntegerValue(): BigInteger { | ||
| val instruction = this.instruction | ||
| val op = Instructions.toOperation(instruction) | ||
| var i = bytecodeI | ||
| val bigInt = when (op) { | ||
| Operation.OP_INT_I16 -> Instructions.getData(instruction).toBigInteger() | ||
| Operation.OP_INT_I32 -> bytecode[i].toLong().toBigInteger() | ||
| Operation.OP_INT_I64 -> { | ||
| val msb = bytecode[i++].toLong() and 0xFFFFFFFF | ||
| val lsb = bytecode[i].toLong() and 0xFFFFFFFF | ||
| ((msb shl 32) or lsb).toBigInteger() | ||
| } | ||
| Operation.OP_INT_CP -> { | ||
| val cpIndex = Instructions.getData(instruction) | ||
| constantPool[cpIndex] as BigInteger | ||
| } | ||
| Operation.OP_INT_REF -> { | ||
| val length = Instructions.getData(instruction) | ||
| val position = bytecode[bytecodeI] | ||
| generator.readBigIntegerReference(position, length) | ||
| } |
There was a problem hiding this comment.
I know we haven't landed on the exact null behavior for these yet, but I feel like it is reasonable to at least support null.int here. According to your research all the current implementations exhibit this behavior (same for null.float)
You can ignore this though if you want to wait for a complete solution to these null questions.
| override fun bigIntegerValue(): BigInteger { | |
| val instruction = this.instruction | |
| val op = Instructions.toOperation(instruction) | |
| var i = bytecodeI | |
| val bigInt = when (op) { | |
| Operation.OP_INT_I16 -> Instructions.getData(instruction).toBigInteger() | |
| Operation.OP_INT_I32 -> bytecode[i].toLong().toBigInteger() | |
| Operation.OP_INT_I64 -> { | |
| val msb = bytecode[i++].toLong() and 0xFFFFFFFF | |
| val lsb = bytecode[i].toLong() and 0xFFFFFFFF | |
| ((msb shl 32) or lsb).toBigInteger() | |
| } | |
| Operation.OP_INT_CP -> { | |
| val cpIndex = Instructions.getData(instruction) | |
| constantPool[cpIndex] as BigInteger | |
| } | |
| Operation.OP_INT_REF -> { | |
| val length = Instructions.getData(instruction) | |
| val position = bytecode[bytecodeI] | |
| generator.readBigIntegerReference(position, length) | |
| } | |
| override fun bigIntegerValue(): BigInteger? { | |
| val instruction = this.instruction | |
| val op = Instructions.toOperation(instruction) | |
| var i = bytecodeI | |
| val bigInt = when (op) { | |
| Operation.OP_INT_I16 -> Instructions.getData(instruction).toBigInteger() | |
| Operation.OP_INT_I32 -> bytecode[i].toLong().toBigInteger() | |
| Operation.OP_INT_I64 -> { | |
| val msb = bytecode[i++].toLong() and 0xFFFFFFFF | |
| val lsb = bytecode[i].toLong() and 0xFFFFFFFF | |
| ((msb shl 32) or lsb).toBigInteger() | |
| } | |
| Operation.OP_INT_CP -> { | |
| val cpIndex = Instructions.getData(instruction) | |
| constantPool[cpIndex] as BigInteger | |
| } | |
| Operation.OP_INT_REF -> { | |
| val length = Instructions.getData(instruction) | |
| val position = bytecode[bytecodeI] | |
| generator.readBigIntegerReference(position, length) | |
| } | |
| Operation.OP_NULL_INT -> null |
There was a problem hiding this comment.
I'm going to leave nulls for later.
Co-authored-by: Austin Williams <austnwil@amazon.com>
Issue #, if available:
None
Description of changes:
Starts implementation of
BytecodeIonReaderEncodingContextManagerinstruction_reference.mdwith some more information about integers and unevaluated macro invocations, and cleans up some formatting.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.