When removeNullBytes is set, length calculations did not take into account null bytes.#17232
Conversation
…ccount null bytes.
There was a problem hiding this comment.
I think it would be cleaner if the tests are added in the following way:
- Remove the inherited file
StringFieldWritersWithNullHandlingTest.javasince it's not checking stuff with "null-handling". We are checking null-bytes, which is completely different than what Druid associates the term "null-handling" for. - In the
setUpofStringFieldWriterTest,there are a couple of field writers. There should be two more -fieldWriterWithRemoveNullBytes,fieldWriterUtf8WithRemoveNullBytes. - Replicate the
doTestmethod, withdoTestWithRemoveNullBytes. Call that method along withdoTestin the test cases that are present already. - Add a new test case that tests a String like "abc\u0000". I am surprised that you are able to replicate the behavior with the added test case since that is not what the patch is for.
doTestshould fail with this test case, because of null bytes in the string (which is not happening in this PR), and we should have an assertion for that error. ThedoTestWithRemoveNullByteswill pass. - Moreover, I think the memory is initialized with null bytes , so you'd have to fill up the memory with junk bytes that are not null. I am not sure if this is strictly required. We can check this, but I don't see any downside of having some randomly generated garbage data (without null) in the memory where the field writer is writing that will ensure that the test fails regardless of the initialization conditions.
What do you think of the above? Also, I have a couple of questions:
- Have I missed any reason for having a new class overriding the before method for the testcases and making the variables protected?
- Are you running into any errors with the test case that has been added, without the changes. I expect the test case to run fine because that is not what the patch is intended for, and it is certainly worth looking into if it is failing without these changes.
| doTest(Arrays.asList("foo", NullHandling.emptyToNullIfNeeded(""), "bar", null)); | ||
| } | ||
|
|
||
| @Test |
There was a problem hiding this comment.
This name is misleading because it tests "nulls" instead of "null bytes". The string should be like "foo\u0000" in order for it to have the null byte. Does this test fail without the changes to the StringFieldWriter?
There was a problem hiding this comment.
Just pushed up a new patch.
There was a problem hiding this comment.
I do not think a new file is required for the test case. This should be a helper method like doTest in the StringFieldWriterTest.
This will allow the reader to visualize the different test cases together. Both the helper methods would exhibit the same behavior except for strings where null bytes are present, where the doTest would throw and the newer one would silently eat up the null bytes.
|
I have added a new test case and machinery to test individual field writers. The new test case should be hitting the new code changes now. |
…count null bytes. (apache#17232) * When replaceNullBytes is set, length calculations did not take into account null bytes.
Fix for errros like :
Patch was prepared with help from @LakshSingla .
The test cases added here does not completely check the new code flow.