Fixed the non deterministic test testSerdeWithInputFormat #17325
yugvajani wants to merge 1 commit into apache:master
KafkaSupervisorSpec spec2 = mapper.readValue(serialized, KafkaSupervisorSpec.class);

String stable = mapper.writeValueAsString(spec2);
String sortedStableJson = sortJsonString(stable);
note: this seems like an unconditional re-sort at the JSON level for only this test, but I wonder if it only affects this test.
From the repro steps: NonDex has detected that the output here is unstable.
I don't know the Kafka part, but there might be lists/arrays for which order does matter; sorting all parts of a JSON document unconditionally may possibly hide issues.
I wonder if it was investigated what causes these mismatches. I suspect that some underlying class may use some unstable constructs; in most cases these used to boil down to a HashSet -> LinkedHashSet change...
Another possible way to fix this in a wider scope is to make the KafkaSupervisorSpec comparable and do the validation at that level, so that the real classes can decide whether two maps/sets are equal or not, and lists/arrays will still be validated correctly.
What do you think about these approaches?
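The HashSet -> LinkedHashSet remark can be illustrated with a minimal sketch. It is not taken from Druid, and the thread does not establish that any class behind KafkaSupervisorSpec uses a HashSet; the class and method names are hypothetical, and the sample values are borrowed from the dimensionExclusions output in the PR description.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration class; not part of Druid.
final class IterationOrderSketch {
    // Copies the names into the given set and returns them in the set's iteration order.
    static List<String> iterationOrder(Set<String> target, List<String> names) {
        target.addAll(names);
        return new ArrayList<>(target);
    }

    public static void main(String[] args) {
        List<String> names =
            List.of("__time", "value_max", "count", "value_min", "value_sum", "value");

        // HashSet: the iteration order is unspecified, so code that serializes it
        // (and a tool such as NonDex that shuffles unspecified orders) may see any order.
        System.out.println(iterationOrder(new HashSet<>(), names));

        // LinkedHashSet: the iteration order is specified to be insertion order.
        System.out.println(iterationOrder(new LinkedHashSet<>(), names));
    }
}
```

Only the second printed line is guaranteed to match the input order; the first may or may not, which is the kind of instability the review comment describes.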
If an underlying class were using HashSet, other tests in the same file would likely have failed as well, since they all use the same underlying class. The key difference with testSerdeWithInputFormat is that it wasn’t using a parser, which is probably what makes the list order deterministic in the other tests.
I tried adding the parser to this test, and while it did fix the order, I ran into an issue where Assert.assertEquals(4, spec.getDataSchema().getAggregators().length); returned a length of 0. I'm not entirely sure why, as I’m not very familiar with the Kafka-specific logic.
The sort function I created could be used as a general-purpose tool to ensure deterministic ordering across tests, if needed. In fact, most of the tests only rely on assert.contains, except for one AssertEquals. Since the sort function only arranges the list elements and doesn’t alter their content, it shouldn’t cause problems, especially since the parser itself was handling ordering in other cases.
Could you point to a test failure when this happened without instrumentation?
"The sort function I created could be used as a general-purpose tool to ensure deterministic ordering across tests"
I don't think that would be a good path forward: it reorders arrays without considering that the order may be semantically significant. It's OK to reorder if the array is, say, the argument to an IN; but if the elements are arguments to a function, they may mean a different thing in a different order.
I think to fix these things for sure, the source should be addressed rather than adding extra lenient comparison logic to the tests.
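The concern about order-blind comparison can be shown with a small sketch. The helper below is a hypothetical stand-in for a sort-before-compare utility, not the sortJsonString method from the PR, and the argument names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration class; not part of Druid or of this PR.
final class SortedComparisonSketch {
    // Order-blind check: sorts copies of both lists before comparing them.
    static boolean equalAfterSorting(List<String> a, List<String> b) {
        List<String> sortedA = new ArrayList<>(a);
        List<String> sortedB = new ArrayList<>(b);
        Collections.sort(sortedA);
        Collections.sort(sortedB);
        return sortedA.equals(sortedB);
    }

    public static void main(String[] args) {
        // Assumed example: arguments to a non-commutative function such as subtraction.
        List<String> expected = List.of("total", "discount");
        List<String> actual = List.of("discount", "total");

        System.out.println(expected.equals(actual));             // false: the order differs
        System.out.println(equalAfterSorting(expected, actual)); // true: the difference is hidden
    }
}
```

A test built on the second check would pass even though the two argument lists describe different computations.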
I'm not sure I was able to find the answers to my previous questions above:
- could you point to the field whose comparison is the root cause of these failures? are there any options there to alter the production code to be more stable?
- have you considered/evaluated making the KafkaSupervisorSpec/etc comparable, so that instead of resorting to string comparison an application-level comparison can be made?
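The second question can be sketched as follows. SpecSketch is a hypothetical stand-in, not the real KafkaSupervisorSpec, and which of its fields are order-sensitive is an assumption made only for the example.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for a spec class; not the real KafkaSupervisorSpec.
final class SpecSketch {
    private final Set<String> dimensionExclusions; // assumed: order carries no meaning
    private final List<String> aggregators;        // assumed: order is meaningful

    SpecSketch(Set<String> dimensionExclusions, List<String> aggregators) {
        this.dimensionExclusions = Set.copyOf(dimensionExclusions);
        this.aggregators = List.copyOf(aggregators);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof SpecSketch)) {
            return false;
        }
        SpecSketch that = (SpecSketch) o;
        // Set.equals ignores iteration order; List.equals does not.
        return dimensionExclusions.equals(that.dimensionExclusions)
            && aggregators.equals(that.aggregators);
    }

    @Override
    public int hashCode() {
        return Objects.hash(dimensionExclusions, aggregators);
    }

    public static void main(String[] args) {
        SpecSketch a = new SpecSketch(Set.of("__time", "count"), List.of("sum", "min"));
        SpecSketch b = new SpecSketch(
            new LinkedHashSet<>(List.of("count", "__time")), List.of("sum", "min"));
        SpecSketch c = new SpecSketch(Set.of("__time", "count"), List.of("min", "sum"));

        System.out.println(a.equals(b)); // true: only the set order differs
        System.out.println(a.equals(c)); // false: the list order differs
    }
}
```

With equality defined on the class, a round-trip test could compare the deserialized objects directly, and each field decides for itself whether order matters.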
"could you point to the field whose comparison is the root cause of these failures? are there any options there to alter the production code to be more stable?"
The output of dimensionSpec was non-deterministic because the parser wasn't added to dimensionSpec, which led to ordering issues. I've implemented a fix by adding a parser, similar to how other tests are set up, which resolves the non-deterministic ordering problem. I've updated the PR with this fix; please review it and let me know if it looks good. The array's ordering is left unchanged, and the dimension order is now deterministic.
@kgyrtkirk Do you have some progress on inspecting this? I have a similar change for org.apache.druid.indexing.kafka.supervisor.KafkaSupervisorSpecTest.testSerdeWithSpecAndInputFormat and don't know whether to add to this PR or open another PR. Thanks.
c1f8369 to c4f4670
c4f4670 to faf9929
I'm a little puzzled seeing that the [...]. Do you know how the inconsistency could happen with the non-deprecated codepath? The comparison of which field leads to the issue?
@kgyrtkirk Initially, I had resolved the issue by sorting the contents of the JSON. However, since the ordering of elements is important, I later updated the solution to align with how other tests handled ordering. The parser we were using provided a deterministic order, which ensured the test passed. Now that the parser is deprecated, one possible solution could be to avoid relying on string equality for comparisons and instead perform a direct JSON comparison. What is your opinion on this one?
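The difference between string equality and a structural comparison can be sketched with plain java.util collections standing in for a parsed JSON tree; a JSON library's tree type would play the role of the maps and lists here. The class and helper names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; maps and lists stand in for parsed JSON.
final class TreeComparisonSketch {
    // Builds a two-field "JSON object" whose keys are inserted in the given order.
    static Map<String, Object> object(String firstKey, Object firstValue,
                                      String secondKey, Object secondValue) {
        Map<String, Object> node = new LinkedHashMap<>();
        node.put(firstKey, firstValue);
        node.put(secondKey, secondValue);
        return node;
    }

    public static void main(String[] args) {
        Map<String, Object> a = object("type", "kafka", "dims", List.of("a", "b"));
        Map<String, Object> b = object("dims", List.of("a", "b"), "type", "kafka");
        Map<String, Object> c = object("type", "kafka", "dims", List.of("b", "a"));

        // Text comparison is sensitive to field order.
        System.out.println(a.toString().equals(b.toString())); // false
        // Structural comparison ignores field order...
        System.out.println(a.equals(b)); // true
        // ...but still treats array order as significant.
        System.out.println(a.equals(c)); // false
    }
}
```

Note that a structural comparison of this kind only tolerates reordered object fields. An array whose element order is unstable, like the dimensionExclusions output in the PR description, would still fail it.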
I think this is not a test-only issue; it's more that the system fails to return stable results. I think it should be tracked down how the inconsistency could happen with the non-deprecated codepath. Seems like the comparison of [...]; you will most likely be able to stabilize the test by making changes like [...]
This pull request has been marked as stale due to 60 days of inactivity.
This pull request/issue has been closed due to lack of activity. If you think that
Fixed a non-deterministic test in org.apache.druid.indexing.kafka.supervisor.KafkaSupervisorSpecTest.testSerdeWithInputFormat
Steps to reproduce
To reproduce the problem, first build the module kafka-indexing-service:
Then, run the regular test:
To identify the flaky test, execute the following nondex command:
Description
The testSerdeWithInputFormat test was producing non-deterministic output because the JSON object used in the test did not include a parser that enforced a specific order, unlike other tests. Without it, the serialized order of some elements is not guaranteed, which resulted in inconsistent outputs. The dimension spec array came out in a different order on different runs, as follows:
expected:
"dimensionExclusions":["__time","value_max","count","value_min","value_sum","value]","timestamp"]
actual:
"dimensionExclusions":["value_sum","value","count","__time","value_min","value_max]","timestamp"]
The fix involves adding a parser to the JSON, like the other tests, to ensure a consistent order.
This PR has: