
Fixed the non deterministic test testSerdeWithInputFormat #17325

Closed
yugvajani wants to merge 1 commit into apache:master from yugvajani:fix-flaky-test

Conversation

@yugvajani

@yugvajani yugvajani commented Oct 10, 2024

Fixed a non-deterministic test in org.apache.druid.indexing.kafka.supervisor.KafkaSupervisorSpecTest.testSerdeWithInputFormat.

Steps to reproduce

To reproduce the problem, first build the module kafka-indexing-service:

mvn install -pl extensions-core/kafka-indexing-service -am -DskipTests

Then, run the regular test:

mvn -pl extensions-core/kafka-indexing-service test -Dtest=org.apache.druid.indexing.kafka.supervisor.KafkaSupervisorSpecTest#testSerdeWithInputFormat

To confirm the flakiness, execute the following NonDex command:

mvn -pl extensions-core/kafka-indexing-service edu.illinois:nondex-maven-plugin:2.1.7:nondex -Dtest=org.apache.druid.indexing.kafka.supervisor.KafkaSupervisorSpecTest#testSerdeWithInputFormat

Description

testSerdeWithInputFormat produced non-deterministic output because, unlike the other tests in the file, the JSON object it used did not include a parser that enforces a specific order. Since nothing fixed the element order during serialization, the outputs were inconsistent: the dimension exclusions array came out in a different order on different runs, for example
expected: "dimensionExclusions":["__time","value_max","count","value_min","value_sum","value","timestamp"]
actual:   "dimensionExclusions":["value_sum","value","count","__time","value_min","value_max","timestamp"]

The fix adds a parser to the JSON, as the other tests do, to ensure a consistent order.



This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

KafkaSupervisorSpec spec2 = mapper.readValue(serialized, KafkaSupervisorSpec.class);

String stable = mapper.writeValueAsString(spec2);
String sortedStableJson = sortJsonString(stable);
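The PR's actual sortJsonString helper is not shown in this excerpt. As a rough, hypothetical illustration of the idea it implements (recursively sorting array elements so a subsequent string comparison becomes order-insensitive), here is a simplified stdlib-only sketch that operates on a Map/List tree rather than on JSON text:

```java
import java.util.*;

// Hypothetical, simplified model of a "sort all JSON arrays" helper.
// The real helper in the PR works on JSON text; this version operates on
// the equivalent Map/List tree purely for illustration.
public class JsonSorter {
    @SuppressWarnings("unchecked")
    public static Object sortArrays(Object node) {
        if (node instanceof Map) {
            // Recurse into object values, keeping key order untouched.
            Map<String, Object> sorted = new LinkedHashMap<>();
            for (Map.Entry<String, Object> e : ((Map<String, Object>) node).entrySet()) {
                sorted.put(e.getKey(), sortArrays(e.getValue()));
            }
            return sorted;
        }
        if (node instanceof List) {
            // Recurse into array elements, then sort them lexicographically.
            List<Object> out = new ArrayList<>();
            for (Object child : (List<Object>) node) {
                out.add(sortArrays(child));
            }
            out.sort(Comparator.comparing(String::valueOf));
            return out;
        }
        return node; // scalar: leave as-is
    }

    public static void main(String[] args) {
        Map<String, Object> spec = new LinkedHashMap<>();
        spec.put("dimensionExclusions", new ArrayList<>(List.of("value_sum", "__time", "count")));
        System.out.println(sortArrays(spec));
        // prints {dimensionExclusions=[__time, count, value_sum]}
    }
}
```

Note that, as the review below points out, sorting every array unconditionally makes the comparison lenient: it would also hide a real bug in a field where element order is semantically significant.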
Member

note: this seems like an unconditional re-sort at the JSON level for only this test; but I wonder if it only affects this one.
From the repro steps: NonDex has detected that the output here is unstable.

I don't know the Kafka part, but there might be lists/arrays for which order does matter; sorting all parts of a JSON unconditionally may possibly hide issues.

I wonder whether what causes these mismatches was investigated. I suspect that some underlying class may use some unstable constructs; in most cases these used to boil down to a HashSet -> LinkedHashSet change...

Another possible way to fix this in a wider scope is to make KafkaSupervisorSpec comparable and do the validation at that level, so that the real classes can decide whether two maps/sets are equal or not, and lists/arrays will still be validated correctly.

What do you think about these approaches?

Author

If an underlying class were using HashSet, other tests in the same file would likely have failed as well, since they all use the same underlying class. The key difference with testSerdeWithInputFormat is that it wasn’t using a parser, which is probably what makes the list order deterministic in the other tests.

I tried adding the parser to this test, and while it did fix the order, I ran into an issue where Assert.assertEquals(4, spec.getDataSchema().getAggregators().length); returned a length of 0. I'm not entirely sure why, as I’m not very familiar with the Kafka-specific logic.

The sort function I created could be used as a general-purpose tool to ensure deterministic ordering across tests, if needed. In fact, most of the tests only rely on contains-style assertions, except for one assertEquals. Since the sort function only rearranges the list elements and doesn't alter their content, it shouldn't cause problems, especially since the parser itself was handling ordering in the other cases.

Member

Could you point to a test failure where this happened without instrumentation?

The sort function I created could be used as a general-purpose tool to ensure deterministic ordering across tests

I don't think that would be a good path forward: it reorders arrays without considering that order may be semantically significant. It's OK to reorder if the array is, say, the argument to an IN; but if the elements are arguments to a function, they may mean a different thing in a different order.

I think to fix these things for sure, the source should be addressed, rather than adding extra lenient comparison logic to the tests.

I'm not sure I was able to find the answers to my previous questions above:

  • could you point to the field whose comparison is the root cause of these failures? are there any options there to alter the production code to be more stable?
  • have you considered/evaluated making KafkaSupervisorSpec etc. comparable, so that instead of resorting to string comparison an application-level comparison can be made?
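The distinction behind the second suggestion can be sketched with plain java.util (a toy illustration, not the actual Druid classes): Map.equals ignores entry order while List.equals does not, which is exactly the behavior an application-level comparison of the spec could encode, where a string comparison of the serialized form cannot.

```java
import java.util.*;

public class StructuralComparison {
    public static void main(String[] args) {
        // Two "serializations" of the same data that differ only in map-entry order.
        Map<String, Object> a = new LinkedHashMap<>();
        a.put("topic", "metrics");
        a.put("useEarliestOffset", false);

        Map<String, Object> b = new LinkedHashMap<>();
        b.put("useEarliestOffset", false);
        b.put("topic", "metrics");

        // String comparison is order-sensitive and flags these as different...
        System.out.println(a.toString().equals(b.toString())); // false
        // ...while structural (Map) comparison is not.
        System.out.println(a.equals(b)); // true

        // Lists, by contrast, stay order-sensitive, so genuinely ordered
        // fields would still be validated correctly.
        System.out.println(List.of("x", "y").equals(List.of("y", "x"))); // false
    }
}
```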

Author

could you point to the field whose comparison is the root cause of these failures? are there any options there to alter the production code to be more stable? - The output of dimensionSpec was non-deterministic because no parser was attached to it, which led to ordering issues. I've implemented a fix by adding a parser, similar to how the other tests are set up, which resolves the non-deterministic ordering problem. I've updated the PR with this fix; please review it and let me know if it looks good. The array's ordering itself remains unchanged, but the dimension order is now deterministic.

Author

@kgyrtkirk Have you made any progress inspecting this? I have a similar change for org.apache.druid.indexing.kafka.supervisor.KafkaSupervisorSpecTest.testSerdeWithSpecAndInputFormat and don't know whether to add it to this PR or open another one. Thanks.

@yugvajani changed the title from "Fixed the flaky test testSerdeWithInputFormat by sorting the result" to "Fixed the non deterministic test testSerdeWithInputFormat" on Oct 28, 2024
@kgyrtkirk
Member

I'm a little puzzled seeing that the parser seems to have been deprecated; and the PR which introduced timestampSpec was the one deprecating it, in #8823.
If the fix were to move from a deprecated construct to a non-deprecated one I wouldn't have had any questions... but it's the other way around.

Do you know how the inconsistency could happen with the non-deprecated codepath? The comparison of which field leads to the issue?
I still feel like this just masks the issue rather than stabilizing the non-deprecated system to be "stable", so that it produces the same results with different Java versions etc.

@yugvajani
Author

yugvajani commented Dec 16, 2024

@kgyrtkirk Initially, I had resolved the issue by sorting the contents of the JSON. However, since the ordering of elements is important, I later updated the solution to align with how other tests handled ordering. The parser we were using provided a deterministic order, which ensured the test passed.

Now that the parser is deprecated, one possible solution could be to avoid relying on string equality for comparisons and instead perform a direct JSON comparison. What is your opinion on this one?

@kgyrtkirk
Member

I think this is not a test-only issue; it's more that the system fails to return stable results.

I think it should be tracked down how the inconsistency can happen with the non-deprecated codepath.

It seems like the comparison of dimensionExclusions fails? Where is that field declared? What typed object does it get?

You will most likely be able to stabilize the test by making changes like HashSet to LinkedHashSet or something similar; you may also need to update the test, but after that it will not happen anymore.
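The HashSet -> LinkedHashSet change suggested here can be demonstrated with plain java.util (a generic sketch, not the actual Druid field): LinkedHashSet guarantees insertion-order iteration, so serializing it yields the same element order on every run, while HashSet makes no ordering guarantee at all (and NonDex deliberately shuffles it to expose tests that depend on one).

```java
import java.util.*;

public class SetOrderDemo {
    public static void main(String[] args) {
        List<String> exclusions = List.of(
            "__time", "value_max", "count", "value_min", "value_sum", "value", "timestamp");

        // HashSet: iteration order depends on hash codes and capacity;
        // nothing in the contract pins it down.
        Set<String> unstable = new HashSet<>(exclusions);

        // LinkedHashSet: iterates in insertion order, so any serialization
        // of it is reproducible across runs and JVM versions.
        Set<String> stable = new LinkedHashSet<>(exclusions);

        System.out.println(new ArrayList<>(stable).equals(exclusions)); // true
        System.out.println(unstable); // some permutation of the same elements
    }
}
```

Both sets are equal as sets; only the iteration order differs, which is why the change is invisible to callers but stabilizes string-level comparisons of the serialized form.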

@github-actions

This pull request has been marked as stale due to 60 days of inactivity.
It will be closed in 4 weeks if no further activity occurs. If you think
that's incorrect or this pull request should instead be reviewed, please simply
write any comment. Even if closed, you can still revive the PR at any time or
discuss it on the dev@druid.apache.org list.
Thank you for your contributions.

@github-actions github-actions Bot added the stale label Feb 16, 2025
@github-actions

This pull request/issue has been closed due to lack of activity. If you think that
is incorrect, or the pull request requires review, you can revive the PR at any time.

@github-actions github-actions Bot closed this Mar 16, 2025
