Skip to content

KAFKA-7885: TopologyDescription violates equals-hashCode contract.#6210

Merged
mjsax merged 5 commits intoapache:trunkfrom
jCalamari:KAFKA-7885
Apr 15, 2020
Merged

KAFKA-7885: TopologyDescription violates equals-hashCode contract.#6210
mjsax merged 5 commits intoapache:trunkfrom
jCalamari:KAFKA-7885

Conversation

@jCalamari
Copy link
Copy Markdown
Contributor

@jCalamari jCalamari commented Jan 30, 2019

Overwrote hash code implementation in StaticTopicNameExtractor.
Also overwrote equals to pass checkstyle.
Added a unit test confirming equals-hashCode contract.

Comment thread streams/src/test/java/org/apache/kafka/streams/TopologyTest.java Outdated
@jCalamari jCalamari force-pushed the KAFKA-7885 branch 3 times, most recently from 810f7ec to 82b7136 Compare January 30, 2019 21:02
@mjsax mjsax added the streams label Feb 1, 2019
Copy link
Copy Markdown
Member

@mjsax mjsax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! And thanks for the PR!

One nit. Overall LGTM.

Call for second review @guozhangwang @bbejeck @vvcephei @ableegoldman

@vvcephei
Copy link
Copy Markdown
Contributor

vvcephei commented Feb 2, 2019

Hi @jCalamari , thanks for this PR and the Jira.

I'm in favor of the change, but I'd like to make sure I understand the bug report...

Looking at the diff here, I don't see how the prior code was in violation of the contract.
Also, the test doesn't enforce the equals-hashCode contract, just that hashcode is always the same when the extractor is created with the same topic name.

Can you elaborate, please?

Thanks,
-John

@jCalamari
Copy link
Copy Markdown
Contributor Author

jCalamari commented Feb 4, 2019

Hi John,

many thanks for reviewing the PR. This test line fails on latest apache/trunk branch:

Assert.assertEquals(description1.hashCode(), description2.hashCode());

whereas this succeeds:

Assert.assertEquals(description1, description2);

As per Java documentation, this is equlas-hashcode contract violation:

objects which are equal must have the same hashCode

Also, I don't mind changing the test to make violation more obvious, any suggestions?

Thanks,
Piotr

@vvcephei
Copy link
Copy Markdown
Contributor

vvcephei commented Feb 4, 2019

Hi @jCalamari ,

Thanks for the response!

Sorry for saying that your test doesn't test the contract. I see now that it does. I'd misread it.

It surprised me that the two description instances would be considered equal before your change, since when equals is not implemented, it falls back to identity, which two separately constructed object clearly won't have. You'd generally expect classes that don't implement equals or hashcode to trivially obey the contract by having different hashcodes and not being equal.

I dug into it, and it turns out that two TopologyDescriptions are equal if their subtopologies sets are equal, but this doesn't behave the way you'd expect. The subtopologies are TreeSets which, wind up using only the comparator, and, sadly, the Subtopology comparator implementation only considers the subtopology id.

Therefore, no only are the two topologies in your test equal, but so are any two topologies with the same number of subtopologies!

I couldn't believe it, so I tried it:

        final StreamsBuilder streamsBuilder = new StreamsBuilder();
        streamsBuilder.stream("a").to("b");
        final TopologyDescription describe = streamsBuilder.build().describe();

        final StreamsBuilder streamsBuilder1 = new StreamsBuilder();
        streamsBuilder1.stream("c").to("d");
        final TopologyDescription describe1 = streamsBuilder1.build().describe();

        final boolean equals = describe.equals(describe1);
        // they are equal!

Of course, these two topologies will still have different hashcodes, even with your patch.

@jCalamari
Copy link
Copy Markdown
Contributor Author

jCalamari commented Feb 4, 2019

Hi @vvcephei , very interesting finding!

I can think of the following solutions, based on your feedback:

  • improve subtopology comparator if possible,
  • dirty hack: reimplement TopologyDescription equals/hashCode in terms of its toString().equals/hashCode methods
  • document unusual behaviour of TopologyDescription equals/hascode,
  • ignore the problem.

@vvcephei
Copy link
Copy Markdown
Contributor

vvcephei commented Feb 4, 2019

:) This makes me wonder how many others have already walked this path and chosen option 3 ;)

I'll leave the choice between 1 and 2 up to you.

I think being able to compare topologies for equality could be useful, but it could also be misleading. For example, you can alter the function passed to a map operation, which changes the topology in a meaningful way, but doesn't alter the TopologyDescription. Nevertheless, it does still seem like a handy tool used properly. So, I think I'd be in favor of a PR to add this possibility. But on the other hand, it seems like a much bigger task than what you originally signed up for.

At the least, we should document it, so the next person doesn't fall in the same pit. Actually, we can do better than that, if we just throw an exception inside of TopologyDescription equals and/or hashcode methods.

Thanks again for bringing this up!

Copy link
Copy Markdown
Member

@bbejeck bbejeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jCalamari thanks for the patch!

Having thought on this some more, while in the strictest of terms this PR is an improvement, it makes more sense to implement the changes found here and here to make for a complete change, so maybe we should not merge until all pieces are in place.

@jCalamari
Copy link
Copy Markdown
Contributor Author

Hi @bbejeck ,

I completely agree with you, I might give it a go over this weekend. Many thanks

@mjsax
Copy link
Copy Markdown
Member

mjsax commented Feb 10, 2019

@jCalamari Can you update the Jira tickets with the new finding (or create new ones)?

@jCalamari
Copy link
Copy Markdown
Contributor Author

jCalamari commented Feb 10, 2019

Hi @vvcephei , @mjsax
I am biased towards implementing toString() based equality for TopologyDescription, demo-code:

    public final static class TopologyDescription implements org.apache.kafka.streams.TopologyDescription {

        @Override
        public boolean equals(final Object o) {
            if (this == o) {
                return true;
            }
            if (o == null || getClass() != o.getClass()) {
                return false;
            }

            final TopologyDescription that = (TopologyDescription) o;
            return this.toString().equals(that.toString());
        }

        @Override
        public int hashCode() {
            return this.toString().hashCode();
        }
}

Advantages:

  • much easier code
  • doesn't require comparing subtopologies

Disadvantages:

  • equals and hashCode depend on toString

Throwing an exception inside of TopologyDescription equals and/or hashcode methods is also an option.

@mjsax
Copy link
Copy Markdown
Member

mjsax commented Feb 13, 2019

I think we should do the "right thing". Easier code does not seem to be a good argument here.

@jCalamari
Copy link
Copy Markdown
Contributor Author

@mjsax Shall we close this PR and open new one, once I figure out the "right thing"?

@vvcephei
Copy link
Copy Markdown
Contributor

Hi @jCalamari ,

That's a clever solution, but I have to agree with @mjsax .

Another advantage of the toString approach is that, if they look the same, they are the same.

But on the disadvantage side, it's generally an anti-pattern to incorporate the output of toString() in program logic. The contract of the method is:

     * Returns a string representation of the object. In general, the
     * {@code toString} method returns a string that
     * "textually represents" this object. The result should
     * be a concise but informative representation that is easy for a
     * person to read.

I believe this is generally interpreted as meaning that toString is for humans, not programs, to read. relevant for our case here, toString may elide information that is important for equality in favor of having a "nice", readable output.

To be fair, this anti-pattern is in wide use, including in the Java core, see Duration, for example, where you're encouraged to Duration.parse(Duration.toString()). Still...

Another reason not to implement equality/hashcode with toString is that, users can trivially do the same thing on the "outside" of the library, by just using topology.describe().toString() for their comparisons instead of topology.describe().

I've previously sanity-checked the idea of throwing exceptions like UnsupportedOperationException in equals/hashcode. It doesn't violate any contracts that I can tell, but it might violate people's expectations.

One alternative (which would also resolve KAFKA-7885) is just to remove the equals/hashcode implementations. Then, TopologyDescription would fall back to the default (correct) implementation of identity equality/hashcode.

@mjsax
Copy link
Copy Markdown
Member

mjsax commented Feb 19, 2019

@jCalamari

Shall we close this PR and open new one, once I figure out the "right thing"?

I would just update this PR. There is not much to "figure out" IMHO -- we should add .hashCode() and .equals() to all classes for which we need it.

@jCalamari
Copy link
Copy Markdown
Contributor Author

jCalamari commented Feb 22, 2019

Hi @mjsax @vvcephei

I have removed default hashCode/equals implementations from TopologyDescription and run into other problems. There is bunch of failed tests in TopologyTest which relay on equality of TopologyDescription. I have replaced equality test with toString equality and found out description created by topology.describe() is not symmetric to hand-crafted topology description. The problem lays around this line: https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/InternalTopologyBuilder.java#L1349

topology.describe() goes through predecessor nodes first, whereas hand-crafted description implements natural ordering with NODE_COMPARATOR. Despite my efforts, some of these tests are still failing.

Any thoughts?

@vvcephei
Copy link
Copy Markdown
Contributor

vvcephei commented Mar 8, 2019

Hey, @jCalamari ,

That's a conundrum. It would be fine, IMHO, to modify the test logic and the Streams logic to get it to come out the same. But I understand this might not be as easy as it sounds.

I had a (probably bad) idea that in those cases, maybe we can just split the string on \n, sort the result, and just verify that the two topologies have all the same lines. It's not perfect, but maybe it's good enough until we actually implement equals/hashcode properly.

So, I tried this for the tests that are still failing:

    private void verify(final TopologyDescription actual, final InternalTopologyBuilder.TopologyDescription expected) {
        final String[] actuals = actual.toString().split("\n");
        Arrays.sort(actuals);
        final String[] expecteds = expected.toString().split("\n");
        Arrays.sort(expecteds);
        assertThat(actuals, equalTo(expecteds));
    }

Funny enough, I still get a couple of test failures, and it looks like there's either a flaw in the tests, or a bug in Streams:

Expected :["      --> none", "    Source: source (topics: topic[0-9])", "   Sub-topology: 0", "Topologies:"]
Actual   :["      --> none", "    Source: source (topics: [])", "   Sub-topology: 0", "Topologies:"]

@mjsax
Copy link
Copy Markdown
Member

mjsax commented Jun 1, 2019

@jCalamari -- Just cycling back to this PR -- any update on this? We recently changes some compare() method that a used on the TreeMap (cf 6d3ff13) -- not sure atm if this addresses the root cause.

I still think, we should not remove hashCode() and equals() from TopologyDescription and should not use toString() -- can you maybe check out the latest code and verify if the issue is still there? To test that the contract is not violated, we might want to add assertEquals(description1.hashCode(), description2.hashCode()) to each test case.

@PeterFras
Copy link
Copy Markdown

Hey @mjsax

Mind having a quick look at the PR again?

Many thanks

@mjsax
Copy link
Copy Markdown
Member

mjsax commented Apr 15, 2020

jCalamari @PeterFras Sorry that I dropped this PR. Just came across this again. LGTM. Thanks a lot for your contribution!

Btw: if this happens in the future (ie, your PR gets no attention), feel free to ping us on a weekly basis to get our attention back. Kafka is a very busy project and if you don't "fight" for attention, it can easily happen that a PR sits there for many month... :(

@mjsax mjsax merged commit f7d2b1b into apache:trunk Apr 15, 2020
Kvicii pushed a commit to Kvicii/kafka that referenced this pull request Apr 19, 2020
* 'trunk' of github.com:apache/kafka: (28 commits)
  MINOR: cleanup RocksDBStore tests  (apache#8510)
  KAFKA-9818: Fix flaky test in RecordCollectorTest (apache#8507)
  MINOR: reduce impact of trace logging in replica hot path (apache#8468)
  KAFKA-6145: KIP-441: Add test scenarios to ensure rebalance convergence (apache#8475)
  KAFKA-9881: Convert integration test to verify measurements from RocksDB to unit test (apache#8501)
  MINOR: improve test coverage for dynamic LogConfig(s) (apache#7616)
  MINOR: Switch order of sections on tumbling and hopping windows in streams doc. Tumbling windows are defined as "special case of hopping time windows" - but hopping windows currently only explained later in the docs. (apache#8505)
  KAFKA-9819: Fix flaky test in StoreChangelogReaderTest (apache#8488)
  HOTFIX: fix active task process ratio metric recording
  KAFKA-9796; Ensure broker shutdown is not stuck when Acceptor is waiting on connection queue (apache#8448)
  MINOR: Use streaming iterator with decompression buffer when building offset map (apache#8494)
  Add log message in release.py (apache#8461)
  KAFKA-9854 Re-authenticating causes mismatched parse of response (apache#8471)
  KAFKA-9838; Add log concurrency test and fix minor race condition (apache#8476)
  KAFKA-9703; Free up compression buffer after splitting a large batch
  KAFKA-9779: Add Stream system test for 2.5 release (apache#8378)
  KAFKA-7885: TopologyDescription violates equals-hashCode contract. (apache#6210)
  MINOR: KafkaApis#handleOffsetDeleteRequest does not group result correctly (apache#8485)
  HOTFIX: don't close or wipe out someone else's state (apache#8478)
  MINOR: add process(Test)Messages to the README (apache#8480)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants