@athanatos athanatos commented Jan 25, 2019

  • Add test for cxid rollover to 1
  • Modify ClientCnxn.SendThread.getXid() to increment from MAX to 1.
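
A minimal sketch of the wraparound described in the two bullets above, assuming a plain `int` counter guarded by `synchronized`. The class name `WrappingXidCounter` and the method `next()` are hypothetical stand-ins for illustration, not the actual `ClientCnxn` code.

```java
/** Illustrative counter only; class and method names are hypothetical. */
final class WrappingXidCounter {
    private int xid;

    WrappingXidCounter(int start) {
        this.xid = start;
    }

    /** Returns the current xid and advances, resetting to 1 instead of overflowing. */
    synchronized int next() {
        // Reset before the post-increment below could overflow into negative values.
        if (xid == Integer.MAX_VALUE) {
            xid = 1;
        }
        return xid++;
    }

    public static void main(String[] args) {
        WrappingXidCounter counter = new WrappingXidCounter(Integer.MAX_VALUE - 1);
        for (int i = 0; i < 3; i++) {
            System.out.println(counter.next());
        }
    }
}
```

In this sketch the counter hands out `Integer.MAX_VALUE - 1`, then 1, then 2. `Integer.MAX_VALUE` itself is never returned, and zero and the negative range are never reached as long as the counter starts at a positive value.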

@eolivelli
Contributor

Interesting.
Are you interested in a back port to 3.4, 3.5?

@athanatos
Author

Yeah, I've got the 3.4 one ready, 3.5 presumably also easy.

Contributor

@eolivelli eolivelli left a comment

awesome work @athanatos

I left a couple of comments, please take a look

* the server. Thus, getXid() must be public.
*/
synchronized public int getXid() {
// xid values of -4, -2, and -1 are special, see SendThread.readResponse
Contributor

What about introducing constants for these special values ? (and replace in readResponse)

Author

Where do protocol level constants like that usually get put?

Author

Also, that seems a bit intrusive for a patch I want to backport since it would also touch the server side. I suppose I could add it as an additional commit and only backport the actual fix?

Contributor

I am thinking about having constants only locally to this file, this way it will be easier to understand the code and the patch wouldn't be so intrusive.
In the (hopefully near) future we will separate client-side code from server-side code, so introducing new shared constants would be overkill.

I don't feel strongly about having such constants; it is only a thought/suggestion.

cc @anmolnar @lvfangmin @hanm

Author

I'm inclined to keep the patch minimal since grepping for those identifiers wouldn't tell you where those packets get sent. Also, the numerical values are useful to see here. I don't feel strongly about it though.

Contributor

no problem from my side, let's keep the patch simple
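
For readers following this thread, here is a rough sketch of what file-local constants for the special xid values could have looked like. The thread settled on not adding them, so this is illustration only. The class name and constant names are hypothetical; the meanings attached to -1, -2, and -4 (notification, ping, auth packet) follow my reading of the comments in `SendThread.readResponse` and should be treated as an assumption.

```java
/** Hypothetical holder for the special xid values discussed above. */
final class SpecialXids {
    // Assumed meanings, taken from comments in SendThread.readResponse.
    static final int NOTIFICATION_XID = -1;
    static final int PING_XID = -2;
    static final int AUTH_PACKET_XID = -4;

    private SpecialXids() {
    }

    /** True if the xid is one of the values a regular request must not use. */
    static boolean isSpecial(int xid) {
        return xid == NOTIFICATION_XID
                || xid == PING_XID
                || xid == AUTH_PACKET_XID;
    }

    public static void main(String[] args) {
        for (int xid = -5; xid <= 1; xid++) {
            System.out.println(xid + " special=" + isSpecial(xid));
        }
    }
}
```

The author's objection still applies to a sketch like this: named constants hide the numeric values that are useful to see at the call site.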

}
}

protected ClientCnxn createConnection(String chrootPath,
Contributor

what about overriding this method only in your new testcase?

Author

I figured that the xid manipulation methods might be generally useful. Also, createClient test helper method explicitly creates a TestableZooKeeper, so I'd have to create another version of that, I think. Doesn't seem worthwhile.

Contributor

@eolivelli eolivelli left a comment

LGTM
thank you

+1 (non binding)

@athanatos
Author

Is this waiting on something from me?

@eolivelli
Contributor

We need two committers to review and then merge.

Tagging @anmolnar

@athanatos
Author

Anything I can do to move this along?

@eolivelli
Contributor

@anmolnar can we include this in 3.5.5 and 3.4.14?

@anmolnar
Contributor

@eolivelli 3.5.5 is fine, but 3.4.14 is already cut, I wouldn't touch it anymore.

We can make an exception if it's critical, but we need at least one more committer to review.

Contributor

@nkalmar nkalmar left a comment

Nice and simple fix, good test addition.

+1, thanks!

@athanatos athanatos force-pushed the forupstream/ZOOKEEPER-3253 branch from 49e93a5 to 3cd4e8a Compare February 25, 2019 19:09
@athanatos
Author

Ok, I pushed a version that wraps from MAX to 1 avoiding negative values entirely. @phunt

Contributor

@anmolnar anmolnar left a comment

+1 lgtm

@phunt
Contributor

phunt commented Feb 27, 2019

So here's my current concern, which I have not had time yet to look at and I think should be addressed before this is committed - what happens on the server side? Is it allowing the cxid to "go back in time", if so is that ok, if not then what happens? (e.g. does it disco the client session?)

@athanatos
Author

athanatos commented Feb 27, 2019

@phunt I was unable to find anything server side that actually compares cxids other than to check equality. I believe this patch to be safe so long as two requests with the same cxid on the same session are not live at the same time -- which a 31 bit cxid is probably sufficient to ensure. Note, the current implementation already permits wrapping from MAX to MIN. This behavior occurs regularly in our environment and the only misbehavior it seems to cause is that the requests submitted with a cxid value of -4, -2, or -1 hang or cause a reconnect. The patch also includes a test to exercise the behavior specifically.
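
To make the failure mode in the comment above concrete, here is a small sketch of a counter that uses a bare post-increment, which is how a Java `int` ends up wrapping from MAX to MIN. The class `NaiveXidOverflow` and the helper `isReserved` are hypothetical names; only the reserved values -4, -2, and -1 come from the discussion.

```java
/** Illustrative only: a counter with no overflow guard. */
final class NaiveXidOverflow {
    private int xid;

    NaiveXidOverflow(int start) {
        this.xid = start;
    }

    /** Plain post-increment; Java int arithmetic silently wraps on overflow. */
    synchronized int next() {
        return xid++;
    }

    /** The xid values the thread identifies as special. */
    static boolean isReserved(int xid) {
        return xid == -4 || xid == -2 || xid == -1;
    }

    public static void main(String[] args) {
        NaiveXidOverflow atMax = new NaiveXidOverflow(Integer.MAX_VALUE);
        System.out.println(atMax.next()); // Integer.MAX_VALUE
        System.out.println(atMax.next()); // Integer.MIN_VALUE after overflow

        // Much later, the same counter walks through the reserved values.
        NaiveXidOverflow nearZero = new NaiveXidOverflow(-5);
        for (int i = 0; i < 6; i++) {
            int value = nearZero.next();
            System.out.println(value + " reserved=" + isReserved(value));
        }
    }
}
```

After the overflow, every value up to zero is negative, and the counter eventually hands out -4, -2, and -1 to ordinary requests, which matches the hang or reconnect behavior reported above.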

@athanatos athanatos force-pushed the forupstream/ZOOKEEPER-3253 branch from 3cd4e8a to 381d167 Compare February 27, 2019 17:52
@athanatos
Author

Pushed to fix the commit message -- forgot to update it before.

Contributor

@phunt phunt left a comment

+1 looks good to me with a minor nit that should be addressed. I will then commit this.

We want this to go into master; it seems like we should also commit to at least 3.5? Anyone with strong opinions re 3.4? (I would probably lean towards committing there as well.)

}
}, null);
Assert.assertTrue("setData should complete within 5s",
latch.await(5, TimeUnit.SECONDS));
Contributor

This is generally an anti-pattern in tests and leads to flaky tests, e.g. on test environments that are oversubscribed. I'd suggest something very large, e.g. 30 seconds, or just using the session timeout.
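
A rough sketch of why a generous latch timeout costs nothing when the test passes: `CountDownLatch.await` returns as soon as the latch is counted down, so the timeout only bounds the failure case. The class name, the `completesWithin` helper, and the 30 second constant are hypothetical; the callback is simulated with a sleeping thread rather than a real `setData` call.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class GenerousLatchTimeout {
    // Hypothetical value; the review suggests about 30 seconds or the session timeout.
    static final long TIMEOUT_MS = 30_000L;

    /** Simulates an async callback firing after delayMs and waits up to timeoutMs for it. */
    static boolean completesWithin(long delayMs, long timeoutMs) {
        CountDownLatch latch = new CountDownLatch(1);
        Thread callback = new Thread(() -> {
            try {
                Thread.sleep(delayMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            latch.countDown();
        });
        callback.setDaemon(true);
        callback.start();
        try {
            // Returns true as soon as countDown() happens, false only on timeout.
            return latch.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        boolean ok = completesWithin(100, TIMEOUT_MS);
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println("completed=" + ok + " elapsedMs=" + elapsedMs);
    }
}
```

A tight bound such as 5 seconds only helps when the operation is genuinely stuck, and on an oversubscribed machine it can fail a test whose callback would have arrived a moment later.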

- Add test for cxid rollover to 1
- Modify ClientCnxn.SendThread.getXid() to increment from MAX to 1.
@athanatos athanatos force-pushed the forupstream/ZOOKEEPER-3253 branch from 381d167 to 904cc30 Compare March 4, 2019 19:21
@athanatos
Author

@phunt I pushed a version with that changed. The JenkinsMaven failure doesn't appear related to this patch.

@phunt
Contributor

phunt commented Mar 6, 2019

+1, lgtm. thanks @athanatos

I also reviewed the server side code and it looks like client xid "going back in time" (my earlier concern) is fine.

I did notice this however - notice negative client xids circumvent throttling! So we're fixing this bug as well by resetting the xid before it rolls over.

    public void incrOutstandingAndCheckThrottle(RequestHeader h) {
        if (h.getXid() <= 0) {
            return;
        }
        if (zkServer.shouldThrottle(outstandingCount.incrementAndGet())) {
            disableRecv(false);
        }
    }
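
The effect of the `h.getXid() <= 0` early return can be seen in a simplified stand-in. This is a sketch under assumptions, not the server class: `ThrottleBypassSketch`, its fixed `limit`, and the `receiveDisabled` flag are hypothetical, and the real `shouldThrottle` and `disableRecv` logic is replaced by a plain comparison.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Simplified stand-in for the throttling check quoted above. */
final class ThrottleBypassSketch {
    private final AtomicInteger outstandingCount = new AtomicInteger();
    private final int limit; // hypothetical stand-in for the server's throttle threshold
    private boolean receiveDisabled;

    ThrottleBypassSketch(int limit) {
        this.limit = limit;
    }

    void incrOutstandingAndCheckThrottle(int xid) {
        // Same guard as the quoted method: non-positive xids are never counted.
        if (xid <= 0) {
            return;
        }
        if (outstandingCount.incrementAndGet() > limit) {
            receiveDisabled = true;
        }
    }

    int outstanding() {
        return outstandingCount.get();
    }

    boolean isReceiveDisabled() {
        return receiveDisabled;
    }

    public static void main(String[] args) {
        ThrottleBypassSketch server = new ThrottleBypassSketch(2);
        for (int i = 0; i < 10; i++) {
            server.incrOutstandingAndCheckThrottle(-10 - i); // negative xids
        }
        System.out.println("after negative xids: " + server.outstanding());
        for (int xid = 1; xid <= 3; xid++) {
            server.incrOutstandingAndCheckThrottle(xid);
        }
        System.out.println("after positive xids: " + server.outstanding()
                + " throttled=" + server.isReceiveDisabled());
    }
}
```

In this sketch, requests carrying negative xids never move the outstanding counter, so a client whose counter had wrapped into the negative range would never trip the limit. Keeping client xids positive closes that gap.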

@asfgit asfgit closed this in e10c93a Mar 6, 2019
asfgit pushed a commit that referenced this pull request Mar 6, 2019
- Add test for cxid rollover to 1
- Modify ClientCnxn.SendThread.getXid() to increment from MAX to 1.

Author: Samuel Just <sjust@salesforce.com>

Reviewers: phunt@apache.org

Closes #787 from athanatos/forupstream/ZOOKEEPER-3253

Change-Id: Ib3d111170bb086d6982f2cf0ee5cf8afd5157588
(cherry picked from commit e10c93a)
Signed-off-by: Patrick Hunt <phunt@apache.org>
athanatos pushed a commit to athanatos/zookeeper that referenced this pull request Mar 8, 2019
- Add test for cxid rollover to 1
- Modify ClientCnxn.SendThread.getXid() to increment from MAX to 1.

Author: Samuel Just <sjust@salesforce.com>

Reviewers: phunt@apache.org

Closes apache#787 from athanatos/forupstream/ZOOKEEPER-3253

Change-Id: Ib3d111170bb086d6982f2cf0ee5cf8afd5157588
(cherry picked from commit e10c93a)

Includes backport of createConnection testability refactor
from 9f82798.

Signed-off-by: Samuel Just <sjust@salesforce.com>
asfgit pushed a commit that referenced this pull request Mar 8, 2019
- Add test for cxid rollover to 1
- Modify ClientCnxn.SendThread.getXid() to increment from MAX to 1.

Author: Samuel Just <sjust@salesforce.com>

Reviewers: phunt@apache.org

Closes #787 from athanatos/forupstream/ZOOKEEPER-3253

Change-Id: Ib3d111170bb086d6982f2cf0ee5cf8afd5157588
(cherry picked from commit e10c93a)

Includes backport of createConnection testability refactor
from 9f82798.

Signed-off-by: Samuel Just <sjust@salesforce.com>

Author: Samuel Just <sjust@salesforce.com>

Reviewers: phunt@apache.org

Closes #844 from athanatos/forupstream/ZOOKEEPER-3235-3.4