
KAFKA-14099 - Fix request logs in connect #12434

Merged
C0urante merged 4 commits into apache:trunk from zigarn:fix-connect-request-logs
Oct 12, 2022

Conversation

@zigarn
Contributor

@zigarn zigarn commented Jul 23, 2022

KAFKA-14099

Restore request logs in Connect. RequestLogHandler does not seem to work.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@vamossagar12
Contributor

Thanks for the changes @zigarn. The changes look fine to me. @C0urante, can you also please review? Thanks!

Contributor

@C0urante C0urante left a comment

Thanks @zigarn, and apologies for breaking this in the first place!

I've verified the change locally against the standard REST API and against the admin API, both with the same listener and with separate listeners. All requests were logged successfully.

That said, it'd still be nice to see some tests for this, to prevent the same kind of accidental regression that was introduced in #6651. One possible approach is to use a LogCaptureAppender to programmatically check on logging output during tests. This is used in #10528, which hasn't been merged yet but has been approved.

If that approach works for you, I can rebase #10528 and see about getting it merged sometime soon so that you can leverage the LogCaptureAppender class in a test of your own. If there's an easier way to test these changes, I'm happy to review that too.
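For context, the general technique behind a log-capture appender is to attach an in-memory handler to the logger under test and assert on the captured records. A minimal sketch using java.util.logging (the actual LogCaptureAppender targets log4j, so the class and method names here are illustrative only, not Kafka's API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.LogRecord;

// In-memory handler that records every published log message so a test
// can assert on the logging output afterwards.
class CapturingHandler extends Handler {
    final List<LogRecord> records = new ArrayList<>();

    @Override
    public void publish(LogRecord record) {
        records.add(record);
    }

    @Override
    public void flush() { }

    @Override
    public void close() { }
}
```

A test would attach such a handler to the request-log logger, issue a request against the REST server, and assert that a record matching the expected request line was captured.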

@zigarn
Contributor Author

zigarn commented Jul 26, 2022

Thanks @C0urante for the feedback.
No problem about the original issue; I guess not many people were impacted, since nobody reported it.

I'm more concerned about why the current approach doesn't work. I wonder whether the DefaultHandler is working either. From my understanding, it looks like context and adminContext work because they are explicitly started, but I tried to do the same with requestLogHandler with no luck.

I was wondering if there was a way to test this, so if you say there is with LogCaptureAppender, let's go for it!

@zigarn
Contributor Author

zigarn commented Jul 26, 2022

@C0urante In the meantime I added a test by copying the LogCaptureAppender from #10528.

The while loop that waits for the log entry to appear in the completion phase isn't ideal. A more robust solution would be to create a handler that performs the check in the completion phase, but that would require exposing RestServer.handlers. I'll see if something can work.
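The while-loop idea amounts to polling for a condition with a timeout. A generic sketch of that pattern (the class, method name, and parameters here are illustrative; Kafka's test utilities provide their own helper for this):

```java
import java.util.function.BooleanSupplier;

// Poll a condition until it becomes true or the timeout elapses,
// sleeping between attempts so the test doesn't busy-wait.
class WaitUtil {
    static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollIntervalMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() >= deadline) {
                return false; // timed out; the caller decides whether to fail the test
            }
            Thread.sleep(pollIntervalMs);
        }
        return true;
    }
}
```

Bounding the wait with a deadline keeps a broken assertion from hanging the build, which is the main weakness of a bare while loop.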

Contributor

@C0urante C0urante left a comment

Thanks @zigarn, nice to have tests for this! Hopefully we can get #10528 merged sometime soon. It's worth noting that there are Checkstyle failures in the most recent CI build because there are some disallowed imports; this is addressed in #10528 and should be automatically resolved if, after that PR is merged, you rebase on top of the trunk branch. If you'd like CI builds to pass before then, consider adding the changes to the checkstyle/import-control.xml file from #10528.

I have some suggestions for the tests. I agree that the while loop approach is not great and hopefully we can improve things, but it's still valuable even in its current state as a proof of concept for the log verification. Thanks again for adding that, by the way!

RE the DefaultHandler, based on the Jetty docs, I actually think it's unnecessary and we should stop using one altogether. The default handler does a few things:

  1. Serves the Jetty icon for requests to /favicon.ico. This might be nice to have but it's unlikely that anyone's depending on it in any meaningful way.
  2. Serves a 404 response with a list of known contexts for requests to /. This is unlikely to come in handy since we already define our own root resource. It might be useful for requests to the root resource for a separate admin listener, but since that feature was added after this bug was introduced, we wouldn't make things any worse by leaving out the default handler for the admin listener for now.
  3. Serves a 404 response for all other requests. We already do this today; no need to add the default handler (or any other handler) for it.

What do you think?

@zigarn
Contributor Author

zigarn commented Jul 28, 2022

Thanks for this thorough feedback @C0urante!

Fixed checkstyle/import-control.xml; I had reverted too much of it after a lot of testing and importing new packages.

About the DefaultHandler, it looks like it's currently not working anyway, and I didn't find a way to make it work. From my debugging, it looks like the context handler puts the request in the HANDLED state, and subsequent handlers are skipped. Maybe it's an issue with mixing plain handlers and context handlers? Maybe DefaultHandler should be set as an inner handler of the context handlers? I have to say it's not easy to get into Jetty's whole handler mechanism.

I'll add handling of the auto-closeable client and responses.

Contributor

@C0urante C0urante left a comment

Looking pretty good @zigarn! The new test case is super smooth now.

I made an inline comment but just wanted to reiterate that we can probably drop the DefaultHandler entirely. It hasn't had any effect since the change that broke request logging was merged, and it wasn't doing that much before then.

FWIW, I too find the Jetty API to be a little opaque. Had to run some of this stuff through a debugger to get a decent sense of the various moving parts at play.

Contributor

@C0urante C0urante left a comment

Thanks @zigarn, this is really close!

I think there might be a cleaner way to track our responses that doesn't require try-with-resource blocks within each test case. We can declare a class-level Collection<CloseableHttpResponse> responses field that gets initialized during setUp for each test case, and then during tearDown, close every response object in that collection.
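The bookkeeping described above could be sketched roughly like this, with java.io.Closeable standing in for CloseableHttpResponse (class and method names here are illustrative, not the actual test code):

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Track every response handed out during a test so tearDown can
// close them all in one place instead of per-test try-with-resources.
class ResponseTracker {
    private final List<Closeable> responses = new ArrayList<>();

    // Register a response and hand it back to the caller unchanged.
    Closeable track(Closeable response) {
        responses.add(response);
        return response;
    }

    // Called from tearDown: fail-fast, the first IOException propagates.
    void closeAll() throws IOException {
        for (Closeable r : responses) {
            r.close();
        }
    }
}
```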

We can also introduce a utility method that takes care of executing the request for us and adding the response to responses for us:

private CloseableHttpResponse executeRequest(HttpHost httpHost, HttpRequest request) throws IOException {
    CloseableHttpResponse result = httpClient.execute(httpHost, request);
    responses.add(result);
    return result;
}

or, if we want to simplify things further and obviate the need for tests to manually define an HttpHost object:

private CloseableHttpResponse executeRequest(URI uri, HttpRequest request) throws IOException {
    HttpHost host = new HttpHost(uri.getHost(), uri.getPort());
    CloseableHttpResponse result = httpClient.execute(host, request);
    responses.add(result);
    return result;
}

We can then replace all direct calls to httpClient::execute with executeRequest in our test cases.

What do you think?

Comment on lines 88 to 92
Contributor

Any reason we shouldn't fail the test if this fails? It may indicate an issue with the server if we can't close a response or a client.

Contributor Author

It's a bit more complicated to correctly close everything while still raising errors. Or we could just fail everything on the first closing issue?

I added a commit for the first option; I can go for the fail-fast option if you prefer.
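The first option can be sketched roughly like this: attempt to close every resource, remember the first failure, attach later ones as suppressed exceptions, and rethrow at the end (illustrative names, not the actual test code):

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.List;

// Close every resource even if some fail; the first IOException is
// rethrown at the end, with any later failures attached as suppressed.
class CloseUtil {
    static void closeAllCollectingErrors(List<Closeable> resources) throws IOException {
        IOException first = null;
        for (Closeable r : resources) {
            try {
                r.close();
            } catch (IOException e) {
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }
}
```

The fail-fast alternative is simply to let the first close() exception propagate, at the cost of leaving later resources unclosed.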

Contributor

@C0urante C0urante Jul 29, 2022

I think fail-fast is fine. It may cause resource leaks during test runs, but if we do our due diligence and make sure tests pass before merging changes, this should never happen on trunk or backport branches. It should only affect local runs while iterating on changes, in which case it's easier to do more targeted testing by isolating a single Gradle module, test class, or even test case, and leaked resources matter less.

Contributor Author

Done!

Contributor

@C0urante C0urante left a comment

This is looking great @zigarn, one more small comment!

Contributor

@C0urante C0urante left a comment

Thanks for your diligent work on this @zigarn, LGTM!

I'll see if I can get some more eyes on #10528, and if it gets merged, you may have to rebase this PR. If we don't hear back on that one, we might be able to merge this as-is, but I'd prefer not to have two copies of the LogCaptureAppender class in the code base at the same time if at all possible.

@zigarn
Contributor Author

zigarn commented Jul 30, 2022

@C0urante: what about using the existing Scala implementation, kafka.utils.LogCaptureAppender?
I added a commit to try it and it works. We could add some methods to the Scala implementation for easier use in Java tests (instead of doing conversions in the Java test classes); that could also benefit #10528 in reverse. WDYT?

@zigarn zigarn requested a review from C0urante July 30, 2022 10:23
@C0urante
Contributor

C0urante commented Aug 1, 2022

@zigarn I think we try to avoid adding dependencies from the core module into non-broker modules with few exceptions (such as spinning up embedded Kafka clusters for integration tests), but I can ask. I did some research and it appears the Java variant was introduced a few months before the Scala one, so it's possible that we haven't made the switch yet just because nobody's proposed it.

@C0urante
Contributor

Apologies for the delay @zigarn, and thank you for sticking with this.

I'm still not sure we should be relying on the Scala LogCaptureAppender, but I don't want to ask you to put more work into this PR (you've already done plenty!), and I don't want to block this PR on someone else re-reviewing #10528.

What I'd like to do is merge this PR, then on #10528:

  • Rebase onto the latest trunk
  • Revert the changes to the Scala LogCaptureAppender introduced in this PR
  • Update the RestServerTest to use the Java LogCaptureAppender
  • Merge KAFKA-12497: Skip periodic offset commits for failed source tasks #10528 (it's already been approved and, although I was hoping we could get another look at it, it's technically acceptable to merge it now as-is)

Is that alright with you? And again, thank you for sticking with this PR, it really is a valuable improvement and I'd like to see it merged.

@zigarn
Contributor Author

zigarn commented Oct 12, 2022

@C0urante No problem. Your scenario is fine by me.

Contributor

@C0urante C0urante left a comment

LGTM, thanks @zigarn!

@C0urante C0urante merged commit 6291412 into apache:trunk Oct 12, 2022
C0urante added a commit to C0urante/kafka that referenced this pull request Oct 12, 2022
@zigarn zigarn deleted the fix-connect-request-logs branch October 14, 2022 06:04
guozhangwang pushed a commit to guozhangwang/kafka that referenced this pull request Jan 25, 2023
Reviewers: Chris Egerton <chrise@aiven.io>
rutvijmehta-harness pushed a commit to rutvijmehta-harness/kafka that referenced this pull request Feb 9, 2024
Reviewers: Chris Egerton <chrise@aiven.io>
rutvijmehta-harness added a commit to rutvijmehta-harness/kafka that referenced this pull request Feb 9, 2024
Reviewers: Chris Egerton <chrise@aiven.io>

Co-authored-by: Alexandre Garnier <zigarn@users.noreply.github.com>