Conversation

@caiconghui
Contributor

@caiconghui caiconghui commented Oct 19, 2020

Proposed changes

  1. Make some debug log settings configurable.
  2. Change some log levels from info to debug to avoid performance bottlenecks.
  3. Most log output remains compatible with the previous output except for the log level, and users can turn it off through the verbose log settings.

Types of changes

What types of changes does your code introduce to Doris?
Put an x in the boxes that apply

  • [] Bugfix (non-breaking change which fixes an issue)
  • [] New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • [] Documentation Update (if none of the other choices apply)
  • [] Code refactor (Modify the code structure, format the code, etc...)

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@caiconghui caiconghui changed the title from "Make some debug log settings configurable and change some log level from info to debug to escape performance bottlenecks" to "Make some debug log settings configurable and change some log level from info to debug to avoid performance bottlenecks" Oct 22, 2020
@kangkaisen
Contributor

Change some log levels from info to debug to avoid performance bottlenecks

Are you sure logging is the FE performance bottleneck? Did you run any tests or benchmarks?

@caiconghui
Contributor Author

caiconghui commented Oct 23, 2020

Change some log levels from info to debug to avoid performance bottlenecks

Are you sure logging is the FE performance bottleneck? Did you run any tests or benchmarks?

The heavy work in FE is still stream load and tablet scheduling, but the logging in FrontendServiceImpl makes things worse when CPU load is high. We found that all stream load requests failed at high QPS because all FE threads (the limit is 4096) were blocked on the db lock or the log lock and could not process any requests, as described in issue #4765.

I also tested the log performance on a local machine with 8 logical cores.

    // Micro-benchmark: 4000 threads each write 5000 info-level log lines.
    LOG.info("start to test bench mark for log info");
    int totalThread = 4000;
    TStreamLoadPutRequest request = new TStreamLoadPutRequest("test", "test", "test", "test", null, 1111L, null, null);
    // The latch lets the main thread wait until every worker thread has finished.
    CountDownLatch cdl = new CountDownLatch(totalThread);
    long startTime = System.currentTimeMillis();
    for (int i = 0; i < totalThread; i++) {
        new Thread(new Runnable() {
            @Override
            public void run() {
                for (int j = 0; j < 5000; j++) {
                    // One info-level log line per iteration.
                    LOG.info("receive stream load put request. db:{}, tbl: {}, txn id: {}, load id: {}, backend: {}",
                            request.getDb(), request.getTbl(), request.getTxnId(), DebugUtil.printId(request.getLoadId()),
                            "localhost");
                }
                cdl.countDown();
            }
        }).start();
    }
    cdl.await();
    long endTime = System.currentTimeMillis();
    LOG.info("finish to test bench mark log info cost {} ms", endTime - startTime);
    // Stop the process once the benchmark has finished.
    System.exit(-1);

It took about 149388 ms in total for 4000 × 5000 = 20,000,000 log calls, i.e. about 0.0075 ms of wall-clock time per call.
By the way, our FE log is heavy because of frequent stream load requests, each of which may involve many RPC requests. So we changed the log in FrontendServiceImpl from info level to debug level but kept it enabled by default, so that other users can still see it, while we can switch off the FrontendServiceImpl log, which is useless to us most of the time.
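To illustrate why lowering a hot-path message from info to debug helps, here is a minimal sketch. It assumes the JDK's java.util.logging rather than the Log4j logger used in Doris, with Level.FINE standing in for "debug"; the class name, logger name, and helper methods are hypothetical and not taken from this PR. It counts how often the message is actually built under each configured level.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

public final class DebugLevelSketch {
    // Hypothetical logger name; stands in for the FrontendServiceImpl logger.
    private static final Logger LOG = Logger.getLogger("sketch.FrontendService");

    // Counts how many times the log message was actually built.
    static final AtomicInteger BUILT = new AtomicInteger();

    // Builds the message; only worth paying for when the level is enabled.
    static String describe(String db, String tbl, long txnId) {
        BUILT.incrementAndGet();
        return "receive stream load put request. db: " + db
                + ", tbl: " + tbl + ", txn id: " + txnId;
    }

    // Simulates `requests` calls with the logger set to `configured`
    // and returns how many messages were built.
    static int handleRequests(Level configured, int requests) {
        LOG.setUseParentHandlers(false); // keep the sketch quiet: no console output
        LOG.setLevel(configured);
        BUILT.set(0);
        for (int i = 0; i < requests; i++) {
            final long txnId = i;
            // The supplier runs only if FINE ("debug") is enabled for this logger.
            LOG.fine(() -> describe("test", "test", txnId));
        }
        return BUILT.get();
    }

    public static void main(String[] args) {
        System.out.println("built at INFO: " + handleRequests(Level.INFO, 1000)); // prints 0
        System.out.println("built at FINE: " + handleRequests(Level.FINE, 1000)); // prints 1000
    }
}
```

With the logger at INFO, the debug-level call returns after a cheap level check and nothing reaches an appender; with FINE enabled, every call builds its message again, which corresponds to turning the verbose setting on for that module.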

Maybe the title is not an accurate description of this PR.

@kangkaisen
Contributor

@caiconghui I see. Thank you.

Contributor

@kangkaisen kangkaisen left a comment

+1

@kangkaisen kangkaisen added the approved Indicates a PR has been approved by one committer. label Oct 23, 2020
@kangkaisen kangkaisen merged commit a61eea3 into apache:master Oct 26, 2020
@yangzhg yangzhg mentioned this pull request Feb 9, 2021
@caiconghui caiconghui deleted the debug_log branch August 23, 2023 03:44
Labels

approved Indicates a PR has been approved by one committer. kind/performance

Development

Successfully merging this pull request may close these issues.

Log4j RollingFileAppender performance sometimes be the bottlenecks for high qps of fe thrift server

3 participants