zy-kkk (Member) commented Jan 13, 2025

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #45806

Problem Summary:

Previously, when BE handled a JDBC driver, it first downloaded the driver jar to a local cache directory and then gave the cached jar to the JVM to load. This caused two problems:

  1. The jar in the cache could fail to load because of duplication.
  2. Frequent repeated loading of the driver by the JVM could cause a Compressed class space OOM.

To fix these two problems, this PR makes the following changes:

  1. Remove the logic in BE that downloads the driver jar to the local cache directory, and hand the original path directly to Java's ClassLoader.
  2. Treat jars with the same name in the same path as a single jar and cache them in a map, so that repeated loading no longer causes a Compressed class space OOM.
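
The second change can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the class name `DriverClassLoaderCache`, the method `getOrCreate`, and the choice to key the map by the full driver URL string are all assumptions made for the example.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper (not the actual Doris class): keeps one ClassLoader per driver URL.
final class DriverClassLoaderCache {
    // Assumption: the full URL string identifies "the same jar in the same path".
    private static final Map<String, ClassLoader> LOADERS = new ConcurrentHashMap<>();

    private DriverClassLoaderCache() {}

    // Returns the cached loader for this URL, creating it only on first use.
    // Reusing the loader means the driver classes are defined once, not once per connection.
    static ClassLoader getOrCreate(String driverUrl) {
        return LOADERS.computeIfAbsent(driverUrl, key -> {
            try {
                URL url = URI.create(key).toURL();
                return new URLClassLoader(new URL[] {url},
                        DriverClassLoaderCache.class.getClassLoader());
            } catch (MalformedURLException e) {
                throw new IllegalArgumentException("Invalid driver URL: " + key, e);
            }
        });
    }

    // Number of distinct driver URLs seen so far.
    static int size() {
        return LOADERS.size();
    }
}
```

With a cache like this, two catalogs that point at the same driver URL share one class loader, so the class metadata held in Compressed class space does not grow with every new connection.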

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

Thearas (Contributor) commented Jan 13, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

std::string driver_path;

if (_conn_param.driver_path.find(":/") == std::string::npos) {
    driver_path = "file://" + config::jdbc_drivers_dir + "/" + _conn_param.driver_path;
Contributor

The driver path may be an absolute path such as /mnt/disk1/path/to/1.jar. How is that case handled?

Member Author

I added a path check on the FE side. The Driver URL is always sent by FE to BE, so BE is only responsible for processing it.
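
For illustration, one way such a path check might normalize the three cases raised in this thread (a bare file name, an absolute path, and a full URL) is sketched below. This is not the FE code from the PR: `DriverPathResolver`, `resolve`, and the `driversDir` parameter are hypothetical names, and treating any value containing `:/` as a complete URL simply mirrors the check in the snippet under review.

```java
// Hypothetical helper (not the actual Doris FE code): turns a configured driver path into a URL string.
final class DriverPathResolver {
    private DriverPathResolver() {}

    static String resolve(String driverPath, String driversDir) {
        // Already carries a scheme such as file:/ or https:/, so pass it through unchanged.
        if (driverPath.contains(":/")) {
            return driverPath;
        }
        // Absolute local path: only the file scheme is missing.
        if (driverPath.startsWith("/")) {
            return "file://" + driverPath;
        }
        // Bare file name: look it up under the configured drivers directory.
        return "file://" + driversDir + "/" + driverPath;
    }
}
```

Under these assumptions, `/mnt/disk1/path/to/1.jar` would become `file:///mnt/disk1/path/to/1.jar` before it is sent on, so the receiving side only ever sees a complete URL.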

@zy-kkk zy-kkk marked this pull request as ready for review February 11, 2025 09:17
zy-kkk (Member, Author) commented Feb 11, 2025

run buildall

@doris-robot

TeamCity cloud ut coverage result:
Function Coverage: 82.38% (1061/1288)
Line Coverage: 65.86% (17547/26643)
Region Coverage: 65.35% (8651/13237)
Branch Coverage: 55.22% (4662/8442)
Coverage Report: http://coverage.selectdb-in.cc/coverage/70a8f5b91c79f4083cd92b5137e25d5c91d1e47d_70a8f5b91c79f4083cd92b5137e25d5c91d1e47d_cloud/report/index.html

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 42.81% (11263/26308)
Line Coverage: 32.78% (94598/288570)
Region Coverage: 31.95% (48505/151820)
Branch Coverage: 27.82% (24462/87944)
Coverage Report: http://coverage.selectdb-in.cc/coverage/70a8f5b91c79f4083cd92b5137e25d5c91d1e47d_70a8f5b91c79f4083cd92b5137e25d5c91d1e47d/report/index.html

morningman (Contributor) left a comment

LGTM

@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved and reviewed labels Feb 12, 2025
@github-actions
Contributor

PR approved by anyone and no changes requested.

@zy-kkk zy-kkk changed the title from "[opt](jdbc catalog) Change jdbc Driver loading to Java code" to "[fix](jdbc catalog) Change jdbc Driver loading to Java code" Feb 12, 2025
@zy-kkk zy-kkk changed the title from "[fix](jdbc catalog) Change jdbc Driver loading to Java code" to "[fix](jdbc catalog) Change BE jdbc Driver loading to Java code" Feb 12, 2025
@zy-kkk zy-kkk merged commit e2ff89e into apache:master Feb 12, 2025
32 of 35 checks passed
@zy-kkk zy-kkk deleted the jdbc_driver_load branch February 12, 2025 07:55
zy-kkk added a commit to zy-kkk/doris that referenced this pull request Feb 18, 2025
…e#46912)
yiguolei pushed a commit that referenced this pull request Feb 21, 2025
lzyy2024 pushed a commit to lzyy2024/doris that referenced this pull request Feb 21, 2025
zy-kkk added a commit to zy-kkk/doris that referenced this pull request Feb 25, 2025
@yiguolei yiguolei mentioned this pull request Mar 25, 2025
@gavinchou gavinchou mentioned this pull request Apr 23, 2025
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
Labels

approved, dev/2.1.9-merged, dev/3.0.5-merged, reviewed, usercase