pan3793 commented Oct 15, 2024

What changes were proposed in this pull request?

This PR simplifies dependency management in the YARN module by pruning unnecessary test-scope dependencies that were pulled in from the vanilla Hadoop client.

Why are the changes needed?

Since 3.2 (SPARK-33212), Spark has used the shaded Hadoop3 client instead of the vanilla Hadoop3 client, which significantly simplifies dependency management. Some dependency hacks that worked around odd issues can now be removed to simplify the Maven/SBT configuration files.
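
For anyone auditing a module for similar leftovers, here is a minimal sketch of how test-scope dependencies could be listed from a POM. It is not part of this patch: the helper name `list_test_scoped_dependencies` and the `org.example` coordinates in the sample POM are hypothetical, and it only looks at the top-level `<dependencies>` section, ignoring profiles and `<dependencyManagement>`.

```python
import xml.etree.ElementTree as ET

# Maven POM files declare this default XML namespace.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# Hypothetical sample POM, used only to illustrate the helper below.
SAMPLE_POM = """
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency>
      <groupId>org.example</groupId>
      <artifactId>example-core</artifactId>
    </dependency>
    <dependency>
      <groupId>org.example</groupId>
      <artifactId>example-test-helper</artifactId>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>
"""


def list_test_scoped_dependencies(pom_xml: str) -> list[str]:
    """Return 'groupId:artifactId' for each top-level dependency with test scope."""
    root = ET.fromstring(pom_xml)
    found = []
    for dep in root.findall("m:dependencies/m:dependency", POM_NS):
        # Maven treats a missing <scope> as 'compile'.
        scope = dep.findtext("m:scope", default="compile", namespaces=POM_NS)
        if scope.strip() == "test":
            group = dep.findtext("m:groupId", default="", namespaces=POM_NS)
            artifact = dep.findtext("m:artifactId", default="", namespaces=POM_NS)
            found.append(f"{group.strip()}:{artifact.strip()}")
    return found


if __name__ == "__main__":
    for coordinate in list_test_scoped_dependencies(SAMPLE_POM):
        print(coordinate)
```

Each entry reported this way is only a candidate for removal; whether it is actually unused still has to be confirmed by running the module's tests.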

Does this PR introduce any user-facing change?

No.

How was this patch tested?

  • pass SBT test: build/sbt -Pyarn yarn/test
  • pass Maven test: build/mvn -Pyarn -pl :spark-yarn_2.13 clean install -DskipTests -am && build/mvn -Pyarn -pl :spark-yarn_2.13 test
  • verified no impact on runtime deps: dev/test-dependencies.sh

Was this patch authored or co-authored using generative AI tooling?

No.

scope are not transitive.-->
<dependency>
<groupId>${hive.group}</groupId>
<artifactId>hive-exec</artifactId>
Historically, HiveDelegationTokenProvider lived in the yarn module and pulled in the Hive deps. This is unnecessary now because the code has moved to the hive module.


<!--
Jersey 1 dependencies only required for YARN integration testing. Creating a YARN cluster
in the JVM requires starting a Jersey 1-based web application.
The Hadoop shaded client completely cuts off the Jersey 1 deps.

pan3793 commented Oct 15, 2024

cc @yaooqinn @LuciferYang

dongjoon-hyun left a comment

+1, LGTM. Thank you, @pan3793 .
Merged to master for Apache Spark 4.0.0.

senthh pushed a commit to acceldata-io/spark3 that referenced this pull request Dec 26, 2025
…module

Closes apache#48468 from pan3793/SPARK-49969.

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>

(cherry picked from commit 856cfe7)
basapuram-kumar pushed a commit to acceldata-io/spark3 that referenced this pull request Jan 19, 2026
* ODP-5743|[SPARK-48231][BUILD] Remove unused CodeHaus Jackson dependencies

### What changes were proposed in this pull request?

CodeHaus Jackson dependencies were pulled in from Hive. Since apache/hive#4564 (Hive 2.3.10), Hive has migrated to Jackson 2.x, so we can remove them from Spark now.

### Why are the changes needed?

Remove unused and vulnerable dependencies.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Pass GA.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#46521 from pan3793/SPARK-48231.

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: yangjie01 <yangjie01@baidu.com>

(cherry picked from commit 7916799)

* ODP-5743|[SPARK-49969][BUILD] Simplify dependency management in YARN module

Closes apache#48468 from pan3793/SPARK-49969.

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>

(cherry picked from commit 856cfe7)

* ODP-5743 - CVE - Fixing CVE-2024-47561 and CVE-2021-22569

---------

Co-authored-by: Cheng Pan <chengpan@apache.org>