@edwardcapriolo
Contributor

Description of PR

The Linux container executor isn't portable to Alpine Linux because of the code that retrieves the passwd info for local users.

https://github.com/edwardcapriolo/hadoop/pull/new/YARN-11919

How was this patch tested?

Unit tests and direct calls to lce under specific conditions.

For code changes:

  • [Y] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • [na] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • [na] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • [na] If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

AI Tooling

AI was not used

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 13m 55s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
-1 ❌ mvninstall 1m 31s /branch-mvninstall-root.txt root in trunk failed.
+1 💚 compile 2m 9s trunk passed
+1 💚 mvnsite 0m 39s trunk passed
-1 ❌ shadedclient 5m 24s branch has errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 32s the patch passed
+1 💚 compile 0m 54s the patch passed
+1 💚 cc 0m 54s the patch passed
+1 💚 golang 0m 54s the patch passed
+1 💚 javac 0m 54s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 mvnsite 0m 32s the patch passed
-1 ❌ shadedclient 1m 30s patch has errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 23m 45s /patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt hadoop-yarn-server-nodemanager in the patch failed.
+1 💚 asflicense 0m 34s The patch does not generate ASF License warnings.
48m 12s
Subsystem Report/Notes
Docker ClientAPI=1.52 ServerAPI=1.52 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8177/1/artifact/out/Dockerfile
GITHUB PR #8177
Optional Tests dupname asflicense compile cc mvnsite javac unit codespell detsecrets golang
uname Linux 366f9d3fe05d 5.15.0-164-generic #174-Ubuntu SMP Fri Nov 14 20:25:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / b02531c
Default Java Red Hat, Inc.-1.8.0_472-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8177/1/testReport/
Max. process+thread count 177 (vs. ulimit of 5500)
modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8177/1/console
versions git=2.43.7 maven=3.9.11
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@cnauroth
Contributor

I retriggered pre-submit now that we have the build fix in #8166 committed.

https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8177/2/

Contributor

@cnauroth cnauroth left a comment

@edwardcapriolo , this is interesting. Thanks for reporting it. Do you know if this is because sysconf(_SC_GETPW_R_SIZE_MAX) doesn't give a defined max buffer size on Alpine? If so, then I guess every other OS we've tried in the past has it defined and we didn't know.

@edwardcapriolo
Contributor Author

edwardcapriolo commented Jan 12, 2026

@cnauroth I am not much of a C expert. As you know, it is not Java-like: instead of stack traces you get a segfault. I did try to use a debugger, but building the code for debugging and running GDB in a container got complicated. I am quickly out of my depth.

My thought is that the math we are doing to compute the buffer size isn't in accordance with the documentation I found. If you compare the two pieces of code you can see why. It could also be some other issue, possibly that the container is Alpine but my host is Fedora. I don't know what happens to the sysconf call in that case.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 14m 19s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 💚 mvninstall 23m 3s trunk passed
+1 💚 compile 1m 13s trunk passed with JDK Red Hat, Inc.-21.0.9+10-LTS
+1 💚 compile 1m 17s trunk passed with JDK Red Hat, Inc.-17.0.17+10-LTS
+1 💚 mvnsite 0m 54s trunk passed
+1 💚 shadedclient 41m 49s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 34s the patch passed
+1 💚 compile 0m 58s the patch passed with JDK Red Hat, Inc.-21.0.9+10-LTS
+1 💚 cc 0m 58s the patch passed
+1 💚 golang 0m 58s the patch passed
+1 💚 javac 0m 58s the patch passed
+1 💚 compile 0m 58s the patch passed with JDK Red Hat, Inc.-17.0.17+10-LTS
+1 💚 cc 0m 58s the patch passed
+1 💚 golang 0m 58s the patch passed
+1 💚 javac 0m 58s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 mvnsite 0m 37s the patch passed
+1 💚 shadedclient 15m 22s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 23m 2s /patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt hadoop-yarn-server-nodemanager in the patch failed.
+1 💚 asflicense 0m 39s The patch does not generate ASF License warnings.
99m 41s
Subsystem Report/Notes
Docker ClientAPI=1.52 ServerAPI=1.52 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8177/2/artifact/out/Dockerfile
GITHUB PR #8177
Optional Tests dupname asflicense compile cc mvnsite javac unit codespell detsecrets golang
uname Linux 1c0b66cf46ae 5.15.0-164-generic #174-Ubuntu SMP Fri Nov 14 20:25:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / b02531c
Default Java Red Hat, Inc.-17.0.17+10-LTS
Multi-JDK versions /usr/lib/jvm/java-21-openjdk-21.0.9.0.10-1.el8.x86_64:Red Hat, Inc.-21.0.9+10-LTS /usr/lib/jvm/java-17-openjdk-17.0.17.0.10-1.el8.x86_64:Red Hat, Inc.-17.0.17+10-LTS
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8177/2/testReport/
Max. process+thread count 633 (vs. ulimit of 5500)
modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8177/2/console
versions git=2.43.7 maven=3.9.11
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@edwardcapriolo
Contributor Author

Is there any way to get more details out of this:

[INFO] --- hadoops:3.5.0-SNAPSHOT:cmake-test (test-container-executor) @ hadoop-yarn-server-nodemanager ---
[INFO] -------------------------------------------------------
[INFO]  C M A K E B U I L D E R    T E S T
[INFO] -------------------------------------------------------
[INFO] test-container-executor: running /home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-8177/rockylinux-8/src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/native/target/usr/local/bin/test-container-executor
[INFO] with extra environment variables {}
[INFO] STATUS: ERROR CODE 1 after 18 millisecond(s).
[INFO] -------------------------------------------------------
[INFO] ------------------------------------------------------------------------

@edwardcapriolo
Contributor Author

Ok I have it sorted:

https://man7.org/linux/man-pages/man3/getpwnam.3p.html

   Note that sysconf(_SC_GETPW_R_SIZE_MAX) may return -1 if there is
   no hard limit on the size of the buffer needed to store all the
   groups returned. This example shows how an application can
   allocate a buffer of sufficient size to work with getpwnam_r().

Both examples are wrong. You need to detect the ERANGE error and grow the buffer. Yikes.

    while ((e = getpwnam_r("someuser", &result, buffer, len, &resultp))
               == ERANGE)
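A minimal sketch of that retry pattern, written as a standalone helper rather than the code in this patch. The name `lookup_user`, the 1024-byte fallback and the `MAX_PW_BUF` cap are assumptions made for illustration. It takes its initial size from sysconf(_SC_GETPW_R_SIZE_MAX) when a value is available and doubles the buffer each time getpwnam_r() reports ERANGE.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pwd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Illustrative upper bound so a misbehaving lookup cannot grow forever. */
#define MAX_PW_BUF ((size_t) 1 << 20)

/*
 * Hypothetical helper (not from the patch): looks up `name`, growing the
 * scratch buffer while getpwnam_r() reports ERANGE.
 * Returns 0 if the user was found, ENOENT if the call succeeded but no
 * entry matched, or another errno value on failure.
 * On success the caller owns *buf_out and must free() it; the string
 * fields of *pwd point into that buffer.
 */
int lookup_user(const char *name, struct passwd *pwd, char **buf_out) {
  assert(name != NULL && pwd != NULL && buf_out != NULL);
  *buf_out = NULL;

  /* sysconf() may return -1 when there is no fixed limit, so fall back. */
  long hint = sysconf(_SC_GETPW_R_SIZE_MAX);
  size_t bufsize = (hint > 0) ? (size_t) hint : 1024;

  char *buf = malloc(bufsize);
  if (buf == NULL) {
    return ENOMEM;
  }

  struct passwd *result = NULL;
  int rc;
  while ((rc = getpwnam_r(name, pwd, buf, bufsize, &result)) == ERANGE
         && bufsize < MAX_PW_BUF) {
    /* Buffer too small: double it and try again. */
    bufsize *= 2;
    char *bigger = realloc(buf, bufsize);
    if (bigger == NULL) {
      free(buf);
      return ENOMEM;
    }
    buf = bigger;
  }

  if (rc != 0 || result == NULL) {
    free(buf);
    return (rc != 0) ? rc : ENOENT;
  }
  *buf_out = buf;
  return 0;
}
```

A caller would declare a `struct passwd` and a `char *`, call `lookup_user("someuser", &pwd, &buf)`, read the fields it needs, and then `free(buf)`. After that free, the string pointers inside the struct must not be used.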

@edwardcapriolo
Contributor Author

Also, I'm not sure the way the code is written is correct. I think it only works via a quirk.

src/main/native/container-executor/impl/container-executor.c

//struct to store the user details
struct passwd *user_detail = NULL;

The documentation is not super clear, but the passwd struct has pointers into the buffer, and some docs seem to indicate the pointers might be reused or cleared in later calls.

@edwardcapriolo
Contributor Author

edwardcapriolo commented Jan 13, 2026

@cnauroth

Interestingly, what I believe I have found is that you must deep copy these objects. What I observe is that repeated calls to the method change the objects. Something about the buffer sizing of the old code wasn't hitting the edge case.

For some confirmation I asked Google: "multiple calls to getpwnam_r to create array of users".
It provided code that deep-copied the objects, along with this explanation:

Key Concepts

  • Reentrancy: getpwnam_r() is thread-safe because the caller provides the memory (buffer) where the data is stored.
  • Persistent Storage: To build an array, you cannot just store pointers to the temporary pwd struct used inside the loop. You must make a deep copy of the data (the UserEntry struct and the strings within it) into a new, stable memory location that persists across loop iterations.
  • Error Handling: Check the return value (status) and the result pointer to differentiate between an error condition and a "user not found" scenario.
  • Memory Management: Dynamic memory allocation (malloc, strdup) requires explicit deallocation using free to prevent memory leaks.

This is exactly the conclusion I had come to: we need to deep copy this object, because the second call to the method alters the state of the first struct. I should have a fix in the next day or so.
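A rough sketch of what such a deep copy can look like. The struct name `owned_passwd`, its field selection and both function names are assumptions for illustration and are not the layout used in the patch. Each string is duplicated with strdup() so the copy no longer depends on the lookup buffer.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pwd.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Illustrative owned copy holding only a few fields of struct passwd. */
struct owned_passwd {
  char *pw_name;
  char *pw_dir;
  uid_t pw_uid;
  gid_t pw_gid;
};

/*
 * Copies the selected fields of `src` into `dest`, duplicating strings.
 * Returns 0 on success, -1 if an allocation failed (dest strings are NULL).
 */
int owned_passwd_copy(const struct passwd *src, struct owned_passwd *dest) {
  assert(src != NULL && dest != NULL);
  dest->pw_name = src->pw_name ? strdup(src->pw_name) : NULL;
  dest->pw_dir = src->pw_dir ? strdup(src->pw_dir) : NULL;
  dest->pw_uid = src->pw_uid;  /* plain values can simply be assigned */
  dest->pw_gid = src->pw_gid;

  if ((src->pw_name && !dest->pw_name) || (src->pw_dir && !dest->pw_dir)) {
    free(dest->pw_name);
    free(dest->pw_dir);
    dest->pw_name = NULL;
    dest->pw_dir = NULL;
    return -1;
  }
  return 0;
}

/* Releases the duplicated strings; safe to call on an already-freed copy. */
void owned_passwd_free(struct owned_passwd *p) {
  if (p == NULL) {
    return;
  }
  free(p->pw_name);
  free(p->pw_dir);
  p->pw_name = NULL;
  p->pw_dir = NULL;
}
```

Once the copy exists, the buffer that was handed to getpwnam_r() can be reused or freed without affecting the copied strings.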

@edwardcapriolo
Contributor Author

edwardcapriolo commented Jan 14, 2026

Here is my final summary of the issue. IMHO the code as it is in master works, amazingly, only in limited contexts. Here is why:

while ((s = getpwnam_r(user, &pwd, buf, bufsize, &result)) == ERANGE){

This is the proper way to use this method. It may return ERANGE, which means the buffer is not big enough and you need to grow it and try again.

Next, the big problem: the passwd struct has pointers into buffers that can be recycled by other calls to getpwnam_r. So the global object could be corrupted by further calls.

 //struct to store the user details
-struct passwd *user_detail = NULL;
+struct serialized_passwd *user_detail = NULL;

This was effectively the root error I originally observed when I tried to take this to Alpine. The implementation of passwd there is sufficiently different that it exposed the problem above.
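To make the pointer relationship concrete, here is a small check, offered as a sketch under assumptions and not as part of the patch. The helper names `points_into` and `name_lives_in_buffer` and the fixed 4096-byte buffer are illustrative. It asks whether the pw_name returned by getpwnam_r() lies inside the caller's buffer, which is what makes a long-lived struct passwd unsafe once that buffer is freed or reused.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pwd.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Returns 1 if p points inside [buf, buf + size), otherwise 0. */
int points_into(const char *buf, size_t size, const char *p) {
  uintptr_t start = (uintptr_t) buf;
  uintptr_t addr = (uintptr_t) p;
  return p != NULL && addr >= start && addr < start + size;
}

/*
 * Looks up `user` with a fixed-size scratch buffer (illustrative only; a
 * real caller should retry on ERANGE).
 * Returns 1 if the entry was found and pw_name is stored in the buffer,
 * 0 if it was found but pw_name is stored elsewhere,
 * -1 if the lookup failed or the user does not exist.
 */
int name_lives_in_buffer(const char *user) {
  char buf[4096];
  struct passwd pwd;
  struct passwd *result = NULL;

  int rc = getpwnam_r(user, &pwd, buf, sizeof buf, &result);
  if (rc != 0 || result == NULL) {
    return -1;
  }
  return points_into(buf, sizeof buf, pwd.pw_name);
}
```

If the check reports that the name lives in the buffer, any struct passwd kept after the buffer goes away holds dangling pointers, whichever libc is in use.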

Please review, @cnauroth and anyone else who is skilled with C/C++. Thanks.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 43s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 23m 15s trunk passed
+1 💚 compile 1m 11s trunk passed with JDK Red Hat, Inc.-21.0.9+10-LTS
+1 💚 compile 1m 15s trunk passed with JDK Red Hat, Inc.-17.0.17+10-LTS
+1 💚 mvnsite 0m 54s trunk passed
+1 💚 shadedclient 41m 58s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 35s the patch passed
+1 💚 compile 0m 58s the patch passed with JDK Red Hat, Inc.-21.0.9+10-LTS
+1 💚 cc 0m 58s the patch passed
+1 💚 golang 0m 58s the patch passed
+1 💚 javac 0m 58s the patch passed
+1 💚 compile 0m 57s the patch passed with JDK Red Hat, Inc.-17.0.17+10-LTS
+1 💚 cc 0m 57s the patch passed
+1 💚 golang 0m 57s the patch passed
+1 💚 javac 0m 57s the patch passed
-1 ❌ blanks 0m 0s /blanks-eol.txt The patch has 2 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
+1 💚 mvnsite 0m 38s the patch passed
+1 💚 shadedclient 15m 34s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 23m 6s hadoop-yarn-server-nodemanager in the patch passed.
+1 💚 asflicense 0m 40s The patch does not generate ASF License warnings.
86m 13s
Subsystem Report/Notes
Docker ClientAPI=1.52 ServerAPI=1.52 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8177/3/artifact/out/Dockerfile
GITHUB PR #8177
Optional Tests dupname asflicense compile cc mvnsite javac unit codespell detsecrets golang
uname Linux 2f68d2249668 5.15.0-164-generic #174-Ubuntu SMP Fri Nov 14 20:25:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / bdb5d65
Default Java Red Hat, Inc.-17.0.17+10-LTS
Multi-JDK versions /usr/lib/jvm/java-21-openjdk-21.0.9.0.10-1.el8.x86_64:Red Hat, Inc.-21.0.9+10-LTS /usr/lib/jvm/java-17-openjdk-17.0.17.0.10-1.el8.x86_64:Red Hat, Inc.-17.0.17+10-LTS
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8177/3/testReport/
Max. process+thread count 630 (vs. ulimit of 5500)
modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8177/3/console
versions git=2.43.7 maven=3.9.11
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 42s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 24m 22s trunk passed
+1 💚 compile 1m 13s trunk passed with JDK Red Hat, Inc.-21.0.9+10-LTS
+1 💚 compile 1m 13s trunk passed with JDK Red Hat, Inc.-17.0.17+10-LTS
+1 💚 mvnsite 0m 56s trunk passed
+1 💚 shadedclient 43m 14s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 34s the patch passed
+1 💚 compile 0m 56s the patch passed with JDK Red Hat, Inc.-21.0.9+10-LTS
+1 💚 cc 0m 56s the patch passed
+1 💚 golang 0m 56s the patch passed
+1 💚 javac 0m 56s the patch passed
+1 💚 compile 0m 59s the patch passed with JDK Red Hat, Inc.-17.0.17+10-LTS
+1 💚 cc 0m 59s the patch passed
+1 💚 golang 0m 59s the patch passed
+1 💚 javac 0m 59s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 mvnsite 0m 38s the patch passed
+1 💚 shadedclient 15m 24s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 23m 12s hadoop-yarn-server-nodemanager in the patch passed.
+1 💚 asflicense 0m 38s The patch does not generate ASF License warnings.
87m 23s
Subsystem Report/Notes
Docker ClientAPI=1.52 ServerAPI=1.52 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8177/4/artifact/out/Dockerfile
GITHUB PR #8177
Optional Tests dupname asflicense compile cc mvnsite javac unit codespell detsecrets golang
uname Linux 7c5632d58f27 5.15.0-164-generic #174-Ubuntu SMP Fri Nov 14 20:25:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / f730460
Default Java Red Hat, Inc.-17.0.17+10-LTS
Multi-JDK versions /usr/lib/jvm/java-21-openjdk-21.0.9.0.10-1.el8.x86_64:Red Hat, Inc.-21.0.9+10-LTS /usr/lib/jvm/java-17-openjdk-17.0.17.0.10-1.el8.x86_64:Red Hat, Inc.-17.0.17+10-LTS
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8177/4/testReport/
Max. process+thread count 722 (vs. ulimit of 5500)
modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8177/4/console
versions git=2.43.7 maven=3.9.11
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 12m 59s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 14m 55s trunk passed
+1 💚 compile 0m 45s trunk passed
+1 💚 mvnsite 0m 27s trunk passed
+1 💚 shadedclient 30m 1s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 23s the patch passed
+1 💚 compile 0m 40s the patch passed
+1 💚 cc 0m 40s the patch passed
+1 💚 golang 0m 40s the patch passed
+1 💚 javac 0m 40s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 mvnsite 0m 23s the patch passed
+1 💚 shadedclient 14m 19s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 22m 35s hadoop-yarn-server-nodemanager in the patch passed.
+1 💚 asflicense 0m 19s The patch does not generate ASF License warnings.
82m 47s
Subsystem Report/Notes
Docker ClientAPI=1.52 ServerAPI=1.52 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8177/4/artifact/out/Dockerfile
GITHUB PR #8177
Optional Tests dupname asflicense compile cc mvnsite javac unit codespell detsecrets golang
uname Linux a79823720ea9 5.15.0-164-generic #174-Ubuntu SMP Fri Nov 14 20:25:16 UTC 2025 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / f730460
Default Java Debian-17.0.17+10-Debian-1deb11u1
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8177/4/testReport/
Max. process+thread count 668 (vs. ulimit of 5500)
modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8177/4/console
versions git=2.30.2 maven=3.9.11
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 44s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 14m 34s trunk passed
+1 💚 compile 0m 50s trunk passed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 51s trunk passed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
+1 💚 mvnsite 0m 27s trunk passed
+1 💚 shadedclient 30m 56s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 26s the patch passed
+1 💚 compile 0m 46s the patch passed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04
+1 💚 cc 0m 46s the patch passed
+1 💚 golang 0m 46s the patch passed
+1 💚 javac 0m 46s the patch passed
+1 💚 compile 0m 47s the patch passed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
+1 💚 cc 0m 47s the patch passed
+1 💚 golang 0m 47s the patch passed
+1 💚 javac 0m 47s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 mvnsite 0m 25s the patch passed
+1 💚 shadedclient 14m 17s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 21m 49s hadoop-yarn-server-nodemanager in the patch passed.
+1 💚 asflicense 0m 20s The patch does not generate ASF License warnings.
71m 38s
Subsystem Report/Notes
Docker ClientAPI=1.52 ServerAPI=1.52 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8177/4/artifact/out/Dockerfile
GITHUB PR #8177
Optional Tests dupname asflicense compile cc mvnsite javac unit codespell detsecrets golang
uname Linux 17a3fba2aa7b 5.15.0-164-generic #174-Ubuntu SMP Fri Nov 14 20:25:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / f730460
Default Java Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
Multi-JDK versions /usr/lib/jvm/java-21-openjdk-amd64:Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-17-openjdk-amd64:Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8177/4/testReport/
Max. process+thread count 619 (vs. ulimit of 5500)
modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8177/4/console
versions git=2.25.1 maven=3.9.11
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@edwardcapriolo
Contributor Author

@cnauroth A reminder on this one. I am digging through the C code quite intensely, and I believe I have found another C pointer-level issue. Because this code is the gatekeeper of secure Hadoop, we do not want dangerous undefined behavior running about.

@edwardcapriolo
Contributor Author

@ericbadger @brumi1024 Could you take a look? Unfortunately, I found you through git blame :)


 //struct to store the user details
-struct passwd *user_detail = NULL;
+struct serialized_passwd *user_detail = NULL;
Contributor Author

The passwd struct contains pointers to buffers. Those buffers have already been freed, so the behavior is undefined. This struct was created to insulate us from the passwd struct and give us better control of the memory.

}

//function to make a deep clone of passwd to serialized_passwd
void deep_copy_passwd(const struct passwd *src, struct serialized_passwd *dest){
Contributor Author

We duplicate the object using str_dup for all strings. For the integer fields, plain assignment is fine.

dest->pw_gid = src->pw_gid;
}

void free_serialized_passwd(struct serialized_passwd * passwd){
Contributor Author

@edwardcapriolo edwardcapriolo Jan 23, 2026

Because we allocate dynamic memory for this struct, we must free it. This gives callers a simple, clean, destructor-like method.

if (buf == NULL) {
exit(EXIT_FAILURE);
}
while ((s = getpwnam_r(user, &pwd, buf, bufsize, &result)) == ERANGE){
Contributor Author

The large flaw in the existing code is that the call can return ERANGE when the buffer is too small, and that case was not handled. This loop is the accepted recipe for repeatedly resizing the buffer. In practice I never observed the loop run more than once.

//extern struct passwd *user_detail;
extern struct section executor_cfg;

struct serialized_passwd {
Contributor Author

@edwardcapriolo edwardcapriolo Jan 23, 2026

Here we include only the fields we need, limiting the domain and size of the structure.
