-
Notifications
You must be signed in to change notification settings - Fork 3.7k
[Issue 8751] Update Dockerfile for Pulsar and Dashboard to Create and Use pulsar User (nonroot user) #8796
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
37f5787 to
8fa7d02
Compare
|
I pushed updates to use the user's id instead of name, and I pushed an update to properly chown the added directories in the |
|
This PR would also fix issue #7210. |
|
I pushed a commit to clean up my initial commit and to try to make the tests pass. Note: I removed the |
|
@addisonj ^^ |
|
I haven't yet had the time to dig into the tests to understand why they're failing. I think I'll be able to look into it later this week--although, I am new to the project, so any direction as to where things might be failing would be greatly appreciated. |
docker/pulsar/Dockerfile
Outdated
| USER pulsar | ||
|
|
||
| # Directories will have correct permission because we switched to the pulsar user | ||
| RUN mkdir /pulsar/conf /pulsar/data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@michaeljmarshall when the docker image is used in Kubernetes, the helm chart will mount the disks to /pulsar/data. Does it change the permission?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kubernetes will not change the file system permissions by default. If the volume is owned by root, the container will also view the mounted volume as owned by root, which would be a problem for the container produced from this docker image.
In Linux, the mounted volume will need to be writable by user 1000. I'm not certain how Docker users map to host users in other operating systems.
Here is the Kubernetes documentation on permissions and pods. It looks like there has been active development in the mounted volume permission domain.
I'm not sure how the pulsar project handles this type of change. Given that administrators of pulsar clusters should have access to the nodes running pulsar, it seems reasonable to me that those administrators could change the ownership of the relevant directories before upgrading to the next version of pulsar. The current containers would have root permissions, and would be able to write to directories/files owned by user 1000, so a rolling upgrade could still be achieved.
Is there any precedent for this type of change? I would really like to get this into the 2.8.0 release, so that we can increase the security of pulsar containers and I can stop maintaining my own fork of the project. However, I also recognize the importance of carefully releasing major changes to a project.
As I mentioned in my initial PR description, it might be relevant to provide migration documentation, but I'm not sure where to put it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sijie - after thinking about this, I think it might be helpful, and possibly necessary, to make a corresponding update to the helm chart to allow end users to choose how the container is run in kubernetes. Because docker allows an end user to choose the user that a container runs with, it is trivial for end users to run the container as root until they're able to update the host file system to have the appropriate permissions for the /puslar/conf and /pulsar/data directories. Let me know what you think.
That's probably related to flaky test environment. Can you try to rebase your branch to latest master to include some of recent fixes around tests? |
a2df5ea to
f4019a1
Compare
|
@sijie - I rebased and pushed, but it looks like the tests still failed. I'll try to take a closer look at the tests tomorrow. |
|
Of the 14 failed checks, there are 3 unique errors being logged. The first two seem like symptoms of an earlier failures. Here are the stages being run, and the errors output:
@sijie - any chance that the |
|
@michaeljmarshall It seems that all integration tests failed. It seems that it is related to this change. I am trying it locally. |
|
I had missed a |
|
After digging into this morning's test failures, I can see that the tests failed because we're using I am slightly concerned that this approach requires configuration in multiple places (each of the pulsar component |
|
This latest commit includes a fix to make the pulsar python client available to the I think we should convert the pulsar function tests to run as the These test failures (related to problems running as root vs pulsar) raise an important question about backwards compatibility and the published docker images. I think it adds value to let the container run as either the root user or the pulsar user to allow for end users to transition to the non-root version. As such, this commit installs the pulsar python client so that it's available to the root and pulsar. However, I don't think it is necessary to run tests as the root user because it is unlikely that the root user will have any other problems. Based on a limited understanding of the tests, the pulsar functions are likely one of the only processes that rely on python, making it the only process that use the pip dependencies. It'd be great to have some perspective from people more familiar with the project and any other edge cases in different deployments. |
91d8a2e to
077a95e
Compare
|
Rebased my branch with master. |
|
Based on a recent blogpost, https://github.com/hexops/dockerfile#do-not-use-a-uid-below-10000, I decided to change the pulsar user id to 10000 and the group id to 10001 in this most recent commit. Further, I added some documentation for the getting started with docker. |
|
@sijie - on this previous commit, it looks like the tests that failed only failed due to timeouts. On the commit before the latest one, the only test failure was from a maven issue that was unrelated to my change. At this point, it seems to me that the underlying change is valid, but I'm wondering if there is any chance that I'm running into flaky tests. If you think it's likely a problem with my changes, is there anyone who can help me troubleshoot these test failures? I'm happy to keep digging, but I think it might be more productive if I can get some guidance. Thanks! |
|
@michaeljmarshall the integration tests passed. so that means your change is good. Other failures are flaky tests. I will rebase your branch to latest master (as there are some fixes in master to improve tests). |
8830159 to
8c95f6b
Compare
|
@sijie - thanks for the insight. I just rebased and pushed my branch. |
|
/pulsarbot run-failure-checks |
|
After discussion in #8242 and some additional research, I found a better way to meet all requirements to build this image to run with a non root user. I have the solution half implemented and will need to do some testing to make sure it looks as I expect before I add another commit. |
|
Rebased to see if any of the recent test improvements help make this PR pass tests. |
|
@michaeljmarshall Thank you for your contribution! |
|
@michaeljmarshall after this commit I am no more able to build the Docker image locally Are you able to build it locally ? |
|
The error is not due to this patch, I reproduce it on previous commits. |
|
Today I started running some tests for 2.8.0-SNAPSHOT images. There's a permission issue in pulsar-proxy since the default configuration uses port 80 and now when the default user is "pulsar", it cannot bind to port 80. This is the error log in pulsar-proxy-0 pod: The problem goes away after changing the port to 8080 in values.yaml: @michaeljmarshall @sijie I wonder how to address this breaking change in pulsar-helm-chart? |
|
@lhotari - good catch. I think the easiest way to prevent a breaking change in the helm chart is to default to using the Essentially, we need to default a pulsar proxy pod's security context to the following: securityContext:
runAsUser: 0
runAsGroup: 0I believe all other pulsar containers should still be able to run as the non root user. (I am assuming that we want to prevent breaking changes here as the 2.8.0 release is only a minor version bump.) |
|
@michaeljmarshall @sijie I reported the issue as apache/pulsar-helm-chart#110 in pulsar-helm-chart so that it gets tracked. |
… Use pulsar User (nonroot user) (#8796) Fixes #8751 Pulsar does not need to run as the root user. This PR updates the pulsar and the pulsar dashboard images to make them run as a new `pulsar` user (user ~1000~ 10000 and group 10001). This change increases the security of pulsar images. Update two `Dockerfile`s to create a pulsar user, chown the appropriate directories, and then use that user by default. - [ ] Make sure that the change passes the CI checks. I manually verified that the docker images run with the correct user and file permissions. As this is my first commit, I'm not familiar with pulsar testing. Are there tests that run against the produced docker images? If so, then there is likely no further testing needed. (cherry picked from commit 4264a67)
… Use pulsar User (nonroot user) (apache#8796) Fixes apache#8751 Pulsar does not need to run as the root user. This PR updates the pulsar and the pulsar dashboard images to make them run as a new `pulsar` user (user ~1000~ 10000 and group 10001). This change increases the security of pulsar images. Update two `Dockerfile`s to create a pulsar user, chown the appropriate directories, and then use that user by default. - [ ] Make sure that the change passes the CI checks. I manually verified that the docker images run with the correct user and file permissions. As this is my first commit, I'm not familiar with pulsar testing. Are there tests that run against the produced docker images? If so, then there is likely no further testing needed. (cherry picked from commit 4264a67)
…eate and Use pulsar User (nonroot user) (apache#8796)" This reverts commit 3b0f7d7.
… Use pulsar User (nonroot user) (apache#8796) Fixes apache#8751 Pulsar does not need to run as the root user. This PR updates the pulsar and the pulsar dashboard images to make them run as a new `pulsar` user (user ~1000~ 10000 and group 10001). This change increases the security of pulsar images. Update two `Dockerfile`s to create a pulsar user, chown the appropriate directories, and then use that user by default. - [ ] Make sure that the change passes the CI checks. I manually verified that the docker images run with the correct user and file permissions. As this is my first commit, I'm not familiar with pulsar testing. Are there tests that run against the produced docker images? If so, then there is likely no further testing needed. (cherry picked from commit 4264a67)
…eate and Use pulsar User (nonroot user) (apache#8796)" This reverts commit 4264a67.
#10861) This reverts commit 4264a67. ### Motivation The change #8796 has broken the Pulsar Functions running on Kubernetes. The Pulsar Functions Kubernetes runtime generates a secret and mounts it using mode `256`. That means the secret is only able to read by the user. The StatefulSet created by Kubernetes runtime mounts the secrets under the `root` user. Hence only the root user is able to read the secret. This results in any functions submitted will fail to read the authentication information. Because all the Kubernetes resources generated by the Kubernetes runtime are hardcoded. There is no easy way to change the security context for the function statefulsets. Let's revert this change for 2.8.0, until we can address the issues in the Kubernetes runtime.
#10861) This reverts commit 4264a67. ### Motivation The change #8796 has broken the Pulsar Functions running on Kubernetes. The Pulsar Functions Kubernetes runtime generates a secret and mounts it using mode `256`. That means the secret is only able to read by the user. The StatefulSet created by Kubernetes runtime mounts the secrets under the `root` user. Hence only the root user is able to read the secret. This results in any functions submitted will fail to read the authentication information. Because all the Kubernetes resources generated by the Kubernetes runtime are hardcoded. There is no easy way to change the security context for the function statefulsets. Let's revert this change for 2.8.0, until we can address the issues in the Kubernetes runtime. (cherry picked from commit 4f556a2)
… Use pulsar User (nonroot user) (apache#8796) Fixes apache#8751 Pulsar does not need to run as the root user. This PR updates the pulsar and the pulsar dashboard images to make them run as a new `pulsar` user (user ~1000~ 10000 and group 10001). This change increases the security of pulsar images. Update two `Dockerfile`s to create a pulsar user, chown the appropriate directories, and then use that user by default. - [ ] Make sure that the change passes the CI checks. I manually verified that the docker images run with the correct user and file permissions. As this is my first commit, I'm not familiar with pulsar testing. Are there tests that run against the produced docker images? If so, then there is likely no further testing needed. (cherry picked from commit 4264a67) (cherry picked from commit bf00805)
… Use pulsar User (nonroot user) (apache#8796) Fixes apache#8751 Pulsar does not need to run as the root user. This PR updates the pulsar and the pulsar dashboard images to make them run as a new `pulsar` user (user ~1000~ 10000 and group 10001). This change increases the security of pulsar images. Update two `Dockerfile`s to create a pulsar user, chown the appropriate directories, and then use that user by default. - [ ] Make sure that the change passes the CI checks. I manually verified that the docker images run with the correct user and file permissions. As this is my first commit, I'm not familiar with pulsar testing. Are there tests that run against the produced docker images? If so, then there is likely no further testing needed. (cherry picked from commit 4264a67) (cherry picked from commit bf00805)
apache#10861) This reverts commit 4264a67. ### Motivation The change apache#8796 has broken the Pulsar Functions running on Kubernetes. The Pulsar Functions Kubernetes runtime generates a secret and mounts it using mode `256`. That means the secret is only able to read by the user. The StatefulSet created by Kubernetes runtime mounts the secrets under the `root` user. Hence only the root user is able to read the secret. This results in any functions submitted will fail to read the authentication information. Because all the Kubernetes resources generated by the Kubernetes runtime are hardcoded. There is no easy way to change the security context for the function statefulsets. Let's revert this change for 2.8.0, until we can address the issues in the Kubernetes runtime.
… Use pulsar User (nonroot user) (apache#8796) Fixes apache#8751 Pulsar does not need to run as the root user. This PR updates the pulsar and the pulsar dashboard images to make them run as a new `pulsar` user (user ~1000~ 10000 and group 10001). This change increases the security of pulsar images. Update two `Dockerfile`s to create a pulsar user, chown the appropriate directories, and then use that user by default. - [ ] Make sure that the change passes the CI checks. I manually verified that the docker images run with the correct user and file permissions. As this is my first commit, I'm not familiar with pulsar testing. Are there tests that run against the produced docker images? If so, then there is likely no further testing needed. (cherry picked from commit 4264a67) (cherry picked from commit bf00805) (cherry picked from commit 1b44c32)
apache#10861) This reverts commit 4264a67. ### Motivation The change apache#8796 has broken the Pulsar Functions running on Kubernetes. The Pulsar Functions Kubernetes runtime generates a secret and mounts it using mode `256`. That means the secret is only able to read by the user. The StatefulSet created by Kubernetes runtime mounts the secrets under the `root` user. Hence only the root user is able to read the secret. This results in any functions submitted will fail to read the authentication information. Because all the Kubernetes resources generated by the Kubernetes runtime are hardcoded. There is no easy way to change the security context for the function statefulsets. Let's revert this change for 2.8.0, until we can address the issues in the Kubernetes runtime.
…… (#10861) This reverts commit b656cdd. ### Motivation The change apache/pulsar#8796 has broken the Pulsar Functions running on Kubernetes. The Pulsar Functions Kubernetes runtime generates a secret and mounts it using mode `256`. That means the secret is only able to read by the user. The StatefulSet created by Kubernetes runtime mounts the secrets under the `root` user. Hence only the root user is able to read the secret. This results in any functions submitted will fail to read the authentication information. Because all the Kubernetes resources generated by the Kubernetes runtime are hardcoded. There is no easy way to change the security context for the function statefulsets. Let's revert this change for 2.8.0, until we can address the issues in the Kubernetes runtime.
Fixes #8751
Motivation
Pulsar does not need to run as the root user. This PR updates the pulsar and the pulsar dashboard images to make them run as a new
pulsaruser (user100010000 and group 10001). This change increases the security of pulsar images.Modifications
Update two
Dockerfiles to create a pulsar user, chown the appropriate directories, and then use that user by default.Verifying this change
I manually verified that the docker images run with the correct user and file permissions. As this is my first commit, I'm not familiar with pulsar testing. Are there tests that run against the produced docker images? If so, then there is likely no further testing needed.
Does this pull request potentially affect one of the following parts:
If
yeswas chosen, please highlight the changesIn order to deploy this new docker container, users will need to update the
journalandledgersdirectories on the host to be owned by user100010000and group100010001.Documentation
Documentation will be required for this change because users will need to set up their hosts differently to properly give permission to user
100010000 and group100010001 in order to allow the container to write to the appropriate host volumes. However, I'd like to see if this approach is accepted before adding documentation.