helm: Add serviceAccounts, rbac, and small fixes #13747
abhishekagarwal87 merged 9 commits into apache:master
Conversation
Can you elaborate a bit more on "Enabling per-service serviceAccounts allows for finer grained RBAC"? Maybe take us through an example use-case that you want to implement on your own cluster.
Merged upstream master into this branch.

Absolutely. Assuming you take the above approach and druid is now in its own namespace, you have another issue: different druid services require different levels of permissions. By merging this PR we get per-service service accounts, which allow us to apply Kubernetes-native RBAC and additional external privileges (like iRSA) on a per-service level, so that we don't have to give unnecessary privileges to services that don't require them. @abhishekagarwal87 does that make sense? Let me know if you have any questions or disagree with any of the above.
Thank you. I see that by default, helm charts are now going to create service accounts per service. So can we keep the default off in the top-level

Sorry, I'm not sure what you mean here? Keep the default for what off? If you're talking about keeping the default for creating service accounts off, then I would strongly recommend against this. Having a per-app service account is a much better security posture. It could affect existing helm chart deployments, since users have likely had to modify the default service account outside of the helm chart, so maybe instead of leaving it off we bump the minor version and create an upgrade warning/note?
```
    {{- toYaml . | nindent 8 }}
  {{- end }}
spec:
  {{- if .Values.broker.serviceAccount.create }}
```
This blocks external creation of the serviceAccount. Most other charts allow you to bring your own serviceAccount which may also make reusing the same one for all druid service types easier.
I am wondering if we should default serviceAccount.name to unset and if create is true it uses what the default would be. Then if serviceAccount.name is set and serviceAccount.create is false we can still set the value here.
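The values layout this suggests might look something like the following sketch (the key names are illustrative, mirroring the per-service pattern under discussion):

```yaml
# Hypothetical per-service values sketch
broker:
  serviceAccount:
    # When true, the chart creates the ServiceAccount and derives its
    # name from the chart's fullname templates unless `name` is set.
    create: true
    # When set while create is false, an externally managed
    # ("bring your own") ServiceAccount of this name is used instead.
    name: ""
```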
The charts I am used to using have a template like:

```
{{/*
Create the name of the forwarder service account to use
*/}}
{{- define "fluentd.forwarder.serviceAccountName" -}}
{{- if .Values.forwarder.serviceAccount.create -}}
{{ default (printf "%s-forwarder" (include "common.names.fullname" .)) .Values.forwarder.serviceAccount.name }}
{{- else -}}
{{ default "default" .Values.forwarder.serviceAccount.name }}
{{- end -}}
{{- end -}}
```

and then always set the serviceAccountName with the output of the template.
I see what you're saying. I have a bit of an opposite experience from you in terms of what other public, widely used charts offer, which is why I went this way, but I'm not married to it. I can create a helper definition which does the above sort of style.
```
@@ -0,0 +1,21 @@
{{- if .Values.rbac.create }}
```
Should this also be checking .Values.broker.enabled? Feels like disabling the broker should also disable these resources from being created.
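A guard combining both flags, as suggested, might look like this (a sketch, not the chart's final wording):

```
{{- if and .Values.rbac.create .Values.broker.enabled }}
# ... Role/RoleBinding manifests for the broker ...
{{- end }}
```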
```
@@ -0,0 +1,20 @@
{{- if .Values.rbac.create }}
```
Same question about checking enabled for the druid service, e.g. `.Values.broker.enabled`.
```
@@ -0,0 +1,20 @@
{{- if .Values.broker.serviceAccount.create }}
```
Same question about checking enabled for the druid service, e.g. `.Values.broker.enabled`.
```yaml
- apiGroups:
  - ""
  resources:
  - pods
  - configmaps
  verbs:
  - '*'
```
Unclear on why the broker needs access to delete Pods, or really any of these. Are these permissions only needed by the Overlord?
I found that when this role did not exist none of the pods could start. They seem to all manage labels when using the kubernetes extension. Unclear if they remove the labels when no longer needed the same way they add them when needed. I can test that and narrow down permission sets for each service.
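If testing bears that out, a narrowed rule might look like the following (hypothetical; which resources and verbs each service actually needs was still untested at this point):

```yaml
# Hypothetical narrowed rule -- the verbs listed are an untested
# assumption, not the chart's final permission set
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
  - list
  - patch
```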
An update:

- Just an FYI that this is the role settings recommended in the official druid documentation for these extensions.
- Attempting to convert my dev cluster over to this version of the helm chart has been met with some very frustrating complications. For some reason, which as far as I can tell has nothing to do with service accounts, I am not getting the
  ```
  2023-02-09T21:56:09,734 INFO [k8s-task-runner-3] org.apache.druid.indexing.overlord.TaskQueue - Received SUCCESS status for task: <some_task>
  ```
  response after a job when running with a cluster deployed via this version of the helm chart.
- This means a task of `{"type":"noop"}` is spawning a k8s job, executing successfully; the job has a zero return code and completes successfully, but the job status is reported as `FAILED` in the UI.
- The logs for that, in debug mode, look like this:
```
2023-02-09T21:55:51,559 DEBUG [k8s-task-runner-25] org.apache.http.wire - http-outgoing-22 >> "2023-02-09T21:39:13,653 DEBUG [task-runner-0-priority-0] org.apache.druid.indexing.common.actions.RemoteTaskActionClient - Performing action for task[noop_2023-02-09T21:39:02.008Z_936917b3-a0af-4656-a1e1-9e77ced24d92]: UpdateStatusAction{status=successful}[\n]"
2023-02-09T21:55:51,560 DEBUG [k8s-task-runner-25] org.apache.http.wire - http-outgoing-22 >> "2023-02-09T21:39:13,663 DEBUG [task-runner-0-priority-0] org.apache.druid.indexing.common.actions.RemoteTaskActionClient - Performing action for task[noop_2023-02-09T21:39:02.008Z_936917b3-a0af-4656-a1e1-9e77ced24d92]: UpdateLocationAction{taskLocation=TaskLocation{host='null', port=-1, tlsPort=-1}}[\n]"
2023-02-09T21:55:51,560 DEBUG [k8s-task-runner-25] org.apache.http.wire - http-outgoing-22 >> "2023-02-09T21:39:13,671 DEBUG [task-runner-0-priority-0] org.apache.druid.indexing.overlord.TaskRunnerUtils - Task [noop_2023-02-09T21:39:02.008Z_936917b3-a0af-4656-a1e1-9e77ced24d92] status changed to [SUCCESS].[\n]"
2023-02-09T21:55:51,560 DEBUG [k8s-task-runner-25] org.apache.http.wire - http-outgoing-22 >> "  "id" : "noop_2023-02-09T21:39:02.008Z_936917b3-a0af-4656-a1e1-9e77ced24d92",[\n]"
2023-02-09T21:55:51,561 DEBUG [k8s-task-runner-25] org.apache.http.wire - http-outgoing-22 >> "2023-02-09T21:39:13,737 INFO [main] org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner - Starting graceful shutdown of task[noop_2023-02-09T21:39:02.008Z_936917b3-a0af-4656-a1e1-9e77ced24d92].[\n]"
2023-02-09T21:55:51,561 DEBUG [k8s-task-runner-25] org.apache.http.wire - http-outgoing-22 >> "af-4656-a1e1-9e77ced24d92] status changed to [FAILED].[\n]"
```
The job status in the UI shows as:
```json
{
  "id": "noop_2023-02-09T21:39:02.008Z_936917b3-a0af-4656-a1e1-9e77ced24d92",
  "groupId": "noop_2023-02-09T21:39:02.008Z_936917b3-a0af-4656-a1e1-9e77ced24d92",
  "type": "noop",
  "createdTime": "2023-02-09T21:39:02.014Z",
  "queueInsertionTime": "1970-01-01T00:00:00.000Z",
  "statusCode": "FAILED",
  "status": "FAILED",
  "runnerStatusCode": "WAITING",
  "duration": -1,
  "location": {
    "host": "10.233.68.10",
    "port": 8100,
    "tlsPort": -1
  },
  "dataSource": "none",
  "errorMsg": "Task failed %s: [ noop_2023-02-09T21:39:02.008Z_936917b3-a0af-4656-a1e1-9e77ced24d92, noop20230209t2..."
}
```
The logs from the actual job itself show a noop task with success:
```
2023-02-09T16:53:23,371 INFO [task-runner-0-priority-0] org.apache.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "noop_2023-02-09T16:53:12.037Z_3c8e9f26-6ca4-4600-8723-572d2c8baf48",
  "status" : "SUCCESS",
  "duration" : 3887,
  "errorMsg" : null,
  "location" : {
    "host" : null,
    "port" : -1,
    "tlsPort" : -1
  }
}
```
I'm at a complete loss at this point, so if anyone wants to help I'd be all ears and would welcome any type of paired troubleshooting. I've gone as far as giving all the service accounts full cluster-admin permissions, but again, I don't really think this has anything to do with service accounts at this point.
I pushed up the changes requested except for the RBAC changes since I haven't actually been able to say I've had a successful deployment with these chart changes yet.
Turns out this is a bug with the kubernetes overlord extension and kubernetes 1.25+
Were you able to file an issue for it?
Discovered this with the author of the kubernetes overlord extension, @churromorales. I believe #13759 already fixes the issue and relates to #13749, but maybe @churromorales can comment further; if another issue is required I will happily make one.
@dampcake putting aside the potential for a new issue to be created, I would ask that we merge the RBAC in its current state, since it matches the official documentation. Once the bugs are fixed for K8s 1.25, I will test for the minimum viable role permissions and make a new PR to apply those via the helm chart, as well as allow concatenating or overriding the role permissions via the chart's `.Values.rbac` map, so people won't get blocked on RBAC changes in the future.
If you're good with this let me know and I'll create the issue and it can be assigned to me.
| `broker.replicaCount` | broker node replicas (deployment) | `1` |
| `broker.port` | port of broker component | `8082` |
| `broker.serviceAccount.create` | Create a service account for broker service | `true` |
| `broker.serviceAccount.name` | Service account name | `true` |
The documentation looks off to me. Sorry I didn't catch this before. Looks like it says the default for everything is true but that's not actually the case.
No apologies, this is my fault. Sorry I missed it during pre-commit review. Fixed the documentation and merged upstream/master again.
```yaml
druid_processing_numThreads: 1
# druid_monitoring_monitors: '["org.apache.druid.client.cache.CacheMonitor", "org.apache.druid.server.metrics.HistoricalMetricsMonitor", "org.apache.druid.server.metrics.QueryCountStatsMonitor"]'
# druid_segmentCache_locations: '[{"path":"/var/druid/segment-cache","maxSize":300000000000}]'
# druid_segmentCache_locations: '[{"path":"/opt/druid/var/druid/segment-cache","maxSize":300000000000}]'
```
why the change here?
That is the bullet point in the description:

> Fix historical commented config `druid_segmentCache_locations` to match PVC mount path. In its current form it breaks the historical pod when uncommented due to lack of permissions to create the directory.
The container did not have permissions to create/write to `/var/druid/segment-cache`. `/opt/druid/var/druid/segment-cache` is inside the defined PVC mounted volume path, so the application has the ability to write there, and because it is captured by the persistent volume it will persist through pod restarts.
that makes sense, thank you for the fix.
| `router.port` | port of router component | `8888` |
| `router.serviceType` | service type for service | `ClusterIP` |
| `router.serviceAccount.create` | Create a service account for router service | `true` |
| `router.serviceAccount.name` | Service account name | `` |
what is the default service account name generated by helm? we should put that here instead of an empty string as default.
The default name is template-derived, so there is no way to put a value here that is guaranteed to be accurate. It's a combination of the `druid.<service>.fullname` template (which is based on the `druid.fullname` template) and the service name, most of which can be overridden.
```
{{- define "druid.fullname" -}}
{{- if .Values.fullnameOverride -}}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" -}}
{{- else -}}
{{- $name := default .Chart.Name .Values.nameOverride -}}
{{- if contains $name .Release.Name -}}
{{- .Release.Name | trunc 63 | trimSuffix "-" -}}
{{- else -}}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" -}}
{{- end -}}
{{- end -}}
{{- end -}}

{{- define "druid.broker.fullname" -}}
{{ template "druid.fullname" . }}-{{ .Values.broker.name }}
{{- end -}}

{{/*
Create the name of the broker service account
*/}}
{{- define "druid.broker.serviceAccountName" -}}
{{- if .Values.broker.serviceAccount.create }}
{{- default (include "druid.broker.fullname" .) .Values.broker.serviceAccount.name }}
{{- else }}
{{- default "default" .Values.broker.serviceAccount.name }}
{{- end }}
{{- end }}
```
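With a helper like that in place, each workload template can reference it unconditionally, along these lines (a sketch of the usage pattern, using the broker as the example):

```
# In the pod spec of the broker deployment template
spec:
  serviceAccountName: {{ include "druid.broker.serviceAccountName" . }}
```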
You can put "Derived from the name of the service." or something to that effect.
```
  name: {{ template "druid.historical.fullname" . }}
subjects:
- kind: ServiceAccount
  name: {{ .Values.historical.serviceAccount.name }}
```
Shouldn't this be `druid.historical.serviceAccountName`? The default value of this property is empty. Is the chart generating the k8s configs correctly?
Thank you for catching this. This was left over from before @dampcake recommended switching to the empty default value above in this PR, and my testing already had the `.Values.*.serviceAccount.name` values set from before, so I wasn't seeing an issue here.
Fixed now.
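The corrected subject presumably references the helper template instead of the raw value, along these lines (a sketch of the fix described above, not the exact committed diff):

```
subjects:
- kind: ServiceAccount
  name: {{ include "druid.historical.serviceAccountName" . }}
```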
@jwitko - can you fix the static check failures? LGTM otherwise.
I accidentally rebased again, forgetting that you guys prefer a merge instead. Trying to fix that now.
@abhishekagarwal87 my apologies, but I don't think I'm able to get the history back in a meaningful way on GitHub. If you want me to close this PR and just open a new one, please let me know. Otherwise the PR looks to be in solid shape now, with tests passing.
@abhishekagarwal87 The static checks are failing for a spelling error that is outside the scope of this PR. They seem to have passed on all things related to this chart once I added the license everywhere. It seems to be complaining about this line, but I'm not really sure what it's even upset about.
Thanks @jwitko. That's an unrelated failure.
Description

- Fix historical commented config `druid_segmentCache_locations` to match PVC mount path. In its current form it breaks the historical pod when uncommented due to lack of permissions to create the directory.
- Add serviceAccounts and RBAC (roles and roleBindings) used when `druid-kubernetes-extensions` is enabled.

ATTENTION
This PR should be merged after #13746 since it contains a helm chart version bump incremental to that one.
Goal
The goal of this PR is to bring the apache/druid helm chart up to modern standards and fix some small issues. Enabling per-service serviceAccounts allows for finer-grained RBAC, which is in general a better security posture. Without this, all services are forced to use the default serviceAccount, which creates issues when needing to annotate service accounts for things like AWS iRSA, and makes a dedicated namespace your only method of controlling permissions.
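As a concrete example of the AWS iRSA use-case, a per-service ServiceAccount can carry the IAM role annotation (the account name and role ARN below are placeholders, not values from this chart):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: druid-middle-manager   # illustrative name
  annotations:
    # eks.amazonaws.com/role-arn is the annotation AWS IRSA uses to
    # associate an IAM role with pods running as this service account;
    # the ARN here is a placeholder.
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/druid-s3-access
```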
Release note
Update the suggested segment-cache path, allow for per-service serviceAccounts and finer-grained RBAC in the druid helm chart, and add a default annotation to the historical statefulset.