
[BUG] spark backend flyteplugin does not update pod security context #2025

@akumor

Description

Describe the bug

The SparkApplication object created in Kubernetes when a Flyte Spark task is executed via the spark-on-k8s-operator does not include the pod security context configured for flytepropeller.

Expected behavior

When plugins.k8s.default-pod-security-context is configured for flytepropeller, I expect to see that configuration reflected in .spec.driver.podSecurityContext and .spec.executor.podSecurityContext of the SparkApplication Kubernetes objects created from running Flyte tasks.
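
For illustration, a minimal sketch of the fields I would expect on the rendered object (apiVersion/kind taken from the spark-on-k8s-operator CRD; the object name is a placeholder):

apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: <workflow>-n0-0
spec:
  driver:
    # expected to mirror plugins.k8s.default-pod-security-context
    podSecurityContext:
      sysctls:
        - name: net.ipv4.tcp_synack_retries
          value: "2"
  executor:
    podSecurityContext:
      sysctls:
        - name: net.ipv4.tcp_synack_retries
          value: "2"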

Additional context to reproduce

  1. Ensure you are running a Kubernetes cluster with the spark-on-k8s-operator installed.
  2. Apply a configuration file for flytepropeller that includes default-pod-security-context:
plugins:
  k8s:
    default-cpus: 100m
    default-memory: 100Mi
    default-labels:
      app.kubernetes.io/name: flyte
    default-pod-security-context:
      sysctls:
        - name: net.ipv4.tcp_synack_retries
          value: "2"
  spark:
    # -- Spark default configuration
    spark-config-default:
      # We override the default credentials chain provider for Hadoop so that
      # it can use the service-account-based IAM role or EC2-metadata-based
      # credentials. This is more in line with how AWS works.
      - spark.hadoop.fs.s3a.aws.credentials.provider: "com.amazonaws.auth.DefaultAWSCredentialsProviderChain"
      - spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version: "2"
      - spark.master: "k8s://https://kubernetes.default.svc:443"
  3. Run a workflow containing a Spark task similar to:
import random
from operator import add

import flytekit
from flytekit import task
from flytekitplugins.spark import Spark


def f(_):
    # Monte Carlo helper: sample a point in the unit square and count it
    # if it falls inside the unit circle.
    x = random.random() * 2 - 1
    y = random.random() * 2 - 1
    return 1 if x ** 2 + y ** 2 <= 1 else 0


@task(
    task_config=Spark(
        spark_conf={
            "spark.driver.memory": "1000M",
            "spark.executor.memory": "1000M",
            "spark.executor.cores": "1",
            "spark.executor.instances": "2",
            "spark.kubernetes.namespace": "flyte",
            "spark.kubernetes.driver.limit.cores": "1",
            "spark.kubernetes.executor.limit.cores": "1",
        }
    )
)
def spark_test() -> str:
    partitions = 50
    print("Starting Spark with Partitions: {}".format(partitions))
    n = 100000 * partitions
    sess = flytekit.current_context().spark_session
    count = (
        sess.sparkContext.parallelize(range(1, n + 1), partitions).map(f).reduce(add)
    )
    pi_val = 4.0 * count / n
    print("Pi val is: {}".format(pi_val))
    return f"Pi val is: {pi_val}"
  4. Review the resulting SparkApplication object and confirm that podSecurityContext is missing:
$ kubectl get sparkapplication <workflow>-n0-0 -o yaml
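
As a quicker spot-check (using kubectl's jsonpath output; same placeholder object name as above), the driver security context can be queried directly; if the field is missing as described, this prints nothing:

$ kubectl get sparkapplication <workflow>-n0-0 -o jsonpath='{.spec.driver.podSecurityContext}'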

Screenshots

No response

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
