
appset-secret-plugin Token Authentication Issue #86

@pankajbsn

GitHub Issue Reference

This document describes our experience with token authentication issues in appset-secret-plugin, related to Issue #85.

Problem Description

We encountered persistent HTTP 403 authentication errors between ArgoCD ApplicationSet controller and the appset-secret-plugin, despite both components being properly configured and running.

Symptoms

  1. Intermittent 403 errors in ApplicationSet controller logs:
error listing params: error get api 'gpu-apps-set': API error with status code 403...
  2. Token mismatch errors in plugin logs:
Recieved auth header from argo during gpu-apps-set request: 'Bearer XXX' that does not match
  3. Token kept changing across restarts/syncs:

    • First token: sNhgFnccjblkobG7RZu5HmwvtRcJlORr
    • After sync: WxItG6kLuS4ZHNYQCKXrM5bMVDmG40X9
    • After restart: kn7I6K2bvIdKvZvr4FDUiJ5KhWtNLd7Q
  4. Errors were transient: restarting both the ApplicationSet controller and plugin pods would temporarily resolve the issue.
  5. Applications still worked: despite the errors, the ApplicationSets eventually synced successfully.

Root Cause Analysis

The issue is a combination of two problems:

1. Token Regeneration via randAlphaNum

The Helm chart's secret-token.yaml template uses Helm's random generation function:

{{- if not .Values.token.existingSecret }}
apiVersion: v1
kind: Secret
metadata:
  name: {{ include "argocd-appset-secret-plugin.fullname" . }}-token
type: Opaque
data:
  token: {{ randAlphaNum 32 | b64enc | quote }}
{{- end }}

Problem: Every time Helm renders this template (during sync, upgrade, or any ArgoCD reconciliation), randAlphaNum generates a new random token. This causes:

  • ArgoCD ApplicationSet controller to cache one token at startup
  • Plugin pod to cache a different token (generated moments later)
  • Any subsequent Helm operation regenerates the token, invalidating cached values
  • Result: Persistent token mismatches and 403 errors
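The effect of randAlphaNum can be illustrated with a small Python sketch (a rough analogue for illustration, not the Helm implementation): each template render calls the generator afresh, so two renders virtually never agree, and any component that cached the previous value stops matching the Secret.

```python
import secrets
import string

ALPHANUM = string.ascii_letters + string.digits

def rand_alpha_num(n: int) -> str:
    """Rough Python analogue of Helm's randAlphaNum: a fresh random
    alphanumeric string on every call (i.e. on every template render)."""
    return "".join(secrets.choice(ALPHANUM) for _ in range(n))

# Two "renders" of the template produce two different tokens, so a pod
# that cached the first value no longer matches the Secret after a sync.
token_at_controller_start = rand_alpha_num(32)
token_after_helm_sync = rand_alpha_num(32)
print(token_at_controller_start == token_after_helm_sync)  # almost certainly False
```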

2. Python Code Continues Despite Errors

The plugin's Python code has an error-handling bug: after an error response is sent, execution continues instead of returning:

def do_POST(self):
    # ... authentication check ...
    if api_key != token:
        self.send_error(403, "Forbidden")
        # BUG: Missing return statement here!
        # Code continues executing despite error...

    # ... continues processing request ...

Problem: When authentication fails, the error response is sent but execution continues, leading to:

  • Malformed HTTP responses (status in body instead of headers)
  • Confusing error messages that mix 403 and 200 status codes
  • Unpredictable behavior that makes debugging difficult

This bug is documented in Issue #85.
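The failure mode can be reduced to a minimal, self-contained Python sketch (handle_request and the response strings are hypothetical, for illustration only; the responses list stands in for everything written to the client):

```python
def handle_request(api_key: str, token: str) -> list[str]:
    """Hypothetical handler reduced to the control flow in question."""
    responses = []
    if api_key != token:
        responses.append("403 Forbidden")
        # BUG: falls through instead of returning here
    responses.append("200 OK")  # runs even when auth just failed
    return responses

# A failed auth check puts BOTH responses on the wire:
print(handle_request("wrong-token", "right-token"))  # ['403 Forbidden', '200 OK']
```

With the missing return added after the 403, the second append never runs on auth failure, which is exactly the fix proposed in Option C below.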

Solution Implemented

We resolved the token regeneration issue by using the chart's built-in token.existingSecret parameter with a Pulumi-managed stable token.

Implementation Steps

1. Generate Stable Token in Pulumi (Java)

// PhoenixEksCluster.java
this.appsetPluginToken = new RandomPassword(clusterName + "-appset-plugin-token",
    RandomPasswordArgs.builder()
        .length(32)
        .special(false)  // Alphanumeric only (matches Helm chart default)
        .build());

Key advantage: Token is generated once and stored in Pulumi state, remaining stable across all deployments.
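The generate-once, reuse-forever behaviour that RandomPassword provides can be sketched in plain Python (get_or_create_token and the state file are illustrative assumptions, not Pulumi internals): the token is created on the first run and every later call reads the persisted value back.

```python
import secrets
import string
from pathlib import Path

ALPHANUM = string.ascii_letters + string.digits

def get_or_create_token(state_file: Path, length: int = 32) -> str:
    """Return the persisted token if one exists; otherwise generate a
    fresh alphanumeric token and persist it, so every later call
    (i.e. every later deployment) sees the same stable value."""
    if state_file.exists():
        return state_file.read_text().strip()
    token = "".join(secrets.choice(ALPHANUM) for _ in range(length))
    state_file.write_text(token)
    return token
```

Pulumi's state file plays the role of state_file here: the random value is recorded on first creation and reused on every subsequent `pulumi up`.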

2. Pass Token to Bootstrap Script

// AwsUtils.java - createPostEksCommand()
baseConfig.put("APPSET_PLUGIN_TOKEN", appsetToken);

3. Create Kubernetes Secret Before ArgoCD Installation

# post_eks_argocd.sh
kubectl create namespace "${ARGOCD_NAMESPACE}"

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: appset-plugin-token
  namespace: ${ARGOCD_NAMESPACE}
type: Opaque
stringData:
  token: "${APPSET_PLUGIN_TOKEN}"
EOF
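Note that stringData is a write-only convenience field: the API server base64-encodes it into .data on admission, so any consumer reading the Secret decodes back exactly the token that was written. A quick Python check of that round trip (the token value below is just a placeholder):

```python
import base64

# Placeholder standing in for ${APPSET_PLUGIN_TOKEN}; any string works.
token = "sNhgFnccjblkobG7RZu5HmwvtRcJlORr"
# What the API server stores under .data.token:
stored = base64.b64encode(token.encode()).decode()
# What a consumer reading the Secret gets back:
assert base64.b64decode(stored).decode() == token
```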

4. Configure Helm Chart to Use Existing Secret

# applications/00-bootstrap/appset-secret-plugin.yaml
spec:
  source:
    helm:
      parameters:
        - name: token.existingSecret
          value: appset-plugin-token

Results

After implementing this solution:

  • Token remains stable across all ArgoCD syncs, Helm upgrades, and pod restarts
  • No more 403 errors: the ApplicationSet controller and the plugin always use matching tokens
  • No manual intervention needed: no more pod restarts to re-sync cached tokens
  • Infrastructure as Code: the token lives in Pulumi state, so deployments are reproducible

Recommendation for Chart Maintainers

The chart should consider one of these approaches to prevent token regeneration:

Option A: Use Helm Lookup Function (Preferred)

Modify secret-token.yaml to preserve existing tokens:

{{- if not .Values.token.existingSecret }}
{{- $existingSecret := lookup "v1" "Secret" .Release.Namespace (include "argocd-appset-secret-plugin.fullname" . | printf "%s-token") }}
apiVersion: v1
kind: Secret
metadata:
  name: {{ include "argocd-appset-secret-plugin.fullname" . }}-token
type: Opaque
data:
  {{- if $existingSecret }}
  token: {{ $existingSecret.data.token }}
  {{- else }}
  token: {{ randAlphaNum 32 | b64enc | quote }}
  {{- end }}
{{- end }}

This preserves the token if the secret already exists, only generating a new one on initial installation.

Option B: Document Token Management

Add prominent documentation recommending users:

  1. Always use token.existingSecret in production
  2. Generate tokens externally (Pulumi, Terraform, SOPS, etc.)
  3. Never rely on the auto-generated token for production use

Option C: Fix Python Code Bugs

Add missing return statements after error responses to prevent execution from continuing:

def do_POST(self):
    # ... authentication check ...
    if api_key != token:
        self.send_error(403, "Forbidden")
        return  # ← Add this!

    # ... rest of processing ...

Related Issues

  • Issue #85: missing return statements after error responses in the plugin's Python code

Environment

  • Plugin Version: 1.4.1
  • Helm Chart Version: 1.4.1
  • ArgoCD Version: 7.7.0
  • Kubernetes Version: 1.34
  • Infrastructure: AWS EKS managed by Pulumi

Conclusion

The token authentication issues in appset-secret-plugin stem from two factors:

  1. Chart design: Using randAlphaNum regenerates tokens on every template render
  2. Code bugs: Missing return statements cause malformed responses

The most reliable solution is to use token.existingSecret with externally managed tokens. This approach:

  • Prevents token regeneration
  • Works around Python code bugs
  • Provides better security through IaC token management
  • Eliminates the need for workarounds or manual interventions

For production deployments, always use token.existingSecret with tokens managed by your infrastructure automation tool (Pulumi, Terraform, etc.) rather than relying on Helm-generated random tokens.
