update the clusteridentifier in the field for helm values! #345
Rupam-It wants to merge 16 commits into
| @@ -0,0 +1,198 @@ | |||
| # Cluster Identity: Stable Cluster Identification Across Reinstalls | |||
| // If ClusterToken is not provided but PATToken is, the system will exchange it for a cluster token | ||
| PATToken string `json:"patToken,omitempty"` | ||
|
|
||
| // ClusterIdentifier is an optional stable DNS-label identifier for the cluster. |
Not sure: is this a stable DNS label or a UUID?
| } | ||
| tempClient := transport.NewDakrClient(dakrURL, "", c.Log) | ||
|
|
||
| // Always use ReattachCluster. |
If ReattachCluster fails (any error other than CodeNotFound-with-identifier), the code logs and falls through with an empty ClusterToken. The old ExchangePATForClusterToken
path did return fmt.Errorf(...) on failure. This is a regression — the operator will start in a broken state instead of failing loudly.
|
|
||
| // Resolve clusterIdentifier from the well-known cluster identity Secret. | ||
| // The secret name is fixed — users never need to configure it. | ||
| identifier, secretErr := c.readClusterIdentifierFromSecret(ctx, clusterIdentitySecretName) |
Worth looking into this from this angle maybe:
// Secret exists but key is empty → fatal
return "", fmt.Errorf("cluster identity Secret %q exists but CLUSTER_IDENTIFIER key is missing or empty...")
This is called before anything else in initializeTelemetryComponents, and the error is returned immediately (return secretErr). If the Secret was created manually with a typo,
or the key was accidentally cleared, the operator refuses to start entirely. Given the operator is the sole writer of this Secret, this error case is an operational footgun. A
warning + clearing the in-memory state and continuing (letting the operator re-register) would be safer.
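A rough sketch of the warn-and-continue alternative, assuming the Secret data is available as a `map[string][]byte`. `identifierFromSecret` is a hypothetical helper, not the PR's `readClusterIdentifierFromSecret`; the `CLUSTER_IDENTIFIER` key name comes from the error message quoted above.

```go
package main

import (
	"fmt"
	"strings"
)

// identifierKey is the key named in the error message quoted in the review.
const identifierKey = "CLUSTER_IDENTIFIER"

// identifierFromSecret returns the stored identifier. When the Secret exists
// but the key is missing or empty, it returns an empty identifier plus a
// warning message, so the caller can log it and re-register instead of
// refusing to start.
func identifierFromSecret(data map[string][]byte, exists bool) (identifier, warning string) {
	if !exists {
		return "", "" // first install: nothing stored yet, nothing to warn about
	}
	id := strings.TrimSpace(string(data[identifierKey]))
	if id == "" {
		return "", fmt.Sprintf("cluster identity Secret exists but %s is missing or empty; re-registering", identifierKey)
	}
	return id, ""
}

func main() {
	id, warn := identifierFromSecret(map[string][]byte{identifierKey: []byte("  ")}, true)
	fmt.Printf("identifier=%q warning=%q\n", id, warn)
}
```

The trade-off is that a cleared key silently leads to a new registration, so the warning should be prominent in the logs.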
| if dakrURL == "" { | ||
| dakrURL = "https://dakr.devzero.io" | ||
| } | ||
| tempClient := transport.NewDakrClient(dakrURL, "", c.Log) |
Hmm, why move away from factory into transport?
| // ReattachCluster registers or reattaches a cluster, returning (token, clusterIdentifier, error). | ||
| // Pass clusterIdentifier=nil on the first call; the backend assigns and returns a UUID. | ||
| // Pass clusterIdentifier=&uuid on subsequent calls to reattach the same cluster. | ||
| func (c *RealDakrClient) ReattachCluster(ctx context.Context, patToken string, clusterIdentifier *string, clusterName, k8sProvider string) (string, string, error) { |
Also one thing worth keeping in mind:
If ReattachCluster gets CodeNotFound, retries with nil, gets a new UUID, then crashes before persistClusterIdentifierToIdentitySecret completes — on the next restart the
operator reads the old UUID from the identity Secret, hits CodeNotFound again, and creates yet another new cluster. With enough restarts this accumulates orphan cluster
entries. This is an inherent at-least-once delivery trade-off, but the crash window is wide (several lines of code including a network call between clearing the identifier and
persisting the new one)
Not sure whether we can do something about it or not.
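To make the crash window concrete, here is a small simulation of the sequence described in this comment. Everything in it is hypothetical: `fakeBackend` is an in-memory model, `errNotFound` stands in for the CodeNotFound response, and `startOnce` only imitates the read, reattach, retry, persist order.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical stand-in for the backend's CodeNotFound response.
var errNotFound = errors.New("cluster not found")

// fakeBackend is an in-memory model of the register-or-reattach behaviour.
type fakeBackend struct {
	clusters map[string]bool
	nextID   int
}

// reattach creates a new cluster when id is nil, otherwise looks the id up.
func (b *fakeBackend) reattach(id *string) (string, error) {
	if id == nil {
		b.nextID++
		newID := fmt.Sprintf("uuid-%d", b.nextID)
		b.clusters[newID] = true
		return newID, nil
	}
	if !b.clusters[*id] {
		return "", errNotFound
	}
	return *id, nil
}

// startOnce models one operator start. persisted is the identifier read from
// the identity Secret; the return value is what the Secret holds afterwards.
func startOnce(b *fakeBackend, persisted string, crashBeforePersist bool) string {
	id, err := b.reattach(&persisted)
	if errors.Is(err, errNotFound) {
		id, _ = b.reattach(nil) // retry without identifier: backend assigns a new one
	}
	if crashBeforePersist {
		return persisted // the new identifier is lost, the Secret keeps the old value
	}
	return id
}

func main() {
	b := &fakeBackend{clusters: map[string]bool{}}
	persisted := "stale-uuid"
	for i := 0; i < 3; i++ {
		persisted = startOnce(b, persisted, true)
	}
	fmt.Println("clusters created while crashing:", len(b.clusters))
	persisted = startOnce(b, persisted, false)
	fmt.Println("stable after a clean start:", startOnce(b, persisted, false) == persisted)
}
```

Each crashed start leaves one more entry on the backend side, and only a start that reaches the persist step breaks the loop.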
| ) (string, *gen.ClusterSnapshot, error) | ||
| // telemetry_logger.TelemetryLogSender sends a batch of log entries to Dakr | ||
| telemetry_logger.TelemetryLogSender | ||
| // ExchangePATForClusterToken exchanges a PAT token for a cluster token |
Because ExchangePATForClusterToken was removed entirely, any remaining references to it in zxporter, or code around it, may now be dead code. Maybe worth checking and removing (only from zxporter, not from services).
| @@ -0,0 +1,4 @@ | |||
| # The cluster identity Secret is created and managed automatically by the operator at runtime. | |||
Hmm, an empty file with just a comment, to be rendered with k8s? Not sure that is a good idea.
Tzvonimir
left a comment
Some changes + lint issue + test failing
Code Review ✅ Approved 4 resolved / 4 findings
Implements cluster identity persistence via dedicated secrets to ensure stable cluster reconnections. Resolved redundant namespace logic and cluster identifier management inconsistencies.
✅ 4 resolved
✅ Bug: ReattachCluster is never called despite documentation claiming it
✅ Bug: CLUSTER_IDENTIFIER not guarded by useSecretForToken condition
✅ Quality: Duplicated header-setting logic in ReattachCluster
✅ Quality: Namespace resolution logic duplicated 6+ times without helper
By default, every time you install or reinstall ZXPorter, it calls CreateClusterToken on the DevZero platform, which creates a brand new cluster entry. This means that if you uninstall and reinstall ZXPorter on the same physical cluster, the DevZero dashboard shows it as a different cluster.
Cluster Identity solves this. By giving your cluster a stable identifier, ZXPorter calls ReattachCluster instead, which finds or creates a cluster by that identifier. Reinstalling ZXPorter always connects back to the same cluster entry in the DevZero platform.
For more insight, see: clusterIdentifier
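The difference between the two calls can be sketched with a toy in-memory model. The `platform` type and its methods are hypothetical and simplified: they only mirror the "new entry every time" versus "find or create by identifier" behaviour described above, not the real DevZero API.

```go
package main

import "fmt"

// platform is a hypothetical in-memory model of cluster entries on the platform side.
type platform struct {
	entries    int
	byIdentity map[string]int
}

// createClusterToken models the default behaviour: every call adds a new entry.
func (p *platform) createClusterToken() int {
	p.entries++
	return p.entries
}

// reattachCluster models find-or-create: a known identifier maps back to its
// existing entry, an unknown one gets a new entry.
func (p *platform) reattachCluster(identifier string) int {
	if entry, ok := p.byIdentity[identifier]; ok {
		return entry
	}
	p.entries++
	p.byIdentity[identifier] = p.entries
	return p.entries
}

func main() {
	p := &platform{byIdentity: map[string]int{}}

	// Install and reinstall without a stable identifier: two different entries.
	fmt.Println(p.createClusterToken() == p.createClusterToken())

	// Install and reinstall with the same identifier: the same entry both times.
	fmt.Println(p.reattachCluster("my-cluster") == p.reattachCluster("my-cluster"))
}
```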
Summary by Gitar
- Added `resolveNamespace()` helper to centralize and DRY up namespace detection across controller methods.
- Removed `identifier` parameter from `persistClusterToken`, `persistClusterTokenToConfigMap`, and `persistClusterTokenToSecret`, as cluster identifiers are now exclusively managed via dedicated identity secrets.

This will update automatically on new commits.