Skip to content

Conversation

@Arvindthiru
Copy link
Contributor

Description of your changes

Fixes #

I have:

  • Run make reviewable to ensure this PR is ready for review.

How has this code been tested

Special notes for your reviewer

@Arvindthiru Arvindthiru changed the title Feat: Handle large annotation limit feat: Handle large annotation limit Nov 22, 2022

const (
workFieldManagerName = "work-api-agent"
fleetSystemNamespace = "fleet-system"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's time for us to create a constants package. We avoided it during the reserved namespace work as we used a prefix. But my current PR is using a string literal for an ObjectKey.

https://github.com/Azure/fleet/pull/346/files#diff-df64e37f2222b761c62bd8bf32c0347aae0fe1f46ff6acf0e66c2b61bd30d761R398

const (
workFieldManagerName = "work-api-agent"
fleetSystemNamespace = "fleet-system"
configMapNamePrefix = "configmap-"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need pkg level constants / vars if they are only used once within the source? I believe I've received that feedback before.

return metadataAccessor.SetAnnotations(obj, annotations)
}

// setModifiedConfigurationAnnotation serializes the object into byte stream and returns it.
Copy link
Contributor

@Ealianis Ealianis Nov 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// setModifiedConfigurationAnnotation serializes the object into byte stream and returns it.
// computeModifiedConfiguration serializes the object into a byte stream and returns it.

generator := names.SimpleNameGenerator
var configMapName string
var configMap v1.ConfigMap
for {
Copy link
Contributor

@michaelawyu michaelawyu Nov 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Arvind! Yeah, I agree with Ealianis' comment; this is not a very good pattern, the for-loop here essentially bypasses the exponential back-off mechanism that controller-runtime provides, and in adverse situations the controller might get trapped in an infinite loop.

Plus, the logic here may not work as you would expect; controller-runtime clients use cached reads; if the cache is not fresh enough, name collisions can still occur.

const (
workFieldManagerName = "work-api-agent"
fleetSystemNamespace = "fleet-system"
configMapNamePrefix = "configmap-"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A nit here: this prefix might be a little generic.

if err == nil {
klog.V(2).InfoS("successfully created the manifest", "gvr", gvr, "manifest", manifestRef)
// create configmap
if err := r.createConfigMap(ctx, configMapName, owner, manifestHash, lastModifiedConfig); err != nil {
Copy link
Contributor

@michaelawyu michaelawyu Nov 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Arvind! The createConfigMap function is repeatedly called throughout the code, though in actuality it only needs to happen once; a little bit refactoring could help make the workflow and the code more clear.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or more precisely, this place might not be the best one to do this create call.

Copy link
Contributor

@michaelawyu michaelawyu Nov 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we have discussed over the meeting, the name collision check currently in the PR might not work in adverse conditions (e.g. out-of-sync caches), and if the check fails to guard against name collisions, two manifests may share the same configmap and the behavior is undetermined.

Plus, w/o doubly links (or a guaranteed 1:1 name mapping), partial failures may also lead to undetermined behaviors.

// add applied work owner reference. let's check if it works
addOwnerRef(owner, configMap)
if err := r.spokeClient.Create(ctx, configMap); err != nil {
klog.ErrorS(err, "config map create failed", "configMap")
Copy link
Contributor

@Ealianis Ealianis Nov 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks as though multiple error logs will be recorded contain the same message.
See line 716. https://github.com/Azure/fleet/pull/343/files#diff-9c51763423a3a9a026f47bc21555619edb62a1cf676c669b643e57b53c39c7caR716

// We can calculate the manifestHash & lastAppliedConfig again wiping out previous state or make sure the annotations in the last applied config doesn't cause issue in patch.
func (r *ApplyWorkReconciler) migrateToConfigMap(ctx context.Context, gvr schema.GroupVersionResource, obj *unstructured.Unstructured, owner metav1.OwnerReference) error {
// we can also use obj.getAnnotations().
annots, err := metadataAccessor.Annotations(obj)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Arvind! We might need to double-confirm this; for unstructured objects, this function seems to return a copy of the annotations map, but in this piece of code all the updates are in-place.

See also:
https://github.com/kubernetes/apimachinery/blob/4fbe8e4db39b17a53c0756caf58466baad871871/pkg/apis/meta/v1/unstructured/unstructured.go#L407
https://github.com/kubernetes/apimachinery/blob/4fbe8e4db39b17a53c0756caf58466baad871871/pkg/apis/meta/v1/unstructured/helpers.go#L166

if ok1 || ok2 {
delete(annots, manifestHashAnnotation)
delete(annots, lastAppliedConfigAnnotation)
configMapName, err := r.getRandomConfigMapName(ctx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(See the comment above about refactoring the createConfigMap call; a little bit refactoring can help here)

if !ok {
return nil, errors.New("manifest doesn't have a config map")
}
if err := r.spokeClient.Get(ctx, types.NamespacedName{Name: configMapName, Namespace: fleetSystemNamespace}, &configMap); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Arvind! I understand the purpose of this code snippet here, but it could really use some refactoring.

  • The createConfigMap call happens for the third time
  • It really is not a very good idea to use recursion here.

setConfigMapNameAnnotation(curObjConfigMap.GetName(), manifestObj)

// Use manifest generation to check if configmap is outdated
if curObj.GetGeneration() != curObjConfigMap.GetGeneration() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Arvind! A few issues here:

  • first of all, IIRC Kubernetes API server does not track generations for configMaps; the field will always be 0
  • This checks compares the generations of two very different objects, which will only work if the two values are in-sync (there are no guarantees)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, was checking the logs and realized the generation for config map is always zero and the manifest's generation is also zero after patch

// We only try to update the object if its spec hash value has changed.
if manifestObj.GetAnnotations()[manifestHashAnnotation] != curObj.GetAnnotations()[manifestHashAnnotation] {
return r.patchCurrentResource(ctx, gvr, manifestObj, curObj)
if manifestHash != curObjConfigMap.Data[manifestHashKey] {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Arvind! This branch will never run if I understand it correctly; the code above just has the configMap updates so the hashes will always match, right?

Copy link
Contributor Author

@Arvindthiru Arvindthiru Nov 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally the configmap would get updated only if it's outdated meaning a we patched the object but for some reason we were not able to update the configmap

}

// calculate manifest hash & last modified config again to set the new manifest hash & last applied config after patch gets applied
if err := r.updateConfigMap(ctx, curObjConfigMap, manifestObj); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Arvind! Just to double-check, is it really necessary to update the configmap for a second time here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This need to be updated with the object from the hub cluster

// threeWayMergePatch creates a patch by computing a three-way diff based on
// an object's current state, modified state, and last-applied-state recorded in its annotation.
func threeWayMergePatch(currentObj, manifestObj client.Object) (client.Patch, error) {
func threeWayMergePatch(currentObj, manifestObj client.Object, currentObjConfigMap *v1.ConfigMap) (client.Patch, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Arvind! It might be helpful to take another look at the implementation of the CreateThreeWayMergePatch function; the semantics do not seem to correlate with the args passed in at this moment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed in the meeting, the args match

@Arvindthiru Arvindthiru closed this Feb 2, 2023
weng271190436 pushed a commit to weng271190436/fleet that referenced this pull request Dec 1, 2025
…xt exits (Azure#343)

* Minor changes

Signed-off-by: michaelawyu <chenyu1@microsoft.com>

* Minor fixes

Signed-off-by: michaelawyu <chenyu1@microsoft.com>

* Minor fixes

Signed-off-by: michaelawyu <chenyu1@microsoft.com>

---------

Signed-off-by: michaelawyu <chenyu1@microsoft.com>
@weng271190436 weng271190436 mentioned this pull request Dec 1, 2025
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants