Skip to content

Refactor watch functions to pass resourceVersion #1287

@dsimansk

Description

@dsimansk

Let me add a few thoughts after digging into watch API a bit.

  1. The CPU is pegged
  2. time.After creates a new timer on each invocation. The timers aren't garbage collected until they expire, resulting in a crippling memory leak.

That's a neat fix. Thanks for pointing that out. The go docs are pretty clear there.

1. The code retries even though the results channel is never recreated, meaning the wait will never succeed

Indeed you're correct here as well. With the fix for watcher race condition we actually lost the ability to recreate the watcher.
Even though the wait loop checks for !ok channel and it's the only place producing retry=true, it's ineffective without remove initialization code.

https://kubernetes.io/docs/reference/using-api/api-concepts/#efficient-detection-of-changes

When retrieving a collection of resources (either namespace or cluster scoped), the response from the server will contain a resourceVersion value that can be used to initiate a watch against the server. The server will return all changes (creates, deletes, and updates) that occur after the supplied resourceVersion. This allows a client to fetch the current state and then watch for changes without missing any updates.

As pointed out by @markusthoemmes, the resourceVersion can be used to avoid races like that, see the k8s docs above.

Therefore I think the right fix for the race condition is to leverage resourceVersion in our watch request. I've done a bit of testing and it seems to behave as expected, even with synthetic timeout postponing the watcher creation the events are lost. That'd make room to bring back the correct retry behaviour when channel is closed. As an good exampple and inspiration client-go's retrywatcher is done the same way.

Originally posted by @dsimansk in #1263 (comment)

Metadata

Metadata

Assignees

Labels

kind/bugCategorizes issue or PR as related to a bug.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions