Skip to content

Conversation

@zhiying-lin
Copy link
Contributor

@zhiying-lin zhiying-lin commented Jun 19, 2023

Description of your changes

Differentiate the error types

  • user error
  • api server error (from cache or api server)
  • unexpected behavior error
  • expected behavior error

I have:

  • Run make reviewable to ensure this PR is ready for review.

How has this code been tested

Need to add more UT in the future

Special notes for your reviewer

@zhiying-lin zhiying-lin force-pushed the crp-res branch 4 times, most recently from 8f61e55 to 61381b8 Compare June 26, 2023 03:14
ErrUnexpectedBehavior = errors.New("unexpected behavior which cannot be handled by the controller")

// ErrExpectedBehavior indicates the current situation is expected, which can be recovered by itself after retries.
ErrExpectedBehavior = errors.New("expected behavior which can be recovered by itself")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not 100% sure if I understand this part, and a little bit clarification would be greatly appreciated.

What would be if there's a resource version conflict (API error)? Is this an API error or an expected error? Similarly, what would be if an error is expected with sporadic occurrences and unexpected if repeated consistently?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think one way to interpret this is we can alert on too many ExpectedBehaviorError which will be treated the same as "UnExpectedError" while APIServerError will clearly say that there is something wrong with the api-server. We will look more into cluster status.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now there are api server errors, user errors and the rest are expected/unexpected logic errors. They should have no overlap.

agree with Ryan, the alert could be set based on the occurrences or status code returned by the API server.

}
if !r.InformerManager.IsInformerSynced(gvr) {
return nil, fmt.Errorf("informer cache for %+v is not synced yet", restMapping.Resource)
return nil, controller.NewExpectedBehaviorError(fmt.Errorf("informer cache for %+v is not synced yet", restMapping.Resource))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an example of my confusion specified in an earlier comment. It could be, at this point, that the cache is simply not working fast enough and will get sync'd soon; or it could also be that there is a connection/RBAC issue that will never recover without intervention and needs attention.

@zhiying-lin zhiying-lin merged commit 6954b1d into Azure:main Jun 27, 2023
@zhiying-lin zhiying-lin deleted the crp-res branch June 27, 2023 05:34
@britaniar britaniar mentioned this pull request Jan 12, 2026
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants