Refactor fetching of wathola receiver's delivery report using special batch Job by cardil · Pull Request #4460 · knative/eventing

cardil · 2020-11-04T17:27:43Z

This change addresses the problem of how to get the report out of the cluster. Clusters may have different networking setups, and it might not be possible to make an HTTP request directly from outside the cluster.

The previous approach tried to guess an external address of the cluster. That reliably fails on OpenShift deployed on AWS.

This approach deploys a special Job that, running inside the cluster, can download the report and print it in its logs. The test client can then fetch the logs of the completed Job, parse them, replay the logs, and process the report further.

Fixes #3175
Closes #4430

Proposed Changes

Use K8s job to fetch Wathola report, via job pod's logs
Removal of guessing of node external address
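
To make the first item more concrete, here is a minimal sketch of what such a fetcher Job object could look like when built in Go and serialized to JSON. It is not the code from this PR: the struct types are hand-written stand-ins for the corresponding `batch/v1` fields, and the Job name, namespace, and image are hypothetical. The only detail taken from the PR itself is `spec.template.spec.restartPolicy`, set here to `Never`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hand-written stand-ins for the few batch/v1 Job fields this sketch needs.
// Real code would normally use the k8s.io/api types instead.
type Container struct {
	Name  string `json:"name"`
	Image string `json:"image"`
}

type PodSpec struct {
	RestartPolicy string      `json:"restartPolicy"`
	Containers    []Container `json:"containers"`
}

type PodTemplate struct {
	Spec PodSpec `json:"spec"`
}

type JobSpec struct {
	Template PodTemplate `json:"template"`
}

type Metadata struct {
	Name      string `json:"name"`
	Namespace string `json:"namespace"`
}

type Job struct {
	APIVersion string   `json:"apiVersion"`
	Kind       string   `json:"kind"`
	Metadata   Metadata `json:"metadata"`
	Spec       JobSpec  `json:"spec"`
}

// newFetcherJob builds a Job whose single container is expected to download
// the report from inside the cluster and print it to its logs.
// The name "wathola-fetcher" is an assumption made for this example.
func newFetcherJob(namespace, image string) Job {
	return Job{
		APIVersion: "batch/v1",
		Kind:       "Job",
		Metadata:   Metadata{Name: "wathola-fetcher", Namespace: namespace},
		Spec: JobSpec{
			Template: PodTemplate{
				Spec: PodSpec{
					// A finished fetcher pod is not restarted in place.
					RestartPolicy: "Never",
					Containers: []Container{
						{Name: "fetcher", Image: image},
					},
				},
			},
		},
	}
}

func main() {
	// Namespace and image are placeholders, not values from the PR.
	job := newFetcherJob("example-namespace", "example.org/wathola-fetcher:latest")
	out, err := json.MarshalIndent(job, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Because the Job runs inside the cluster, it can reach the receiver over cluster-internal networking, and the test client only needs API access to read the pod logs.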

knative-prow-robot · 2020-11-04T17:27:50Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

codecov · 2020-11-04T17:34:01Z

Codecov Report

Merging #4460 into master will increase coverage by 0.07%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #4460      +/-   ##
==========================================
+ Coverage   81.19%   81.27%   +0.07%     
==========================================
  Files         281      282       +1     
  Lines        7981     8004      +23     
==========================================
+ Hits         6480     6505      +25     
  Misses       1112     1112              
+ Partials      389      387       -2

| Impacted Files | Coverage Δ | |
|---|---|---|
| pkg/kncloudevents/message_sender.go | `78.00% <0.00%> (-1.67%)` | ⬇️ |
| pkg/channel/message_dispatcher.go | `77.31% <0.00%> (-0.24%)` | ⬇️ |
| ...econciler/inmemorychannel/dispatcher/controller.go | `78.26% <0.00%> (ø)` | |
| pkg/kncloudevents/http_client.go | `100.00% <0.00%> (ø)` | |
| ...iler/inmemorychannel/dispatcher/inmemorychannel.go | `89.39% <0.00%> (+0.86%)` | ⬆️ |
| pkg/mtbroker/filter/filter_handler.go | `79.51% <0.00%> (+0.99%)` | ⬆️ |

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b1706b6...8294d9f.

devguyio

/assign
Thanks for doing this, @cardil! This is an initial review with small comments on the README language. I hope they make sense. Reviewing the rest now.

zhongduo · 2020-11-05T14:36:10Z

/assign

Thanks for doing this. This is close to my question in the previous PR about using a "curl" pod to make the report request, like the one in the Knative getting started doc. I'm not sure whether that can do the same as the fetcher. Another possible way is to simply have the receiver pod send out the report once it gets the last event. Do you think that is feasible?

cardil · 2020-11-05T16:44:56Z

Re @zhongduo:

I don't think that's possible.

First of all, there is no guarantee that the finished event will reach the receiver. I have seen it get lost when using an interval of 2ms.

Secondly, how would the receiver send the message? It's unlikely that the test runner, outside of the k8s cluster, is network reachable. The receiver might create a ConfigMap with the response, but then it would need to have a kubeconfig injected. At that point we could notify it from outside by creating a k8s event. That's what was proposed as the solution in the issue this addresses.

The approach in this PR is simple and doesn't require additional kubeconfig injection.

zhongduo · 2020-11-05T16:57:24Z

Thanks for the response, that makes sense. I was thinking more about merging the fetcher and the receiver, so that the receiver prints the report directly to the logger the same way the fetcher does now. If we couldn't find the log entry, that would mean something is wrong, and it could be treated as an error.
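
The "missing log entry is an error" idea can be illustrated with a small Go sketch that scans captured pod logs for a report line. This is only an illustration under assumptions: the `REPORT: ` marker, the JSON layout, and the `Report` fields are all hypothetical and are not the format used by wathola.

```go
package main

import (
	"bufio"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// Report is a hypothetical report shape used only for this example.
type Report struct {
	State  string `json:"state"`
	Events int    `json:"events"`
}

// reportPrefix is an assumed marker that precedes the JSON payload in a log line.
const reportPrefix = "REPORT: "

// ErrNoReport signals that the logs contained no report entry at all.
var ErrNoReport = errors.New("no report entry found in logs")

// findReport returns the first report found in the logs.
// A missing entry is treated as an error rather than as an empty report.
func findReport(logs string) (*Report, error) {
	scanner := bufio.NewScanner(strings.NewReader(logs))
	for scanner.Scan() {
		line := scanner.Text()
		idx := strings.Index(line, reportPrefix)
		if idx < 0 {
			continue // an ordinary log line, not a report
		}
		payload := line[idx+len(reportPrefix):]
		var r Report
		if err := json.Unmarshal([]byte(payload), &r); err != nil {
			return nil, fmt.Errorf("malformed report entry: %w", err)
		}
		return &r, nil
	}
	// The default Scanner buffer rejects very long lines; that surfaces here.
	if err := scanner.Err(); err != nil {
		return nil, err
	}
	return nil, ErrNoReport
}

func main() {
	logs := "starting\nREPORT: {\"state\":\"success\",\"events\":120}\ndone\n"
	r, err := findReport(logs)
	if err != nil {
		panic(err)
	}
	fmt.Printf("state=%s events=%d\n", r.State, r.Events)
}
```

The same parsing works whether the line is printed by a separate fetcher Job or by the receiver itself; only the pod whose logs are read would differ.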

Co-authored-by: Ahmed Abdalla Abdelrehim <aabdelre@redhat.com>

devguyio · 2020-11-06T14:35:27Z

/lgtm

AlexandraRoatis · 2020-11-06T15:07:55Z

Thank you for the thorough explanation and documentation of the changes!

/lgtm

pierDipi

/approve

knative-prow-robot · 2020-11-06T15:12:07Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cardil, devguyio, pierDipi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~test/OWNERS~~ [pierDipi]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@devguyio

… batch Job (knative#4460)

* Reimplementing fetching of wathola report with K8s job

  This change targets the problem of how to get report from cluster. Clusters may have different networking setup, and it might not be possible to directly make HTTP request from outside of cluster. Previous approach used to guess an external address of cluster. That for sure fails on OpenShift deployed on AWS. This approach deploys a special Job that, being inside cluster, can download a report and print it in its logs. Then test client can fetch logs of completed job, and parse it, replay the logs, and process report further.

* Removal of unneeded external node address package
* Fixing lints & boilerplate
* spec.template.spec.restartPolicy=never
* Apply @devguyio suggestions for test/upgrade/README.md

  Co-authored-by: Ahmed Abdalla Abdelrehim <aabdelre@redhat.com>

* Changes after review

  Co-authored-by: Ahmed Abdalla Abdelrehim <aabdelre@redhat.com>

@devguyio

* Eventing upgrade tests prober fully configurable (knative#4421)

  * Eventing upgrade tests prober fully configurable
  * Embedding configuration structs

* Reduce a test name length to prevent DNS label too long error (knative#4442)

  Having too long a namespace or kservice name can lead to an error like:

  ```
  $ host wathola-receiver-test-continuous-events-propagation-with-prober-zxmkp.apps.example.org
  host: 'wathola-receiver-test-continuous-events-propagation-with-prober-zxmkp.apps.example.org' is not a legal IDN name (domain label longer than 63 characters), use +noidnin
  ```

  In this case my namespace is test-continuous-events-propagation-with-prober-zxmkp and the knative service name is wathola-receiver. The namespace is taken from the Go test method name. The limit is 63 characters. In this example the subdomain is 69 characters.

  This does affect OpenShift Serverless, as kservices there have a URL format of `${ksvc.name}-${ksvc.namespace}` to enable usage of TLS wildcard certificates.

  Reducing this test method name length will help fit within this strict limit of 63 chars.

* Use deployment to avoid disparity in effective user (knative#4445)

  On OpenShift we've observed a disparity when using pods vs deployments. Using both of those can lead to having a different effective user for bare pods and for pods managed by a deployment. That leads to differences in reading a config file by wathola components, as `~` points to different places for sender and receiver+forwarder.

  This changes the code to avoid using bare pods for wathola components.

* Refactor fetching of wathola receiver's delivery report using special batch Job (knative#4460)

  * Reimplementing fetching of wathola report with K8s job

    This change targets the problem of how to get report from cluster. Clusters may have different networking setup, and it might not be possible to directly make HTTP request from outside of cluster. Previous approach used to guess an external address of cluster. That for sure fails on OpenShift deployed on AWS. This approach deploys a special Job that, being inside cluster, can download a report and print it in its logs. Then test client can fetch logs of completed job, and parse it, replay the logs, and process report further.

  * Removal of unneeded external node address package
  * Fixing lints & boilerplate
  * spec.template.spec.restartPolicy=never
  * Apply @devguyio suggestions for test/upgrade/README.md

    Co-authored-by: Ahmed Abdalla Abdelrehim <aabdelre@redhat.com>

  * Changes after review

    Co-authored-by: Ahmed Abdalla Abdelrehim <aabdelre@redhat.com>

  Co-authored-by: Ahmed Abdalla Abdelrehim <aabdelre@redhat.com>
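
The 63-character DNS label limit mentioned in the commit message above (knative#4442) can be checked ahead of time. The following Go sketch assumes the `${ksvc.name}-${ksvc.namespace}` host label format described there; the helper names `routeLabel` and `checkLabel` are made up for this example.

```go
package main

import "fmt"

// maxDNSLabelLen is the maximum length of a single DNS label.
const maxDNSLabelLen = 63

// routeLabel joins the service name and namespace the way the commit message
// describes for OpenShift Serverless: ${ksvc.name}-${ksvc.namespace}.
func routeLabel(ksvcName, namespace string) string {
	return ksvcName + "-" + namespace
}

// checkLabel reports an error when the label would exceed the DNS limit.
func checkLabel(label string) error {
	if len(label) > maxDNSLabelLen {
		return fmt.Errorf("DNS label %q is %d characters, limit is %d",
			label, len(label), maxDNSLabelLen)
	}
	return nil
}

func main() {
	// Values taken from the example in the commit message.
	label := routeLabel("wathola-receiver",
		"test-continuous-events-propagation-with-prober-zxmkp")
	fmt.Println(len(label)) // 69, matching the length quoted in the commit message
	if err := checkLabel(label); err != nil {
		fmt.Println(err)
	}
}
```

Since the namespace is derived from the Go test method name, shortening that name is what brings the combined label back under the limit.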

knative-prow-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 4, 2020

google-cla Bot added the cla: yes Indicates the PR's author has signed the CLA. label Nov 4, 2020

knative-prow-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. area/test-and-release Test infrastructure, tests or release labels Nov 4, 2020

knative-prow-robot requested review from steuhs and yt3liu November 4, 2020 17:27

Removal of unneeded external node address package

4c11b23

cardil force-pushed the feature/wathola-fetch-report-by-job branch 2 times, most recently from 12dbfe2 to 5e31ffb Compare November 4, 2020 17:56

Fixing lints & boilerplate

a4124eb

cardil force-pushed the feature/wathola-fetch-report-by-job branch from 5e31ffb to a4124eb Compare November 4, 2020 18:06

cardil changed the title ~~Reimplementing fetching of wathola report with K8s job~~ Refactor fetching of wathola report with K8s job Nov 4, 2020

spec.template.spec.restartPolicy=never

5169ef0

cardil marked this pull request as ready for review November 4, 2020 21:49

knative-prow-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 4, 2020

cardil changed the title ~~Refactor fetching of wathola report with K8s job~~ Refactor fetching of wathola receiver's delivery report using special batch Job Nov 4, 2020

cardil mentioned this pull request Nov 5, 2020

[release-v0.17.2] Revert "Deploying receiver as ksvc if serving is available" openshift/knative-eventing#948

Merged

devguyio reviewed Nov 5, 2020

Comment thread test/upgrade/README.md Outdated

Comment thread test/upgrade/README.md Outdated

Comment thread test/upgrade/README.md Outdated

knative-prow-robot assigned devguyio Nov 5, 2020

cardil mentioned this pull request Nov 5, 2020

[release-v0.17.2] Backport of upstream PR 4460 openshift/knative-eventing#949

Merged

knative-prow-robot assigned zhongduo Nov 5, 2020

zhongduo reviewed Nov 5, 2020

Comment thread test/upgrade/prober/wathola/fetcher/operations.go

Comment thread test/upgrade/prober/wathola/fetcher/operations.go

Comment thread test/upgrade/prober/verify.go Outdated

devguyio reviewed Nov 5, 2020

Comment thread test/upgrade/prober/verify.go

Comment thread test/upgrade/prober/verify.go Outdated

Comment thread test/upgrade/prober/verify.go Outdated

pierDipi reviewed Nov 5, 2020

Comment thread test/upgrade/README.md

cardil and others added 2 commits November 5, 2020 18:50

Apply @devguyio suggestions for test/upgrade/README.md

3a82c28

Co-authored-by: Ahmed Abdalla Abdelrehim <aabdelre@redhat.com>

Changes after review

8294d9f

cardil requested a review from devguyio November 6, 2020 12:44

knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Nov 6, 2020

devguyio approved these changes Nov 6, 2020

knative-prow-robot assigned AlexandraRoatis Nov 6, 2020

pierDipi approved these changes Nov 6, 2020

knative-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 6, 2020

knative-prow-robot merged commit daf0d0f into knative:master Nov 6, 2020

cardil deleted the feature/wathola-fetch-report-by-job branch November 6, 2020 16:19

cardil mentioned this pull request Nov 18, 2020

[release 0.18.4] Backports of upgrade tests bits openshift/knative-eventing#982

Merged

zhongduo mentioned this pull request Jan 5, 2021

Add zhongduo (Jimmy Lin) as an approver #4691

Merged

devguyio mentioned this pull request Jan 6, 2021

Add devguyio to approvers #4700

Closed

cardil mentioned this pull request Jan 20, 2022

[WIP] Fetch receiver report directly by Prober #6055

Closed

5 tasks
