add Cancel API to cancel a running job #141
Merged
```go
package river_test

import (
	"context"
	"errors"
	"log/slog"
	"time"

	"github.com/jackc/pgx/v5/pgxpool"

	"github.com/riverqueue/river"
	"github.com/riverqueue/river/internal/riverinternaltest"
	"github.com/riverqueue/river/internal/util/slogutil"
	"github.com/riverqueue/river/riverdriver/riverpgxv5"
)

type SleepingArgs struct{}

func (args SleepingArgs) Kind() string { return "SleepingWorker" }

type SleepingWorker struct {
	river.WorkerDefaults[CancellingArgs]
	jobChan chan int64
}

func (w *SleepingWorker) Work(ctx context.Context, job *river.Job[CancellingArgs]) error {
	w.jobChan <- job.ID
	select {
	case <-ctx.Done():
	case <-time.After(5 * time.Second):
		return errors.New("sleeping worker timed out")
	}
	return ctx.Err()
}

// Example_cancelJobFromClient demonstrates how to permanently cancel a job from
// any Client using Cancel.
func Example_cancelJobFromClient() {
	ctx := context.Background()

	dbPool, err := pgxpool.NewWithConfig(ctx, riverinternaltest.DatabaseConfig("river_testdb_example"))
	if err != nil {
		panic(err)
	}
	defer dbPool.Close()

	// Required for the purpose of this test, but not necessary in real usage.
	if err := riverinternaltest.TruncateRiverTables(ctx, dbPool); err != nil {
		panic(err)
	}

	jobChan := make(chan int64)

	workers := river.NewWorkers()
	river.AddWorker(workers, &SleepingWorker{jobChan: jobChan})

	riverClient, err := river.NewClient(riverpgxv5.New(dbPool), &river.Config{
		Logger: slog.New(&slogutil.SlogMessageOnlyHandler{Level: slog.LevelWarn}),
		Queues: map[string]river.QueueConfig{
			river.QueueDefault: {MaxWorkers: 10},
		},
		Workers: workers,
	})
	if err != nil {
		panic(err)
	}

	// Not strictly needed, but used to help this test wait until job is worked.
	subscribeChan, subscribeCancel := riverClient.Subscribe(river.EventKindJobCancelled)
	defer subscribeCancel()

	if err := riverClient.Start(ctx); err != nil {
		panic(err)
	}

	job, err := riverClient.Insert(ctx, CancellingArgs{ShouldCancel: true}, nil)
	if err != nil {
		panic(err)
	}

	select {
	case <-jobChan:
	case <-time.After(2 * time.Second):
		panic("no jobChan signal received")
	}

	// There is presently no way to wait for the client to be 100% ready, so we
	// sleep for a bit to give it time to start up. This is only needed in this
	// example because we need the notifier to be ready for it to receive the
	// cancellation signal.
	time.Sleep(500 * time.Millisecond)

	if _, err = riverClient.Cancel(ctx, job.ID); err != nil {
		panic(err)
	}

	waitForNJobs(subscribeChan, 1)

	if err := riverClient.Stop(ctx); err != nil {
		panic(err)
	}

	// Output:
	// jobExecutor: job cancelled remotely
}
```
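The example calls a `waitForNJobs` helper that is defined elsewhere in the test suite and not shown in this diff. Assuming it simply drains N events from the subscription channel with a timeout, a generic sketch might look like the following (this is an illustrative reconstruction, not the PR's actual helper):

```go
package main

import (
	"fmt"
	"time"
)

// waitForNJobs drains n events from subscribeChan, panicking if they don't
// all arrive within a deadline. Hypothetical sketch only; the real helper
// in River's test suite may differ.
func waitForNJobs[T any](subscribeChan <-chan T, n int) []T {
	var events []T
	deadline := time.After(5 * time.Second)
	for i := 0; i < n; i++ {
		select {
		case event := <-subscribeChan:
			events = append(events, event)
		case <-deadline:
			panic(fmt.Sprintf("timed out waiting for %d events (got %d)", n, len(events)))
		}
	}
	return events
}

func main() {
	// Demonstrate with a plain int channel standing in for River's event channel.
	ch := make(chan int, 2)
	ch <- 1
	ch <- 2
	events := waitForNJobs(ch, 2)
	fmt.Println(len(events)) // 2
}
```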
I got a flaky failure on these tests in this run. There wasn't anything particularly helpful in the failure logs, other than that the job kept running and didn't get cancelled. My guess, however, is that the notifier wasn't up and running before the cancel signal was sent.
I tried reproducing it with thousands of runs locally but couldn't get any failures. Ultimately I added this logic to wait for internal components to come up and be healthy prior to proceeding with the test.
It makes me wonder if there's a more systemic issue here; should we be waiting for some components to come up before returning from `Start()`? Or should our `startClient` test helper always be waiting for the client to become healthy before returning?

I'm re-running CI several times just to get more confidence that this fixed the issue; no more failures yet.
Yeah, good question. I kind of suspect that ... yes, a `Start` that returned only when things were really healthy and ready to go would overall be better. Shorter term, I like the idea of a test helper that waits for everything to be fully healthy, like you've done here, but we may want to make it even more widespread with an easy way to get at it from `riverinternaltest`.
We could easily update `startClient` to wait for the client to become healthy. That should reduce flakiness and shouldn't impact test timing much, as everything starts quickly. However, since no client health is exposed externally yet, it's purely an internal solution for testing.

There's the goal of making sure the client is healthy and fully booted before proceeding; this may be desirable sometimes, but at other times users may wish to let the client finish booting asynchronously while their app does other initialization work. Then there's the separate goal of monitoring the client's health as it operates: any of these internal components could in theory hit problems or connectivity issues during operation.

I'm sure there are other libraries that try to tackle these problems, but one particular instance I have recent experience with is LaunchDarkly's SDK. It's a bit poorly documented and feels overly abstracted / Java-like, but the concepts there are reasonably well solved, and that's what I was intending to do when I initially built the client monitor stuff internally. I don't claim that the design is perfect or finished, but at least the general problem is somewhat solved by it. And maybe some of it can/should be baked into `Start()` so users don't need to think about it as much.

We should probably take a look at this both for our own tests and for users' sake in follow-up PRs.