Skip to content

Conversation

@guewen
Copy link
Member

@guewen guewen commented Jul 4, 2019

Issue described in #129.

Introduction

This PR adds a way to express dependencies between jobs, so called jobs graphs.

The delaying API has been rewritten in a way that allows to express these dependencies. Under the hood, an acyclic direct graph is built and stored in the job records. Jobs which depend on another or several other jobs are in a state wait_dependencies, when a job is done, it finds the dependent jobs, which is set to pending if all the other dependent jobs are done as well.

API changes

Single jobs

The new API is entirely backward compatible. Delaying a single job is done as before with:

records.with_delay(**job_options).my_method(*args, **kwargs)

A new concept of delayable is introduced to compose graph, the above is equivalent to:

delayable = records.delayable(**job_options)
delayable.my_method(*args, **kwargs)
delayable.delay()

Chains

A very simple graph, execute a job when the first is done, can be written as:

initial_delayable = records.delayable(**job_options)
initial_delayable.my_method(*args, **kwargs)
initial_delayable.on_done(records.delayable().my_other_method())
initial_delayable.delay()

.delay() is called only on the root of the graph.

If several jobs are chained, it can be easier to build a chain which are executed in sequence:

from odoo.addons.queue_job.delay import chain

chain(
    records.delayable().my_method(),
    records.delayable().my_second_method(),
   records.delayable().my_third_method(),
).delay()

Groups

Dependencies permit depending on a group of jobs, the third method is executed when the two jobs of the group are done:

from odoo.addons.queue_job.delay import group

group(
    records.delayable().my_method(),
   records.delayable().my_second_method(),
).on_done(records.delayable().my_third_method()).delay()

UI for graphs

Changes for the graphs and some various tweaks:

  • The jobs of a graph all share a "Graph UUID". We can search and group jobs per Graph UUID.
  • From the form view of a job, we can quickly open all the other jobs of the same graph
  • A new widget shows a visualization of the graph in the form view
  • The UUID, graph UUID, enqueued time and func_string are now shown only in debug mode

Improved testing

Unit tests

A new testing facility is added, which helps a lot to run assertions on enqueued jobs. It is strongly encouraged to write unit tests using this assert. It works on any jobs, this is not specially related to graphs.

Inside the mock_jobs context manager, jobs are never stored in database. However, they are kept in memory in the context manager instance, which allow running checks on them.

    # tests
    from odoo.addons.queue_job.tests.common import mock_jobs


    def test_method_to_test(self):
        with mock_jobs() as jobs_tester:
            result = self.env["model"].method_to_test()
            expected_count = 12

            jobs_tester.assert_jobs_count(1, only=self.env["model"].my_job_method)
            jobs_tester.assert_enqueued_job(
                self.env["model"].my_job_method,
                args=("Hi!",),
                kwargs=dict(count=expected_count),
                properties=dict(priority=15)
            )
            self.assertEqual(result, expected_count)

Output of failed assertion

Additionally, as the jobs are stored in the context manager instance, we can even choose to execute them in the test or not, depending on what we are testing.

             jobs_tester.perform_enqueued_jobs()

Test jobs

The controller /queue_job/create_test_job can now generate a random graph of jobs, using the parameters:

  • priority: priority of the created jobs
  • max_retries: max_retries of the created jobs
  • channel: channel of the created jobs
  • description: you guess what
  • size: number of jobs delayed, if 1, it will not be a graph
  • failure_rate: between 0 and 1, if > 0 jobs will randomly fail according to this rate

Warnings

As of today, there is no correct handling of identity keys for graphs. We still need to figure out the most correct way to handle an identity key for a graph. However, the identity key still work the same way for single jobs.

Closes #129

@guewen guewen force-pushed the 12.0-dependencies branch 2 times, most recently from 93e3246 to c1ded7a Compare July 11, 2019 20:10
@guewen
Copy link
Member Author

guewen commented Oct 7, 2019

I pushed new commits. I still have lots of docs to write and not everything is ready yet, but I would welcome feedback by people interested in the feature :) You can look at the doc of the 'base' model methods and at the tests to have an idea of the dependency API.

def load_many(cls, env, job_uuids):
"""Read jobs in batch from the Database
Jobs not found are ignored.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo: log a warning

@guewen
Copy link
Member Author

guewen commented Oct 7, 2019

Something I'm really not sure about is how to deal with identity keys. Currently, I'm checking that if a graph contains some jobs with an identity key, if all the jobs in the graph with an identity key have another job in the db with the same key, then the whole graph is discarded. It doesn't seem very intuitive. What would make most sense to me is to have a single identity key for the whole graph. I'll think about how we could implement this.

@guewen
Copy link
Member Author

guewen commented Mar 13, 2020

Remaining

  • use a single uuid for identifying a graph
  • handling of identity keys (not really defined yet)
  • handle deprecation of a few things
  • implement graph traversal to execute job synchronously when TEST_QUEUE_JOB_NO_DELAY is used
  • take care of setting the proper state on requeued children jobs when their parents are not done
  • docs
  • extend test coverage (I don't remember what is/isn't tested)
  • migration to 13.0
  • wish: performance analysis, if time allows...
  • no "Result" will be handled yet, it can be added later

@guewen guewen force-pushed the 12.0-dependencies branch 13 times, most recently from 3bc3f63 to 277d247 Compare May 30, 2021 18:49
@guewen guewen force-pushed the 12.0-dependencies branch 2 times, most recently from 7cfa275 to 393259c Compare June 30, 2021 15:11
@etobella
Copy link
Member

@guewen Do you need some help with this?

@guewen guewen force-pushed the 12.0-dependencies branch from 5908baa to ffda53d Compare July 21, 2021 09:41
@guewen
Copy link
Member Author

guewen commented Jul 21, 2021

@etobella The only thing yet to do I think is about handling the identity key for a whole graph. Well, there is an implementation now which is that if all jobs have an identity key and all these identity keys already exist, we skip the graph. I don't think this is how it should be done, rather have a single identity key for the whole graph.

Anyway, I think this could be handled later, the "how" could become more clear at usage.

This PR is in a safe enough state to be tested if you'd like :) I should open my PR soon, I'd like to improve the description first 😆

@simahawk
Copy link
Contributor

Beautiful work, as usual :)
Question: any chance to add this feature to a separated module?
It's quite big as change and it will be easier to maintain (and to maintain the core) if it stays separated.

@guewen
Copy link
Member Author

guewen commented Jul 21, 2021

Question: any chance to add this feature to a separated module?

This is not something "on top" of the existing API, but an improvement of the current API to support graphs. Enqueuing a single job is now (aside a few details) enqueueing a graph of 1 job, the mechanism is generic for supporting both (with_delay goes through the graph machinery - syntactic sugar when you want only 1 job -, but the performance is the same as today when you have 1 job). In other words: this is the core now ;)

@guewen
Copy link
Member Author

guewen commented May 3, 2022

@simahawk version for 14.0 is there: #403
I didn't work on any other version yet.

guewen added a commit to guewen/queue that referenced this pull request May 3, 2022
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
guewen added a commit to guewen/queue that referenced this pull request May 31, 2022
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
guewen added a commit to guewen/queue that referenced this pull request May 31, 2022
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
ar-tameson pushed a commit to ar-tameson/queue that referenced this pull request Sep 7, 2022
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
guewen added a commit to guewen/queue that referenced this pull request Oct 26, 2022
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
guewen added a commit to guewen/queue that referenced this pull request Nov 1, 2022
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
lmignon pushed a commit to acsone/queue that referenced this pull request Nov 16, 2022
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
lmignon pushed a commit to acsone/queue that referenced this pull request Nov 16, 2022
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
lmignon pushed a commit to acsone/queue that referenced this pull request Nov 16, 2022
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
lmignon pushed a commit to acsone/queue that referenced this pull request Nov 16, 2022
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
bizzappdev pushed a commit to BizzAppDev-Systems/queue that referenced this pull request Nov 23, 2023
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
bizzappdev pushed a commit to BizzAppDev-Systems/queue that referenced this pull request Nov 23, 2023
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
nguyenminhchien pushed a commit to nguyenminhchien/queue that referenced this pull request Nov 25, 2023
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
yibudak pushed a commit to yibudak/queue that referenced this pull request Nov 27, 2023
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
yibudak pushed a commit to yibudak/queue that referenced this pull request Nov 27, 2023
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
nguyenminhchien pushed a commit to nguyenminhchien/queue that referenced this pull request Nov 29, 2023
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
nguyenminhchien pushed a commit to nguyenminhchien/queue that referenced this pull request Dec 4, 2023
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
QuocDuong1306 pushed a commit to QuocDuong1306/queue that referenced this pull request Sep 19, 2024
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
QuocDuong1306 pushed a commit to QuocDuong1306/queue that referenced this pull request Sep 24, 2024
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
QuocDuong1306 pushed a commit to QuocDuong1306/queue that referenced this pull request Sep 24, 2024
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
QuocDuong1306 pushed a commit to QuocDuong1306/queue that referenced this pull request Sep 24, 2024
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
thienvh332 pushed a commit to thienvh332/queue that referenced this pull request Oct 7, 2024
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
thienvh332 pushed a commit to thienvh332/queue that referenced this pull request Oct 7, 2024
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
thienvh332 pushed a commit to thienvh332/queue that referenced this pull request Oct 7, 2024
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
thienvh332 pushed a commit to thienvh332/queue that referenced this pull request Oct 8, 2024
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
thienvh332 pushed a commit to thienvh332/queue that referenced this pull request Oct 9, 2024
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
thienvh332 pushed a commit to thienvh332/queue that referenced this pull request Oct 11, 2024
on_done() is maybe a better choice, it is less ambiguous than then() (clearer
on the fact it happens on done()) and done() (less confusion with set_done()).
I'll do the renaming soon, unless you have other suggestions.

Ref: OCA#154 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature] Job dependencies

6 participants