Some details may be inaccurate as I didn't verify all my assumptions.
Implementation Draft
First, let's state what we would expect from dependencies.
Design assumptions
- We can delay a job A with a dependency on another job Z: A will start only when Z is in state 'done'
- We can delay jobs A and B with a dependency on another job Z: both A and B will start only when Z is in state 'done'
- We can delay a job A with a dependency on other jobs Z and Y: A will start only when Z and Y are in state 'done'
- We can delay jobs A and B with a dependency on other jobs Z and Y: both A and B will start only when Z and Y are in state 'done'
- When job Z finishes and A depends on it, A will get the result of Z
- When jobs Z and Y finish and A depends on them, A will get a list with the results of Z and Y
- When jobs Z and Y finish and A and B depend on them, both A and B will get a list with the results of Z and Y
The first two assumptions are probably the easiest to implement, and we could start with them, but we have to think through the datamodel and API to be sure we can handle the third and fourth as well.
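For a taste of the target usage, the third assumption could be written with the API drafted later in this issue; everything here is hypothetical, `recordset_z`, `job_z` and friends are placeholders:

```python
from odoo.addons.queue_job.job import group

# hypothetical usage of the Dependencies API drafted below
z = recordset_z.delayable().job_z()
y = recordset_y.delayable().job_y()
a = recordset_a.delayable().job_a()
# A starts only when both Z and Y are in state 'done'
group(z, y).on_done(a).delay()
```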
Datamodel and workflow
- Add a new field on queue_job: `dependencies`. This is a JSON field with a dict expressing the dependencies, with 2 keys: `depends_on` and `reverse_depends_on`. We don't want FKs: jobs are not supposed to be deleted, and if they are, we just have to ignore the irrelevant references.
- Add a new field for the result that is passed to the dependencies, JSON serialized (we can pass recordsets, ...). As we currently store the result as a message for the end-user, I'm not sure how we should handle it.
- We add a new state `wait_dependency`; these jobs will be ignored by the jobrunner out of the box.
- When a job finishes, the jobrunner looks whether the job has a non-empty `reverse_depends_on` list. If it finds dependent jobs, then for each job in `reverse_depends_on`, it has to read the `depends_on` list, check if all the jobs in this list are done, and in that case change the state of the dependent job to 'pending'.
Using the denormalization with a `reverse_depends_on` field means that we never have to run a `SELECT ... WHERE depends_on = ?` to find the dependent jobs. We have a cost penalty (besides the cost to read `reverse_depends_on`) only when we actually have dependencies.
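To make this concrete, here is a minimal sketch of the wake-up step, assuming the field layout and state described above; the `_on_job_done` name and the exact JSON shape are illustrative, not final:

```python
# Hypothetical layout of the new `dependencies` JSON field:
# {"depends_on": ["uuid-of-Z", "uuid-of-Y"],
#  "reverse_depends_on": ["uuid-of-A", "uuid-of-B"]}

def _on_job_done(env, job):
    """Wake up the jobs waiting on `job` (sketch, name is illustrative)."""
    deps = job.dependencies or {}
    for dependent_uuid in deps.get("reverse_depends_on", []):
        dependent = env["queue.job"].search(
            [("uuid", "=", dependent_uuid)], limit=1)
        if not dependent or dependent.state != "wait_dependency":
            # deleted or already woken up: ignore the irrelevant reference
            continue
        parent_uuids = (dependent.dependencies or {}).get("depends_on", [])
        parents = env["queue.job"].search([("uuid", "in", parent_uuids)])
        if all(parent.state == "done" for parent in parents):
            # all dependencies are done, hand the job to the jobrunner
            dependent.state = "pending"
```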
At first sight it does not look too difficult; the difficulty here lies in the transactions. If a job A depends on Z, when Z is done, it will check the state of its dependencies; Z being the only dependency and being done, we are good. Now if A depends on Z and Y, when Z finishes, it has to check whether Y is done as well. Y may or may not be done, but Z, being isolated in its own transaction, will not know the real state of Y. Changing the isolation level would not help, as Y can be 'done' and then rolled back. I thought about using PostgreSQL two-phase commit - not sure we can integrate it properly in Odoo - or opening a new transaction after Z is committed (and take the risk that the second transaction fails, then handle retries, ...).
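A minimal sketch of that last idea, assuming we can register a callback that runs once Z's transaction has committed; the hook, the function name and the missing retry strategy are all illustrative:

```python
from odoo import SUPERUSER_ID, api, registry

def _check_dependencies_after_commit(dbname, job_uuid):
    # runs in its own, fresh transaction *after* Z's commit, so it sees
    # the committed states of Y and the other dependencies
    with registry(dbname).cursor() as cr:
        env = api.Environment(cr, SUPERUSER_ID, {})
        job = env["queue.job"].search([("uuid", "=", job_uuid)], limit=1)
        if job:
            _on_job_done(env, job)  # the sketch from the datamodel section
    # if this transaction fails, we need a retry strategy (not drafted here)
```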
API
On the API side, I thought about different options too; I remember I had a preference and some ideas to actually implement them, but can't remember ;)
There are many more thoughts to put here, also considering that dependent jobs may receive a result from the jobs they depend on.
Dependencies API
There is no support for `then` (executed after done or failure), only `done` (executed after done).
The "failure" callback is handled synchronously in Celery because you may want to use the traceback from it. This is not drafted here either. Both of these use cases can be added later; we only have to keep in mind the design to support them (keep this info in the `dependencies` JSON field).
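If the design has to keep room for them, the trigger condition could be stored per dependency edge in the `dependencies` field; a purely illustrative shape, not part of the draft below:

```python
# illustrative only: each edge carries the condition releasing the
# dependent job ("done", "failed", or "then" for done-or-failed)
dependencies = {
    "depends_on": [
        {"uuid": "uuid-of-Z", "on": "done"},
        {"uuid": "uuid-of-Y", "on": "then"},
    ],
    "reverse_depends_on": ["uuid-of-A"],
}
```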
Comments in the code explain the additions.
```python
from odoo import _, api, models
from odoo.addons.queue_job.job import job, chain, group


class ProductProduct(models.Model):
    _inherit = 'product.product'

    @job
    def generate_thumbnail(self, size):
        return self._gen(size)

    @api.multi
    def button_generate_usual(self):
        # This is the current implementation, which will not change
        self.ensure_one()
        self.with_delay(
            priority=30,
            description=_("generate xxx")
        ).generate_thumbnail((50, 50))

    @api.multi
    def button_generate_simple_with_delayable(self):
        self.ensure_one()
        # Introduction of a delayable object, using a builder pattern
        # allowing to chain jobs or set properties. The delay() method
        # on the delayable object actually stores the delayable objects
        # in the queue_job table
        (
            self.delayable()
            .generate_thumbnail((50, 50))
            .set(priority=30)
            .set(description=_("generate xxx"))
            .delay()
        )

    @api.multi
    def button_generate_chain(self):
        product1 = self.browse(1)
        product2 = self.browse(2)
        product3 = self.browse(3)
        self.ensure_one()
        # the delayable object can be chained
        job1 = (
            product1.delayable().generate_thumbnail((50, 50))
            .set(description=_("generate xxx"))
        )
        job2 = (
            product2.delayable().generate_thumbnail((50, 50))
            .set(description=_("generate xxx"))
        )
        job3 = (
            product3.delayable().generate_thumbnail((50, 50))
            .set(description=_("generate xxx"))
        )
        # the builder pattern allows setting properties at once
        for job_ in (job1, job2, job3):
            # job_.set(xxx) is better than job_.priority = 10
            # or than job_.priority(10), because it allows to pass
            # a splat dictionary set(**properties) or custom properties
            job_.set(priority=30)
        # Or self.delay_chain() as syntactic sugar, calling chain
        # under the hood.
        # .delay() sets up the links in the chain and stores the jobs
        chain(
            job1,
            # job2 is executed after job1
            job2,
            # job3 is executed after job2
            job3
        ).delay()

    @api.multi
    def button_generate_chain_done(self):
        product1 = self.browse(1)
        product2 = self.browse(2)
        product3 = self.browse(3)
        self.ensure_one()
        job1 = product1.delayable().generate_thumbnail((50, 50))
        job2 = product2.delayable().generate_thumbnail((50, 50))
        job3 = product3.delayable().generate_thumbnail((50, 50))
        # similar to the chain
        job1.on_done(job2.on_done(job3)).delay()

    @api.multi
    def button_generate_done_multi(self):
        product1 = self.browse(1)
        product2 = self.browse(2)
        product3 = self.browse(3)
        self.ensure_one()
        job1 = product1.delayable().generate_thumbnail((50, 50))
        job2 = product2.delayable().generate_thumbnail((50, 50))
        job3 = product3.delayable().generate_thumbnail((50, 50))
        # job2 and job3 are executed only when job1 is done
        job1.on_done(job2, job3).delay()

class Album(models.Model):
    _name = 'album'

    @job
    def generate_page(self, results, color, images_per_page=10):
        # `results` holds the results of the jobs this one depends on
        # (see the design assumptions and the Results API below)
        return self._gen(results, images_per_page)

    @api.multi
    def button_generate(self):
        self.ensure_one()
        product1 = self.env['product.product'].browse(1)
        product2 = self.env['product.product'].browse(2)
        product3 = self.env['product.product'].browse(3)
        job1 = (
            product1.delayable().generate_thumbnail((50, 50))
            .set(priority=30)
            .set(description=_("generate xxx"))
        )
        job2 = (
            product2.delayable().generate_thumbnail((50, 50))
            .set(priority=30)
            .set(description=_("generate xxx"))
        )
        job3 = (
            product3.delayable().generate_thumbnail((50, 50))
            .set(priority=30)
            .set(description=_("generate xxx"))
        )
        # could have self.delay_group() in base.model as syntactic sugar
        group(job1, job2, job3).on_done(
            # executed only when job1, job2 and job3 are done
            self.delayable().generate_page('#000000', images_per_page=12)
            .set(priority=12)
            .set(description=_("generate page x"))
            # delay makes the links and stores in db
        ).delay()
```

Results API
Still to define.
```python
# jobs definitions
from odoo.addons.queue_job import Result
from odoo.addons.queue_job.job import job


@job
def render_thumbnail(self):
    # could use a class to differentiate the current user message and
    # the result for the dependent job(s)
    thumbnail = self.env['thumbnail'].do_stuff()
    return Result(message='thumbnail generation is a success',
                  results=thumbnail)
```
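As a starting point for that definition, the `Result` wrapper could be as small as the following; a pure assumption, the point being the split between the end-user message and the payload for the dependent jobs:

```python
class Result:
    """Hypothetical wrapper, not an existing queue_job class.

    `message` keeps the current behaviour (a message displayed to the
    end-user on the job record), while `results` is the JSON-serializable
    payload handed to the dependent job(s).
    """

    def __init__(self, message=None, results=None):
        self.message = message
        self.results = results
```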