[FIX] [16.0] queue_job: Add requeue default config parameter for started_delta + improve README #642
11f845e to
3991b80
Compare
@simahawk @gurneyalex you are welcome! 😄
yajo
left a comment
Just a few typos, but it all looks ok.
3991b80 to
3c8922b
Compare
sbidoul
left a comment
Since you can have multiple Odoo instances running jobs and crons on different machines, there is actually no guarantee that the pid stored on the job is a pid of the machine trying to kill it. So it may be ineffective, or worse, killing an unrelated pid.
So I'm afraid we can't do this.
Yikes, true. However, it's important to understand that we currently have a different problem. Imagine a job that takes 10 minutes to execute, maybe because it's slow, or maybe because it's buggy (e.g. a request without a timeout). After 5 minutes, the cron runs and reschedules it. Then it is picked up by another worker. Since the first worker hasn't finished yet, the job will run twice. That's a race condition. We could check that the PID is currently running and belongs to an Odoo process before terminating it. Also, we could add this option as a parameter to the cron function directly (False by default). Benefits:
I'd like a better solution, such as being able to check whether the job is actually running or not. But then the jobrunner would have to start 2 threads, one of which would be a keepalive one, or something like that. Way more complex... But do you have any other ideas?
There was this idea of taking a lock on the job record (#423), so we can know for sure that some worker somewhere is still processing the job. I think it is feasible, but it is tricky to get right. Also, with the current implementation, if you configure the cron so that the delay for re-queuing is greater than the CPU time limit of the Odoo job workers, then you can be sure that the job will have been killed before being requeued.
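The timing rule in the comment above can be sketched as a small sanity check. This is only an illustration, not code from the PR: the function name is hypothetical, `started_delta` is assumed to be expressed in minutes with `0` meaning "never requeue started jobs", and the worker limit is assumed to be given in seconds (as Odoo's `limit_time_cpu` and `limit_time_real` options are).

```python
from datetime import timedelta


def requeue_delay_is_safe(started_delta_minutes: int, worker_time_limit_seconds: int) -> bool:
    """Return True if a started job should already be dead when the cron requeues it.

    Hypothetical helper: both arguments are assumptions about how an
    instance is configured, not values read from Odoo.
    """
    if started_delta_minutes <= 0:
        # Assumption: 0 disables requeuing of started jobs, so no double run can occur.
        return True
    requeue_delay = timedelta(minutes=started_delta_minutes)
    worker_limit = timedelta(seconds=worker_time_limit_seconds)
    # The delay must be strictly greater than the limit after which the worker is killed.
    return requeue_delay > worker_limit


if __name__ == "__main__":
    print(requeue_delay_is_safe(5, 60))    # 5 minutes against a 60 second limit
    print(requeue_delay_is_safe(1, 120))   # 1 minute against a 120 second limit
```

Note that a CPU time limit does not bound wall-clock time for a job that is blocked on I/O (the "request without a timeout" case mentioned earlier), so comparing against the real-time limit instead may be the more conservative choice.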
With the psutil library, we could hash some process information (at least pid and create_time) and store it on the job to ensure we are killing the right process. Then we can check that the process is still running and is not a zombie. The started_delta=0 parameter passed to the function must be tweaked in every instance to ensure this won't trigger until Odoo has killed its own process (as set in the environment's parameters).
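A rough sketch of the psutil idea described above, for illustration only. The helper names are hypothetical and nothing like this was merged; the check is only meaningful when it runs on the same machine that started the job, which is exactly the limitation raised earlier in the thread. psutil is a third-party package, so the import is optional here.

```python
import hashlib
import os

try:
    import psutil  # third-party, optional dependency
except ImportError:
    psutil = None


def process_fingerprint(pid: int, create_time: float) -> str:
    """Hash the pid together with the process creation time (hypothetical helper)."""
    raw = f"{pid}:{create_time!r}".encode()
    return hashlib.sha256(raw).hexdigest()


def is_same_live_process(pid: int, expected_fingerprint: str) -> bool:
    """Return True only if `pid` is alive, not a zombie, and matches the stored fingerprint."""
    if psutil is None:
        raise RuntimeError("psutil is required for this check")
    try:
        proc = psutil.Process(pid)
        if not proc.is_running() or proc.status() == psutil.STATUS_ZOMBIE:
            return False
        # A recycled pid would have a different creation time, hence a different hash.
        return process_fingerprint(pid, proc.create_time()) == expected_fingerprint
    except (psutil.NoSuchProcess, psutil.AccessDenied):
        return False


if __name__ == "__main__" and psutil is not None:
    me = os.getpid()
    stored = process_fingerprint(me, psutil.Process(me).create_time())
    print(is_same_live_process(me, stored))
```

Storing the creation time alongside the pid guards against pid reuse on one host, but it does not help when jobs and crons run on different machines.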
However, if we keep this in mind:
Knowing that Odoo will kill the worker after exceeding its allowed time, do we really need to kill it ourselves? Can't we just assume it's being killed and focus on properly rescheduling it?
Yes, I think it's not needed :/ I'm going to update the README properly to include the job reset configuration.
3c8922b to
f0470e3
Compare
This PR has the |
f0470e3 to
c01eed5
Compare
49bee5b to
24f0dcd
Compare
24f0dcd to
cd78484
Compare
All ready
This PR has the |
rafaelbn
left a comment
Thank you @Shide !
@guewen can you merge this with your blessing? 🙌🏻
simahawk
left a comment
LGTM
/ocabot merge patch Thanks!
This PR looks fantastic, let's merge it!
Congratulations, your PR was merged at dbfd111. Thanks a lot for contributing to OCA. ❤️
Added requeue documentation to the README file.
MT-5357 @moduon @yajo @rafaelbn @sbidoul @guewen please review if you want 😄