Skip to content

Conversation

@etobella
Copy link
Member

@etobella etobella commented Apr 9, 2021

Backport of #305

@simahawk @guewen

@etobella etobella force-pushed the 12.0-bprt branch 2 times, most recently from e8323cb to fd5b8ac Compare April 12, 2021 07:53
@etobella
Copy link
Member Author

Also added #312 #310 and #309

@etobella etobella changed the title 12.0 Backport of Optimization of queue.job creation 12.0 Backport if 13.0 multiple improvements Apr 12, 2021
@etobella etobella changed the title 12.0 Backport if 13.0 multiple improvements [12.0] Backport of 13.0 multiple improvements Apr 12, 2021
Copy link
Contributor

@simahawk simahawk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just code review, LG

Copy link
Member

@guewen guewen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot!

@guewen
Copy link
Member

guewen commented Apr 14, 2021

Hi @sbidoul, @gurneyalex,

Runbot fails because of this on the PRs for 12.0:

Unable to find image 'oca-queue:fd5b8ac_env_1_job_2' locally
docker: Error response from daemon: pull access denied for oca-queue, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.
See 'docker run --help'.

Do you happen to know why? 😌

@guewen
Copy link
Member

guewen commented Apr 14, 2021

Oh maybe I'm looking at the wrong logs and the problem would be phantom-js.

@sbidoul
Copy link
Member

sbidoul commented Apr 14, 2021

Could it be the new(ish) docker hub rate limits that kick in ?

@StefanRijnhart
Copy link
Member

I believe you can squash out the commit from #312 because that was a port forward of #311

queue_job/job.py Outdated
from .exception import (NoSuchJobError,
FailedJobError,
RetryableJobError)
from .exception import FailedJobError, NoSuchJobError, RetryableJobError
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about undoing the changes from python-black in this PR? It would make it a lot easier to review.


return ""

@http.route("/queue_job/create_test_job", type="http", auth="user")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is strange. In 13.0, this code is added by #265 by @nilshamerlinck (and as such also present in #330), but in your git history it is attributed to @guewen as part of 7385fd5.

I'm getting the feeling that maybe these backports should be split up in separate PRs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I did something wrong with the commits. I will recheck thanks 😉

Guewen Baconnier and others added 16 commits April 19, 2021 11:03
Context
-------

The queue_job module adds a new type of field, with the type
"job_serialized", which is close to the base "serialized" field,
adding support of serialization for recordsets and a couple of common
types such as datetime and date.

This field type is used in several fields of the "queue.job" model.

In "ir.model.fields", each field is related to a "ttype", which is
a selection containing every types of fields supported by the current
database.

The "job_serialized" field was never registered as a possible value for
the ttype selection field. It seems it had never been an issue before
Odoo 13.0. With Odoo 13.0, we may have this error on upgrades:

    INFO sample odoo.modules.loading: 297 modules loaded in 529.51s, 0 queries
    INFO sample odoo.addons.base.models.ir_model: Deleting 4@ir.model.fields.selection (base.selection__ir_model_fields__ttype__job_serialized)
    ERROR sample odoo.sql_db: bad query: UPDATE "ir_model_fields" SET "ttype"=NULL WHERE "ttype"='job_serialized'
    ERROR: null value in column "ttype" violates not-null constraint
    DETAIL:  Failing row contains (4296, record_ids, null, queue.job, null, null, null, 266, Record, null, null, t, null, null, f, t, f, f, null, base, null, null, t, null, null, null, null, null, t, null, null, 1, 2020-08-11 13:03:09.913127, null, null).

    2020-08-11 13:10:53,168 167 WARNING sample odoo.modules.loading: Transient module states were reset
    2020-08-11 13:10:53,170 167 ERROR sample odoo.modules.registry: Failed to load registry
    Traceback (most recent call last):
      File "/odoo/src/odoo/modules/registry.py", line 86, in new
        odoo.modules.load_modules(registry._db, force_demo, status, update_module)
      File "/odoo/src/odoo/modules/loading.py", line 472, in load_modules
        env['ir.model.data']._process_end(processed_modules)
      File "/odoo/src/odoo/addons/base/models/ir_model.py", line 1990, in _process_end
        record.unlink()
      File "/odoo/src/odoo/addons/base/models/ir_model.py", line 1208, in unlink
        self.env.cr.execute(query, [selection.value])
      File "/odoo/src/odoo/sql_db.py", line 168, in wrapper
        return f(self, *args, **kwargs)
      File "/odoo/src/odoo/sql_db.py", line 245, in execute
        res = self._obj.execute(query, params)
    psycopg2.IntegrityError: null value in column "ttype" violates not-null constraint
    DETAIL:  Failing row contains (4296, record_ids, null, queue.job, null, null, null, 266, Record, null, null, t, null, null, f, t, f, f, null, base, null, null, t, null, null, null, null, null, t, null, null, 1, 2020-08-11 13:03:09.913127, null, null).

Explanation
-----------

In 13.0, a model "ir.model.fields.selection" storing the allowed
selection values for a field has been added
(odoo/odoo@7593b88).

When the "job_serialized" fields for queue_job are created, they are
properly created in "ir.model.fields" with "job_serialized" as "ttype".
It creates the entry in "ir.model.fields.selection".

When we update the module, if it does not find "job_serialized" as a
possible value for IrModelFields.ttype, it unlinks the entry in
"ir.model.fields.selection". When this happens, an override of the
"unlink()" method in "ir.model.fields.selection" forces a NULL value in
the selection fields (here IrModelFields.ttype).

Fixes: OCA#227
Add a field in the "queue_job" model to store the worker pid.
No usage for new besides "logging" for debugging (helps to find which
process to investigate in case a job is stuck).
Later, it may be useful to improve the recovery of job that stay in the
wrong state when the job runner restarts.

Closes OCA#177
* @job and @related_action are still supported but emit warnings and
  will be removed at some point (code to remove has a TODO with
  :job-no-decorator:)
* the options previously set by the decorators are now stored on
  "queue.job.function" records, which can be explicitly created
  as XML records (and then editable!)
* any method can be used with "with_delay()", if no "queue.job.function"
  exists, default values are used (root channel, default related
  action, ...)
As previously, the job functions and channels were created automatically
by "_register_job" on methods decorated by `@job`, databases have
records without xmlids.

To prevent duplicates (or unique constraint error) in case of channels,
the create method automatically merges the new record with the existing
record. This behavior happens only when installing/upgrading a module,
records created manually behave normally.

If 2 modules adds the same channel or job function, they'll be merged
together with 2 xmlids, on uninstallation of one of the module, the
record will be kept thanks to the second xmlid.
As we can delay a job on any method, and the queue.job model is
accessible from RPC (as any model), prevent to:

* create a queue.job using RPC
* write on protected fields (e.g. method name) using RPC

Admittedly, the risk is low since users need have Queue Job Manager
access to create/write on jobs, but it would allow these users to call
internal methods.

The check is done using a context key that must be equal to a sentinel
object, which is impossible to pass through RPC.
A many2one to select the model + a char for the method name is more
convenient and error-proof than having to type the expected format
correctly.
Users shouldn't need access to queue.job.function.
Read config as sudo().
This patch method has to be called in ``_register_hook``.

When a method is patched, any call to the method will not directly
execute the method's body, but will instead enqueue a job.

When a ``context_key`` is set when calling ``_patch_job_auto_delay``,
the patched method is automatically delayed only when this key is
``True`` in the caller's context. It is advised to patch the method
with a ``context_key``, because making the automatic delay *in any
case* can produce nasty and unexpected side effects (e.g. another
module calls the method and expects it to be computed before doing
something else, expecting a result, ...).

A typical use case is when a method in a module we don't control is called
synchronously in the middle of another method, and we'd like all the calls
to this method become asynchronous.

It relies on OCA#274 that deprecates the
`@job` decorator.
It makes the runbot builds fail. There is no urgency to remove the
decorator even if we should incentivize the removal because it is
no longer supported, hence the info log. Also, show only the XML snippet
when debug logs are active, because it is too verbose.
These fields contain the current model and record ids and user id the
job works on.
Storing them as separate fields does not allow to take benefits on the
"JobSerialized" field, which keeps the original user, and later the 'su'
flag. We could store the context there as well (or a cleaned one).

Use computed fields to keep the model_name, record_ids, user_id fields for
backward compatibility. The model_name should stay anyway for the
searches on the UI, and the user_id for the same reason + possibility to
edit the user.
Which means that the record the job is working on, or any record
argument of the job keeps the 'su' flag.
@guewen
Copy link
Member

guewen commented May 5, 2021

Here's the correction for the issue I mentioned before https://github.com/OCA/queue/pull/333/files#r626409959

@guewen
Copy link
Member

guewen commented May 5, 2021

Hi @StefanRijnhart, do you still have concerns or can we merge the backports?

@StefanRijnhart
Copy link
Member

For me this is too much to review at once, and to assess the impact of. But I do trust you fully with that, so don't mind me ;-)

<field name="date_created"/>
<field name="eta"/>
<field name="date_done"/>
<field name="exec_time" widget="float_time" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<field name="exec_time" widget="float_time" />
<field name="exec_time" />

@etobella can you apply this change please? (I did the same for 13.0 in #352)

The float_time widget shows hours seconds, we only want seconds.
The widget had been removed on the form view, but not on the tree view.
@guewen
Copy link
Member

guewen commented Jun 2, 2021

@etobella do you run this on production already?
FYI, as the company I work for is on 12.0 and is a large consumer of jobs, I should deploy this branch soon to put this to test before merging.

@etobella
Copy link
Member Author

etobella commented Jun 2, 2021

I did the backport in order to backport edi, but at the end we did not deploy the change. Sorry 😢

@guewen
Copy link
Member

guewen commented Jun 2, 2021

I did the backport in order to backport edi, but at the end we did not deploy the change. Sorry

No problem, I only asked to assess the risks ;)

@nilshamerlinck
Copy link
Contributor

Hi, fyi I've been running this branch in production for 2 weeks without issue.

@guewen
Copy link
Member

guewen commented Jun 29, 2021

Hi, fyi I've been running this branch in production for 2 weeks without issue.

Thanks for your feedback @nilshamerlinck!
Running well on the test (using production database) service here too.

Thanks @etobella for the backport :)

/ocabot merge major

@OCA-git-bot
Copy link
Contributor

On my way to merge this fine PR!
Prepared branch 12.0-ocabot-merge-pr-333-by-guewen-bump-major, awaiting test results.

@OCA-git-bot OCA-git-bot merged commit 54b37a5 into OCA:12.0 Jun 29, 2021
@OCA-git-bot
Copy link
Contributor

Congratulations, your PR was merged at 80335af. Thanks a lot for contributing to OCA. ❤️

@pedrobaeza
Copy link
Member

pedrobaeza commented Jul 5, 2021

Since the merge of this, I get continuous logs with this problem, and no job is executed:

Details
odoo_1    |   File "/home/odoo/.local/lib/python3.5/site-packages/werkzeug/serving.py", line 270, in run_wsgi
odoo_1    |     execute(self.server.app)               
odoo_1    |   File "/home/odoo/.local/lib/python3.5/site-packages/werkzeug/serving.py", line 258, in execute
odoo_1    |     application_iter = app(environ, start_response)
odoo_1    |   File "/opt/odoo/custom/src/odoo/odoo/service/wsgi_server.py", line 142, in application
odoo_1    |     return application_unproxied(environ, start_response)                                                                                                                                
odoo_1    |   File "/opt/odoo/custom/src/odoo/odoo/service/wsgi_server.py", line 117, in application_unproxied
odoo_1    |     result = odoo.http.root(environ, start_response)
odoo_1    |   File "/opt/odoo/custom/src/odoo/odoo/http.py", line 1320, in __call__
odoo_1    |     return self.dispatch(environ, start_response)
odoo_1    |   File "/opt/odoo/custom/src/odoo/odoo/http.py", line 1293, in __call__
odoo_1    |     return self.app(environ, start_wrapped)
odoo_1    |   File "/home/odoo/.local/lib/python3.5/site-packages/werkzeug/wsgi.py", line 766, in __call__
odoo_1    |     return self.app(environ, start_response)
odoo_1    |   File "/opt/odoo/custom/src/odoo/odoo/http.py", line 1488, in dispatch
odoo_1    |     result = ir_http._dispatch()
odoo_1    |   File "/opt/odoo/custom/src/odoo/addons/website_sale/models/ir_http.py", line 15, in _dispatch
odoo_1    |     return super(IrHttp, cls)._dispatch()
odoo_1    |   File "/opt/odoo/custom/src/odoo/addons/auth_signup/models/ir_http.py", line 19, in _dispatch
odoo_1    |     return super(Http, cls)._dispatch()
odoo_1    |   File "/opt/odoo/custom/src/odoo/addons/web_editor/models/ir_http.py", line 22, in _dispatch
odoo_1    |     return super(IrHttp, cls)._dispatch()
odoo_1    |   File "/opt/odoo/custom/src/odoo/addons/http_routing/models/ir_http.py", line 403, in _dispatch
odoo_1    |     result = super(IrHttp, cls)._dispatch()
odoo_1    |   File "/opt/odoo/custom/src/odoo/addons/utm/models/ir_http.py", line 26, in _dispatch
odoo_1    |     response = super(IrHttp, cls)._dispatch()
odoo_1    |   File "/opt/odoo/custom/src/odoo/odoo/addons/base/models/ir_http.py", line 212, in _dispatch
odoo_1    |     return cls._handle_exception(e)
odoo_1    |   File "/opt/odoo/custom/src/odoo/addons/website/models/ir_http.py", line 195, in _handle_exception
odoo_1    |     return super(Http, cls)._handle_exception(exception)
odoo_1    |   File "/opt/odoo/custom/src/odoo/addons/utm/models/ir_http.py", line 31, in _handle_exception
odoo_1    |     response = super(IrHttp, cls)._handle_exception(exc)
odoo_1    |   File "/opt/odoo/custom/src/odoo/odoo/addons/base/models/ir_http.py", line 182, in _handle_exception
odoo_1    |     return request._handle_exception(exception)
odoo_1    |   File "/opt/odoo/custom/src/odoo/odoo/http.py", line 776, in _handle_exception
odoo_1    |     return super(HttpRequest, self)._handle_exception(exception)
odoo_1    |   File "/opt/odoo/custom/src/odoo/odoo/http.py", line 314, in _handle_exception
odoo_1    |     raise pycompat.reraise(type(exception), exception, sys.exc_info()[2])
odoo_1    |   File "/opt/odoo/custom/src/odoo/odoo/tools/pycompat.py", line 87, in reraise
odoo_1    |     raise value
odoo_1    |   File "/opt/odoo/custom/src/odoo/odoo/addons/base/models/ir_http.py", line 208, in _dispatch
odoo_1    |     result = request.dispatch()
odoo_1    |   File "/opt/odoo/custom/src/odoo/odoo/http.py", line 835, in dispatch
odoo_1    |     r = self._call_function(**self.params)
odoo_1    |   File "/opt/odoo/custom/src/odoo/odoo/http.py", line 346, in _call_function
odoo_1    |     return checked_call(self.db, *args, **kwargs)
odoo_1    |   File "/opt/odoo/custom/src/odoo/odoo/service/model.py", line 98, in wrapper
odoo_1    |     return f(dbname, *args, **kwargs)
odoo_1    |   File "/opt/odoo/custom/src/odoo/odoo/http.py", line 339, in checked_call
odoo_1    |     result = self.endpoint(*a, **kw)
odoo_1    |   File "/opt/odoo/custom/src/odoo/odoo/http.py", line 941, in __call__
odoo_1    |     return self.method(*args, **kw)
odoo_1    |   File "/opt/odoo/custom/src/odoo/odoo/http.py", line 519, in response_wrap
odoo_1    |     response = f(*args, **kw)
odoo_1    |   File "/opt/odoo/custom/src/queue/queue_job/controllers/main.py", line 81, in runjob
odoo_1    |     job = Job.load(env, job_uuid)
odoo_1    |   File "/opt/odoo/custom/src/queue/queue_job/job.py", line 252, in load
odoo_1    |     return cls._load_from_db_record(stored)
odoo_1    |   File "/opt/odoo/custom/src/queue/queue_job/job.py", line 263, in _load_from_db_record
odoo_1    |     method = getattr(recordset, method_name)
odoo_1    | AttributeError: 'base' object has no attribute 'confirm_one_invoice' - - -

Can you point which can be the problem?

@guewen
Copy link
Member

guewen commented Jul 5, 2021

@pedrobaeza can you paste the func_string, method_name, and records, args, kwargs json fields of one of these jobs?

@pedrobaeza
Copy link
Member

Thanks for the quick reply. It's this function:

https://github.com/OCA/l10n-spain/blob/5c02c1d9caf61e02f57e2f2a4ec4c675283e92de/l10n_es_aeat_sii/models/account_invoice.py#L1450

and the job is this way:

Selección_030

This is before switching to new version.

Seeing 13.0 version, it seems to not be any difference:

https://github.com/OCA/l10n-spain/blob/23924f08218034c5f6b8f92ed2034e500070dd42/l10n_es_aeat_sii_oca/models/account_move.py#L1452

@StefanRijnhart
Copy link
Member

StefanRijnhart commented Jul 5, 2021

Was this job created before you updated the code of the module and ran the module upgrade?

It looks like https://github.com/OCA/queue/pull/333/files#diff-1c5ee7d63e3224474074dbbdaa46e8d7a838199d2b4d4182726de22fd6e627d1R15-R16 might have been better as an end-migration script, because it might depend on the models and the methods being loaded at the time of migration.

If you query the database for the columns @guewen mentions, we can check this hypothesis.

-- edit -- that method looks safe though https://github.com/OCA/queue/blob/12.0/queue_job/models/queue_job.py#L563-L573

@pedrobaeza
Copy link
Member

Yeah, it can be. For now, I have go back with the previous commit. As soon as I have a slot, I will propose a PR changing it to end-migration and try again to see if that's the problem.

@StefanRijnhart
Copy link
Member

@pedrobaeza not sure that that's it. I have an environment with a lot of jobs on which I can check the migration, so I will post back about that as well.
BTW you might have to do some reverse migration if you want to go back, at least on jobs created after running with the code from this PR.
And it would still be informative to have the SQL values of the func_string, method_name, and records, args, kwargs columns of one of the failing jobs.

@guewen
Copy link
Member

guewen commented Jul 7, 2021

According to the screenshot above, the func_string is correct, so I don't think this migration is the problem.
According to the stack trace, the most likely is that this migration

UPDATE queue_job
SET records = '{"_type": "odoo_recordset"'
|| ', "model": "' || model_name || '"'
|| ', "uid": ' || user_id
|| ', "ids": ' || record_ids
|| '}'
WHERE records IS NULL;
didn't run or for some reason generated a wrong json in the records field. The default value being the base model:
models.BaseModel: lambda env: json.dumps(
{"_type": "odoo_recordset", "model": "base", "ids": [], "uid": env.uid}
),

So:

  1. Check if the migration 12.0.2.1.0 runs
  2. Check and give feedback about the SQL content of the records field (func_string, method_name, args, kwargs may help but I really expect the issue to be in records).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants