-
Notifications
You must be signed in to change notification settings - Fork 16.4k
Enable already existing "task.scheduled_duration" metric #46009
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable already existing "task.scheduled_duration" metric #46009
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 8 out of 23 changed files in this pull request and generated no comments.
Files not reviewed (15)
- airflow/api_fastapi/core_api/openapi/v1-generated.yaml: Language not supported
- docs/apache-airflow/img/airflow_erd.sha256: Language not supported
- docs/apache-airflow/migrations-ref.rst: Language not supported
- airflow/api_fastapi/core_api/datamodels/task_instances.py: Evaluated as low risk
- airflow/api_connexion/schemas/task_instance_schema.py: Evaluated as low risk
- tests/api_connexion/endpoints/test_task_instance_endpoint.py: Evaluated as low risk
- airflow/models/dag.py: Evaluated as low risk
- tests/api_connexion/schemas/test_task_instance_schema.py: Evaluated as low risk
- airflow/models/dagrun.py: Evaluated as low risk
- airflow/models/taskinstancehistory.py: Evaluated as low risk
- tests/api_connexion/endpoints/test_mapped_task_instance_endpoint.py: Evaluated as low risk
- airflow/models/taskinstance.py: Evaluated as low risk
- tests/api_fastapi/core_api/routes/public/test_task_instances.py: Evaluated as low risk
- airflow/models/trigger.py: Evaluated as low risk
- airflow/triggers/base.py: Evaluated as low risk
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me this looks good and reasonable. But before merging... as I am not an expert in Scheduler details... would request another pair of eyes for review.
(Anyway the merge conflict needs to be resolved before merge...)
Does this mean the metric was never produced in Airflow 2.x? |
Yes, it was defined but never filled - as the scheduled time was not recorded. Only the queued time was written into the DB then the metric was able to be produced. Good overview is in #30612 |
#45285 reports it was rarely sent. Not that it wasn't sent at all. That's odd. |
eae3f28 to
79cf999
Compare
I don't know in which cases it "rarely" was sent - @AutomationDev85 do you know in which cases the metric was emitted in the past? @eladkal does anything speak against merging this for Airflow 3 to make it "proper"? (e.g. additional column that adds overhead?) |
|
@jscheffl That is a good question. I´m not sure how this metric should be rarely exported. In the current implementation the metric would be exported if the start_date is available but from my point of view this start_date will be set after the task went from scheduled into queued state. May be if a task reruns and the task_instance includes already a start_date and it is not moved into the task_instance_history then this metric could be exported but this value is wrong for this metric. But I do not know in which case we can end in such kind of state. :( |
79cf999 to
6dd047d
Compare
6dd047d to
1e06124
Compare
1e06124 to
b3071b8
Compare
|
Nobody is objection - I thought it is a bit more of a discussion - I propose to merge this as an improvement for observability with the trade-off of an additional column in the DB |
* Enable scheduled_dttm field in task_instance * Fixed unit tests * Fixed Unit test * Removed comment * Fixed unit tests --------- Co-authored-by: Marco Küttelwesch <marco.kuettelwesch@de.bosch.com>
* Remove Alembic migration autogenerated comment in #46009 * Add new erd hash
* Remove Alembic migration autogenerated comment in apache#46009 * Add new erd hash
* Remove Alembic migration autogenerated comment in apache#46009 * Add new erd hash
* Enable scheduled_dttm field in task_instance * Fixed unit tests * Fixed Unit test * Removed comment * Fixed unit tests --------- Co-authored-by: Marco Küttelwesch <marco.kuettelwesch@de.bosch.com>
* Remove Alembic migration autogenerated comment in apache#46009 * Add new erd hash
* Enable scheduled_dttm field in task_instance * Fixed unit tests * Fixed Unit test * Removed comment * Fixed unit tests --------- Co-authored-by: Marco Küttelwesch <marco.kuettelwesch@de.bosch.com>
* Remove Alembic migration autogenerated comment in apache#46009 * Add new erd hash
Overview
This PR enables the task.scheduled_duration metric. It is already described here:
https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/logging-monitoring/metrics.html#timers
Search for "task.scheduled_duration"
The metric is not working at the moment as the scheduled_dttm field is missing in the task_instance table.
I hope I catched all points in the code were the task_instance is set to scheduled state to collect the right time and that I understood the metric in the right way.
Idea of this PR to keep it as simple as possible to enable this metric.
Details of changes:
Looking forward to start the discussion about this metric and getting this live with Airflow 3.
Relates: #30612 #34493 #34771