-
Notifications
You must be signed in to change notification settings - Fork 16.4k
Save scheduler execution time by adding new Index idea for dag_run #30827
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Save scheduler execution time by adding new Index idea for dag_run #30827
Conversation
airflow/models/dagrun.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an interesting idea to add partial index here, Likely the improvment in your case is because the whole index takes too much memory of your server and gest swapped out. I do not know that much about thos partial indexes but it does seem like a legitimate optimisiation that might improve one of the most frequent queries scheduler runs,
I do not see an immediate side-effect of that one (and regarding the migration - I think you should follow the regular CONTRIBUTORS.rst alembic guide: https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#metadata-database-updates
but I think I'd love others to comment if they also see any problem with such an optimisation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your feedback! I moved the Index away from the dagrun.py because then the auto migration failes on the sqllite because not supports NULLS FIRST. Added the Index now only into the migration if not sqllite. Is this the way to go or did i missed something here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a bried look ad SQLAlchemy docs, would https://docs.sqlalchemy.org/en/20/core/constraints.html#sqlalchemy.schema.Constraint.dialect_options help in this area? (=dialect_options) to have this specific condition in postgres only?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could do it this way but then - for consistency - we should change it everywhere else (because we are doing the same approach in multiple other places). I prefer this kind of explicit if actually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This index needs to be also declared somewhere in the ORM, I think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Problem for me was after I added the Index to the table_args of dagrun alembic tried to add this index automatically during pre-commit and this was failing because sqlite does not handle the NULLS FIRST. So I only added this in the migration script. Is there a better way to handle this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we have pre-commit to generate alembic migrations automatically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a script to check their names and rename them appriately if they are added - but not to generate them. You have to generate it with alembic (and manually correct possibly) and add manually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I added the index again in the table_args . If i added this in the table_args and executing the pre-commit after "alembic revision -m "add new field to db"" step it was failing do to the null errors and thought alembic is automatically trying to convert DB.... Only with removing the index out of table_args and then working on the alembic script it was passing the pre-commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You will also have to change 2.6.0 to 2.7.0 references in the migratio name/versions and rebase your PR after we merge #30958 - this one will update the versions (2.6.0 branch is already created and we are releasing the 2.6.0 from that branch now)
|
Now, that 2.7.0dev change has been merged, you will need to rebase and update 2.6 to 2.7 everywhere. |
36ed372 to
0ba2cd1
Compare
AutomationDev85
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the hint! Updated the code.
AutomationDev85
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok in the CI we are getting now the same error I mentioned before regarding the sqlite and NULLS FIRST:
sqlite3.OperationalError: unsupported use of NULLS FIRST
Has anybody an idea how to handle this?
|
@AutomationDev85 - I've merged another migration, can you plese rebase and resolve the conflict? |
4cd74ca to
8dcd734
Compare
While @AutomationDev85 is on leave I tried to re-base on the latest main. Stumbled a bit around the migration tooling and needed multiple approaches. Hope I made it right now. |
…g_run (apache#30827)" This reverts commit c63b777.
|
Hello @AutomationDev85 -> unfortunately we had to revert that one, as it caused problems for MSSQL index. Can you please redo this one in a new PR ? We already fixed the condition that skipped MSSQL in cases like yours, but your PR has been based on earlier version of CI (it was 20 commits behind). Next time when you open it, it should also run for MSSQL and it should fail with an error there that you will have a chance to fix before we merge it again. That should not happen in the future, we caught that case in one of the recent PRs and fixed it, you just need to base your change on the latest |
Hi airflow community,
this is my third PR and be happy to work on the scheduler runtime again. We faced an issue with slow scheduler execution time by having millions of queued dag_runs for one DAG.
I found out that the query next_dagruns_to_examine needs many seconds to be executed with huge amount of dag_runs. I was able to drop the execution time to some ms by using this Index. This was really a big jump in the performance of the scheduler. Can someone support me how to get this into the migration scripts? What are you thinking about this Index?