@Fokko
Contributor

@Fokko Fokko commented Oct 27, 2019

Due to parallel work, we have split heads in the alembic migrations.

Before:

b0125267960b, b3b105409875 -> 0b2d9a271263 (head) (mergepoint), Merge 74effc47d867 and b3b105409875
004c1210f153, 74effc47d867 -> b0125267960b (mergepoint), merge 004c1210f153 and 74effc47d867
939bb1e647c8 -> 004c1210f153, increase queue name size limit
6e96a59344a4 -> 74effc47d867, change datetime to datetime2(6) on MSSQL tables
d38e04c12aa2 -> b3b105409875, add root_dag_id to DAG
Revision ID: b3b105409875
Revises: d38e04c12aa2
Create Date: 2019-09-28 23:20:01.744775
6e96a59344a4 -> d38e04c12aa2, add serialized_dag table
939bb1e647c8 -> 6e96a59344a4 (branchpoint), Make TaskInstance.pool not nullable
4ebbffe0a39a -> 939bb1e647c8 (branchpoint), task reschedule fk on cascade delete
dd4ecb8fbee3, cf5dc11e79ad, a56c9515abdc -> 4ebbffe0a39a (mergepoint), Merge heads
c8ffec048a3b -> dd4ecb8fbee3, Add schedule interval to dag
41f5f12752f8 -> cf5dc11e79ad, drop_user_and_chart
c8ffec048a3b -> a56c9515abdc, Remove dag_stat table
41f5f12752f8 -> c8ffec048a3b (branchpoint), add fields to dag
03bc53e68815 -> 41f5f12752f8 (branchpoint), add superuser field
0a2a5b66e19d, bf00311e1990 -> 03bc53e68815 (mergepoint), merge_heads_2
9635ae0956e7 -> 0a2a5b66e19d, add task_reschedule table
dd25f486b8ea -> bf00311e1990, add index to taskinstance
9635ae0956e7 -> dd25f486b8ea, add idx_log_dag
856955da8476 -> 9635ae0956e7 (branchpoint), index-faskfail
f23433877c24 -> 856955da8476, fix sqlite foreign key
05f30312d566 -> f23433877c24, fix mysql not null constraint
86770d1215c0, 0e2a74e0fc9f -> 05f30312d566 (mergepoint), merge heads
27c6a30d7c24 -> 86770d1215c0, add kubernetes scheduler uniqueness
33ae817a1ff4 -> 27c6a30d7c24, kubernetes_resource_checkpointing
d2ae31099d61 -> 33ae817a1ff4, kubernetes_resource_checkpointing
d2ae31099d61 -> 0e2a74e0fc9f, Add time zone awareness
947454bf1dff -> d2ae31099d61 (branchpoint), Increase text size for MySQL (not relevant for other DBs' text types)
bdaa763e6c56 -> 947454bf1dff, add ti job_id index
cc1e65623dc7 -> bdaa763e6c56, Make xcom value column a large binary
127d2bf2dfa7 -> cc1e65623dc7, add max tries column to task instance
5e7d17757c7a -> 127d2bf2dfa7, Add dag_id/state index on dag_run table
8504051e801b -> 5e7d17757c7a, add pid field to TaskInstance
4addfa1236f1 -> 8504051e801b, xcom dag task indices
f2ca10b85618 -> 4addfa1236f1, Add fractional seconds to mysql tables
64de9cddf6c9 -> f2ca10b85618, add dag_stats table
211e584da130 -> 64de9cddf6c9, add task fails journal table
2e82aab8ef20 -> 211e584da130, add TI state index
1968acfc09e3 -> 2e82aab8ef20, rename user table
bba5a7cfc896 -> 1968acfc09e3, add is_encrypted column to variable table
bbc73705a13e -> bba5a7cfc896, Add a column to track the encryption state of the 'Extra' field in connection
4446e08588 -> bbc73705a13e, Add notification_sent column to sla_miss
561833c1c74b -> 4446e08588, dagrun start end
40e67319e3a9 -> 561833c1c74b, add password column to user
2e541a1dcfed -> 40e67319e3a9, dagrun_config
1b38cef5b76e -> 2e541a1dcfed, task_duration
502898887f84 -> 1b38cef5b76e, add dagrun
52d714495f0 -> 502898887f84, Adding extra to Log
338e90f54d61 -> 52d714495f0, job_id indices
13eb55f81627 -> 338e90f54d61, More logging into task_instance
1507a7289a2f -> 13eb55f81627, maintain history for compatibility with earlier migrations
e3a246e0dc1 -> 1507a7289a2f, create is_encrypted
<base> -> e3a246e0dc1, current schema

After:

74effc47d867 -> 004c1210f153 (head), increase queue name size limit
b3b105409875 -> 74effc47d867, change datetime to datetime2(6) on MSSQL tables
d38e04c12aa2 -> b3b105409875, add root_dag_id to DAG
Revision ID: b3b105409875
Revises: d38e04c12aa2
Create Date: 2019-09-28 23:20:01.744775
6e96a59344a4 -> d38e04c12aa2, add serialized_dag table
939bb1e647c8 -> 6e96a59344a4, Make TaskInstance.pool not nullable
4ebbffe0a39a -> 939bb1e647c8, task reschedule fk on cascade delete
dd4ecb8fbee3, cf5dc11e79ad, a56c9515abdc -> 4ebbffe0a39a (mergepoint), Merge heads
c8ffec048a3b -> dd4ecb8fbee3, Add schedule interval to dag
41f5f12752f8 -> cf5dc11e79ad, drop_user_and_chart
c8ffec048a3b -> a56c9515abdc, Remove dag_stat table
41f5f12752f8 -> c8ffec048a3b (branchpoint), add fields to dag
03bc53e68815 -> 41f5f12752f8 (branchpoint), add superuser field
0a2a5b66e19d, bf00311e1990 -> 03bc53e68815 (mergepoint), merge_heads_2
9635ae0956e7 -> 0a2a5b66e19d, add task_reschedule table
dd25f486b8ea -> bf00311e1990, add index to taskinstance
9635ae0956e7 -> dd25f486b8ea, add idx_log_dag
856955da8476 -> 9635ae0956e7 (branchpoint), index-faskfail
f23433877c24 -> 856955da8476, fix sqlite foreign key
05f30312d566 -> f23433877c24, fix mysql not null constraint
86770d1215c0, 0e2a74e0fc9f -> 05f30312d566 (mergepoint), merge heads
27c6a30d7c24 -> 86770d1215c0, add kubernetes scheduler uniqueness
33ae817a1ff4 -> 27c6a30d7c24, kubernetes_resource_checkpointing
d2ae31099d61 -> 33ae817a1ff4, kubernetes_resource_checkpointing
d2ae31099d61 -> 0e2a74e0fc9f, Add time zone awareness
947454bf1dff -> d2ae31099d61 (branchpoint), Increase text size for MySQL (not relevant for other DBs' text types)
bdaa763e6c56 -> 947454bf1dff, add ti job_id index
cc1e65623dc7 -> bdaa763e6c56, Make xcom value column a large binary
127d2bf2dfa7 -> cc1e65623dc7, add max tries column to task instance
5e7d17757c7a -> 127d2bf2dfa7, Add dag_id/state index on dag_run table
8504051e801b -> 5e7d17757c7a, add pid field to TaskInstance
4addfa1236f1 -> 8504051e801b, xcom dag task indices
f2ca10b85618 -> 4addfa1236f1, Add fractional seconds to mysql tables
64de9cddf6c9 -> f2ca10b85618, add dag_stats table
211e584da130 -> 64de9cddf6c9, add task fails journal table
2e82aab8ef20 -> 211e584da130, add TI state index
1968acfc09e3 -> 2e82aab8ef20, rename user table
bba5a7cfc896 -> 1968acfc09e3, add is_encrypted column to variable table
bbc73705a13e -> bba5a7cfc896, Add a column to track the encryption state of the 'Extra' field in connection
4446e08588 -> bbc73705a13e, Add notification_sent column to sla_miss
561833c1c74b -> 4446e08588, dagrun start end
40e67319e3a9 -> 561833c1c74b, add password column to user
2e541a1dcfed -> 40e67319e3a9, dagrun_config
1b38cef5b76e -> 2e541a1dcfed, task_duration
502898887f84 -> 1b38cef5b76e, add dagrun
52d714495f0 -> 502898887f84, Adding extra to Log
338e90f54d61 -> 52d714495f0, job_id indices
13eb55f81627 -> 338e90f54d61, More logging into task_instance
1507a7289a2f -> 13eb55f81627, maintain history for compatibility with earlier migrations
e3a246e0dc1 -> 1507a7289a2f, create is_encrypted
<base> -> e3a246e0dc1, current schema

This reverts the work of #6362 and just makes the migrations linear instead of having branches.
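The difference between the two listings can be checked mechanically. The sketch below is an illustration only, not Airflow or alembic code: `BEFORE` and `AFTER` transcribe just the newest revisions from the listings above (treating `4ebbffe0a39a` as the base to keep things short), `UNMERGED` is a hypothetical variant of `BEFORE` with the two merge revisions dropped, and `heads` and `branchpoints` are made-up helper names.

```python
# Revision graphs as {revision: tuple of down_revisions}, transcribed from the
# tail of the "Before" and "After" listings. 4ebbffe0a39a is treated as the
# base here purely to keep the example short.
BEFORE = {
    "0b2d9a271263": ("b0125267960b", "b3b105409875"),
    "b0125267960b": ("004c1210f153", "74effc47d867"),
    "004c1210f153": ("939bb1e647c8",),
    "74effc47d867": ("6e96a59344a4",),
    "b3b105409875": ("d38e04c12aa2",),
    "d38e04c12aa2": ("6e96a59344a4",),
    "6e96a59344a4": ("939bb1e647c8",),
    "939bb1e647c8": ("4ebbffe0a39a",),
    "4ebbffe0a39a": (),
}
AFTER = {
    "004c1210f153": ("74effc47d867",),
    "74effc47d867": ("b3b105409875",),
    "b3b105409875": ("d38e04c12aa2",),
    "d38e04c12aa2": ("6e96a59344a4",),
    "6e96a59344a4": ("939bb1e647c8",),
    "939bb1e647c8": ("4ebbffe0a39a",),
    "4ebbffe0a39a": (),
}
# Hypothetical: the "Before" graph without its two merge revisions.
MERGES = {"0b2d9a271263", "b0125267960b"}
UNMERGED = {rev: downs for rev, downs in BEFORE.items() if rev not in MERGES}


def heads(graph):
    """Revisions that no other revision lists as a down_revision."""
    parents = {down for downs in graph.values() for down in downs}
    return sorted(rev for rev in graph if rev not in parents)


def branchpoints(graph):
    """Revisions that more than one other revision descends from directly."""
    children = {}
    for rev, downs in graph.items():
        for down in downs:
            children.setdefault(down, set()).add(rev)
    return sorted(rev for rev, kids in children.items() if len(kids) > 1)


print(heads(UNMERGED))                     # three heads without the merges
print(heads(BEFORE), branchpoints(BEFORE)) # one head, two branchpoints
print(heads(AFTER), branchpoints(AFTER))   # one head, no branchpoints
```

Under these assumptions, both the merged and the linear graph end in a single head; only the linear one does so without branchpoints or merge revisions in this part of the history.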

Make sure you have checked all steps below.

Jira

  • My PR addresses the following Airflow Jira issues and references them in the PR title. For example, "[AIRFLOW-XXX] My Airflow PR"
    • https://issues.apache.org/jira/browse/AIRFLOW-XXX
    • In case you are fixing a typo in the documentation you can prepend your commit with [AIRFLOW-XXX], code changes always need a Jira issue.
    • In case you are proposing a fundamental code change, you need to create an Airflow Improvement Proposal (AIP).
    • In case you are adding a dependency, check if the license complies with the ASF 3rd Party License Policy.

Description

  • Here are some details about my PR, including screenshots of any UI changes:

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and the classes in the PR contain docstrings that explain what they do
    • If you implement backwards-incompatible changes, please leave a note in Updating.md so we can assign it to an appropriate release

@Fokko
Contributor Author

Fokko commented Oct 27, 2019

PTAL @dstandish

@Fokko Fokko requested a review from kaxil October 27, 2019 08:51
@bolkedebruin
Contributor

Sorry @Fokko nitpicking: can you ensure the reason for the change is in the commit message? And is an empty upgrade really the way to fix this? It looks... Dirty. Was AIP-24 already in a release?

@Fokko Fokko force-pushed the fd-merge-alembic-heads branch from cd439a1 to 7e5018b Compare October 27, 2019 09:28
@Fokko
Contributor Author

Fokko commented Oct 27, 2019

@bolkedebruin Thanks. As we say in Dutch: feedback is een cadeautje (feedback is a gift). I've updated @kaxil's migration to add the other head as well, which also solved the issue. I've also updated the commit and the description.

@Fokko
Contributor Author

Fokko commented Oct 27, 2019

Just to check, @kaxil @ashb: is AIP-24 part of 1.10.6?

@codecov-io

codecov-io commented Oct 27, 2019

Codecov Report

Merging #6442 into master will decrease coverage by 0.17%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #6442      +/-   ##
==========================================
- Coverage   83.85%   83.67%   -0.18%     
==========================================
  Files         627      627              
  Lines       36537    36537              
==========================================
- Hits        30639    30574      -65     
- Misses       5898     5963      +65
| Impacted Files | Coverage Δ |
|---|---|
| airflow/kubernetes/volume_mount.py | 44.44% <0%> (-55.56%) ⬇️ |
| airflow/kubernetes/volume.py | 52.94% <0%> (-47.06%) ⬇️ |
| airflow/kubernetes/pod_launcher.py | 45.25% <0%> (-46.72%) ⬇️ |
| airflow/kubernetes/kube_client.py | 33.33% <0%> (-41.67%) ⬇️ |
| ...rflow/contrib/operators/kubernetes_pod_operator.py | 70.14% <0%> (-28.36%) ⬇️ |
| airflow/jobs/scheduler_job.py | 74.92% <0%> (+1.2%) ⬆️ |
| airflow/utils/dag_processing.py | 58.15% <0%> (+1.97%) ⬆️ |
| airflow/executors/__init__.py | 67.34% <0%> (+4.08%) ⬆️ |
| airflow/jobs/local_task_job.py | 90% <0%> (+5%) ⬆️ |
| airflow/utils/sqlalchemy.py | 93.22% <0%> (+6.77%) ⬆️ |
| ... and 2 more | |

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f2caa45...a3fda6b. Read the comment docs.

Contributor

Haha, I didn't even know that was possible :-)

Contributor

But can't we adjust the head of AIP-24, by the way? Then the double reference would not be required.

Contributor Author

If we change the head of AIP-24, then we detach another head.

Contributor Author

The alembic migrations are a DAG as well :-)

@bolkedebruin
Contributor

I would leave the result out of the commit message (the 'before'/'after' listings); I don't consider that required.

@kaxil
Member

kaxil commented Oct 27, 2019

AIP-24 is not part of 1.10.6 so updating head should be fine :)

@feluelle feluelle added the area:MetaDB Meta Database related issues. label Oct 27, 2019
Remove branchpoints to make it lineair again.
@Fokko Fokko force-pushed the fd-merge-alembic-heads branch from 7e5018b to a3fda6b Compare October 27, 2019 14:20
@Fokko
Contributor Author

Fokko commented Oct 27, 2019

Updated the earlier unreleased migrations to get everything linear again, so we don't have to merge from any branchpoints again.

@Fokko Fokko changed the title [AIRFLOW-5771] Merge alembic heads [AIRFLOW-5771] Straighten out alembic heads Oct 27, 2019
@Fokko Fokko changed the title [AIRFLOW-5771] Straighten out alembic heads [AIRFLOW-5771] Straighten out alembic migrations Oct 27, 2019
@ashb
Member

ashb commented Oct 27, 2019

I'm not sure when we would actually want multiple heads. If Airflow never wants this, is it worth adding a unit test (like the one we have to ensure models and migrations are in sync) to ensure there is just one head?

@Fokko
Contributor Author

Fokko commented Oct 27, 2019

We never want multiple heads, but it sometimes happens when someone works on a long-running feature and two migrations end up branching from the same revision. For simplicity, I'd say a single head is always preferable.

@dstandish suggested adding the following unit test: https://blog.jerrycodes.com/multiple-heads-in-alembic-migrations/. This one will detect when we have multiple heads, and fail.
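The test from the linked post is not reproduced here. Alembic itself exposes `ScriptDirectory.get_heads()`, which is probably the more robust basis for such a check. As a dependency-free illustration of the same idea, the sketch below reads the plain `revision` and `down_revision` assignments from each migration file and reports every revision that nothing else descends from. The helper names (`read_revision`, `find_heads`, `demo`) and the three-letter revision IDs are hypothetical.

```python
import ast
import tempfile
from pathlib import Path


def read_revision(path):
    """Return (revision, down_revisions) for one migration file, or None.

    Only plain top-level assignments such as `revision = "abc"` are read.
    """
    found = {}
    for node in ast.parse(path.read_text()).body:
        if isinstance(node, ast.Assign) and len(node.targets) == 1:
            target = node.targets[0]
            if isinstance(target, ast.Name) and target.id in ("revision", "down_revision"):
                found[target.id] = ast.literal_eval(node.value)
    if "revision" not in found:
        return None  # not a migration file, e.g. __init__.py
    down = found.get("down_revision")
    if down is None:
        down = ()
    elif isinstance(down, str):
        down = (down,)
    return found["revision"], tuple(down)


def find_heads(versions_dir):
    """List revisions in versions_dir that no other revision points back to."""
    parsed = (read_revision(p) for p in sorted(Path(versions_dir).glob("*.py")))
    graph = dict(item for item in parsed if item is not None)
    parents = {down for downs in graph.values() for down in downs}
    return sorted(set(graph) - parents)


def _write_migration(directory, revision, down_revision):
    # Writes a minimal stand-in for a migration file (demo data only).
    Path(directory, f"{revision}_demo.py").write_text(
        f"revision = {revision!r}\ndown_revision = {down_revision!r}\n"
    )


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        _write_migration(tmp, "aaa", None)
        _write_migration(tmp, "bbb", "aaa")
        _write_migration(tmp, "ccc", "aaa")  # second child of "aaa": split heads
        split = find_heads(tmp)
        _write_migration(tmp, "ddd", ("bbb", "ccc"))  # merge revision
        merged = find_heads(tmp)
    return split, merged


print(demo())
```

In a test suite, the check would then reduce to asserting that `find_heads` returns exactly one revision for the project's migration directory.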

@kaxil
Member

kaxil commented Oct 27, 2019

Yes, we should have that unit test; this is going to be common now that we are all working on big 2.0 features.

@Fokko
Contributor Author

Fokko commented Oct 27, 2019

Dutch idiom: I don't want to mow the grass in front of @dstandish's feet. As in, I would like to give him the opportunity to add the test himself :-)

@kaxil
Member

kaxil commented Oct 27, 2019

Haha, I like the idiom

@dstandish
Contributor

Haha ok I'll add 😊

@dstandish
Contributor

alembic multiple heads test here: #6449
confirmed fails on master and passes on #6442 (straighten out migrations PR)

Member

@feluelle feluelle left a comment

LGTM. I am also for having tests there to check for multiple heads.

P.S. Great blog post btw.

Member

@potiuk potiuk left a comment

LGTM. I am happy to cherry-pick it to 1.10.* for 1.10.7 possibly :)

@Fokko Fokko merged commit 171e1bb into apache:master Oct 27, 2019
@Fokko Fokko deleted the fd-merge-alembic-heads branch October 27, 2019 22:17
@Fokko
Contributor Author

Fokko commented Oct 27, 2019

Thanks @potiuk and @feluelle for the review. I think we should CP this one and #6449 onto the 1.10.x branch. If you feel like doing it @potiuk, feel free, but it might be tricky since the hashes are different over there. It might be better to just resolve it over there based on the output of alembic history. I can do it as well if you like, but you're free to do it as well :-)

@potiuk
Member

potiuk commented Oct 28, 2019

Yeah. I will cherry-pick (or redo ;) ) both PRs. I am now working on cherry-picking kind-related changes after 1.10.6 is released so I will add those two.

@mattinbits
Contributor

Doesn't rewriting the lineage in this way create a problem for anyone who has already deployed a released version that contains the multiple heads, such as 1.10.5? Multiple heads lead to multiple rows in the alembic_version table, and I'm not sure it's valid to just remove branches in this way. Isn't the correct way to create a new migration which merges the branches?

@mattinbits
Contributor

Further - just tried doing an upgrade to 1.10.6 followed by an upgrade to master:
git checkout 1.10.6

alembic upgrade heads
git checkout master
alembic upgrade heads

Result is

ERROR [alembic.util.messaging] Requested revision 74effc47d867 overlaps with other requested revisions 004c1210f153
  FAILED: Requested revision 74effc47d867 overlaps with other requested revisions 004c1210f153

I think this demonstrates the issue.
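One plausible reading of this error, offered as an interpretation rather than a statement of alembic's internals: a database upgraded on a release with several heads records each of them in alembic_version, and in the rewritten linear history one recorded revision (74effc47d867) has become an ancestor of another (004c1210f153). The sketch below illustrates that condition. `LINEAR` transcribes the newest revisions of the "After" listing in the PR description, `BRANCHED` is a hypothetical branched history derived from the "Before" listing without its merge revisions, and the helper names are made up.

```python
# {revision: tuple of down_revisions}; 4ebbffe0a39a is treated as the base.
LINEAR = {
    "004c1210f153": ("74effc47d867",),
    "74effc47d867": ("b3b105409875",),
    "b3b105409875": ("d38e04c12aa2",),
    "d38e04c12aa2": ("6e96a59344a4",),
    "6e96a59344a4": ("939bb1e647c8",),
    "939bb1e647c8": ("4ebbffe0a39a",),
    "4ebbffe0a39a": (),
}
# Hypothetical branched history in which the same revisions are separate heads.
BRANCHED = {
    "004c1210f153": ("939bb1e647c8",),
    "74effc47d867": ("6e96a59344a4",),
    "b3b105409875": ("d38e04c12aa2",),
    "d38e04c12aa2": ("6e96a59344a4",),
    "6e96a59344a4": ("939bb1e647c8",),
    "939bb1e647c8": ("4ebbffe0a39a",),
    "4ebbffe0a39a": (),
}


def ancestors(graph, revision):
    """All revisions reachable from `revision` by following down_revisions."""
    seen = set()
    stack = list(graph.get(revision, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph.get(current, ()))
    return seen


def overlapping(graph, stamped):
    """Pairs (older, newer) where one stamped revision is an ancestor of another."""
    return sorted(
        (older, newer)
        for older in stamped
        for newer in stamped
        if older != newer and older in ancestors(graph, newer)
    )


# The two revisions named in the error message above.
stamped = ["004c1210f153", "74effc47d867"]
print(overlapping(BRANCHED, stamped))  # independent heads: no overlap
print(overlapping(LINEAR, stamped))    # one is now an ancestor of the other
```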

@kaxil
Member

kaxil commented Nov 22, 2019

@mattinbits Yes, you are right. This PR needs to sync up with what we have in v1-10-test branch too.

Can you please test if you can upgrade from 1.10.6 to 1.10.7 successfully?

@mattinbits
Contributor

I don't find 1.10.7, does it have a tag?

@kaxil
Member

kaxil commented Nov 22, 2019

I don't find 1.10.7, does it have a tag?

Whoops, I meant the https://github.com/apache/airflow/tree/v1-10-test branch

@mattinbits
Contributor

I get

INFO  [alembic.runtime.migration] Running upgrade a56c9515abdc, 004c1210f153, 74effc47d867, b3b105409875 -> 08364691d074, Merge the four heads back together

So it looks good
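For readers unfamiliar with merge revisions, the log line above corresponds to a migration file of roughly the following shape. This is a sketch, not the file from the v1-10-test branch: the revision IDs and message are copied from the log line, while the remaining fields follow the usual layout of an alembic revision file and are assumptions.

```python
"""Merge the four heads back together

Sketch only: the revision IDs come from the log line above; everything else
follows the usual layout of an alembic revision file and is an assumption.
"""

# Revision identifiers, used by alembic.
revision = "08364691d074"
# A tuple of parents is what makes this a merge revision.
down_revision = ("a56c9515abdc", "004c1210f153", "74effc47d867", "b3b105409875")
branch_labels = None
depends_on = None


def upgrade():
    # A merge revision changes no schema; it only joins the parent branches.
    pass


def downgrade():
    # Nothing to undo for the same reason.
    pass
```

Because such a file adds a new revision on top of the existing heads instead of rewriting their parents, a database that already records those heads can upgrade through it.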

@kaxil
Member

kaxil commented Nov 22, 2019

I get

INFO  [alembic.runtime.migration] Running upgrade a56c9515abdc, 004c1210f153, 74effc47d867, b3b105409875 -> 08364691d074, Merge the four heads back together

So it looks good

Thanks @mattinbits, I will check and fix this on master.
