Skip to content

Conversation

@dstandish
Copy link
Contributor

@dstandish dstandish commented Jun 10, 2023

Ok I resurrected this PR from the dustbin @eladkal & @potiuk.

What this does is add two standardized "todo" directives that will allow us to essentially set a reminder to make certain code changes at provider release time.

It adds a new section to the github issue template:

Found providers with todos which should be addressed now. There are two classes of such todos. One is when the provider version is nowgreater than that specified in comment. The other is when the
current min airflow version for the provider is greaterthan specified in the comment.

## apache-airflow-providers-elasticsearch
### remove-on-provider-version
  - [ ] airflow/providers/elasticsearch/log/es_task_handler.py:169 (@dstandish)

This is achieved with two different todo directives:

One of them is about min airflow version:

# todo: remove-on-min-airflow-version=2.6 author=dstandish

When you add this, you'll get tagged when the new release bumps the min airflow version to 2.6 in that provider.

The other one is about provider version:

# todo: remove-on-provider-version=7.0 author=dstandish

With this one, I'll get notified at release time when provider version is bumped to 7.0.

This feels like a relatively simple enhancement that could help us remember to remove things. And it is responsive to the concern that provider manager should not be blocked if the person who added the tag doesn't respond.

The "author" element is optional.

Let me know what you think

@potiuk
Copy link
Member

potiuk commented Jun 11, 2023

Looks cool.

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Jul 27, 2023
@github-actions github-actions bot removed the stale Stale PRs per the .github/workflows/stale.yml policy file label Jul 30, 2023
@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Sep 14, 2023
@github-actions github-actions bot closed this Sep 19, 2023
@dstandish dstandish reopened this Nov 21, 2023
@dstandish dstandish force-pushed the automated-deprecatino-removal-process branch 2 times, most recently from 8a65a7c to b3cff05 Compare November 21, 2023 17:30
@dstandish dstandish removed the stale Stale PRs per the .github/workflows/stale.yml policy file label Nov 21, 2023
@potiuk
Copy link
Member

potiuk commented Nov 21, 2023

What this does is add two standardized "todo" directives that will allow us to essentially set a reminder to make certain code changes at provider release time.

Yes. I love the idea - it's much better than the automated tests you toyed with before.

I think one thing to add is that it should be somewhat flaggable (not sure how yet) to and converted to a TODO list for people to pick while preparing the release. I think having it as part of release command is a bit late and puts a lot of effort on the shoulders of release manager.

Maybe simply a task that produces an issue in Github with all those things to fix that release manager could prepare and share with others at the moment the preparation start - similar to our "Status" issues when we release? It's ratehr easy command to prepare such issue similarly to the release ones based on those todos: (and with your author's comments it could even be automatically added as @dstandish mention (similarly as we do in the "status" issues).

@dstandish
Copy link
Contributor Author

I think one thing to add is that it should be somewhat flaggable (not sure how yet) to and converted to a TODO list for people to pick while preparing the release. I think having it as part of release command is a bit late and puts a lot of effort on the shoulders of release manager.

Maybe simply a task that produces an issue in Github with all those things to fix that release manager could prepare and share with others at the moment the preparation start - similar to our "Status" issues when we release? It's ratehr easy command to prepare such issue similarly to the release ones based on those todos: (and with your author's comments it could even be automatically added as @dstandish mention (similarly as we do in the "status" issues).

When you say "task" do you mean add a new breeze command that generates a "todo" issue?

I realize that the release status page is a bit late for this... but I was thinking it would be a reasonable first step as we try this out since it woulld not add any more steps for release manager.... except when someone responds and does these fixes you'd have another RC. Also, I'm not sure that the versions get updated in the code until this time, so we don't know for sure what the version will be, which is an important part of this.

@potiuk
Copy link
Member

potiuk commented Nov 29, 2023

When you say "task" do you mean add a new breeze command that generates a "todo" issue?

Yes. task to run with breeze. should be easy to add (I can do it) - but we do not have to do it "now" it's more of the process and "who" is doing it I am concerned about. With things like that, you have to consider how much overhead it introduces to someone who is going to remove it , and who that "someone" will be. And even if we do not implement all the tooling that is needed "now" - we should know what is the goal and what is missing to get there.

Adding TODO:s in this way is the first part of the process "how we mark it" but we should also at least decide on what happens with it - i.e., how we remove it. Otherwise it's like a data that you keep by a company but have no idea how to use it. Typical Big-data projects companies have.

I think we need to really start with "how we envision the process of removing the TODOs". The problem with the current proposal is the "moment in time" when those deprecations are collected and explained. What you've proposed in this PR - this collection happens AFTER the documentation is updated, packages (RC1) are already preparead and signed and commited to SVN and release manager is about to send "VOTE" email.

So the "breeze task" that you put it in is just bad idea - because at that moment release manager has already done their job and if all is well, the RCs will be accepted as the new release.

Maybe you have not realized that and that's why you put it there :).

My point is that - we need to design "when" in the whole process this information should be collected and how to surface it to people who can do "something" about it - before the new provider's release is actually done.

IMHO. the process should look like:

  1. RM prepares a new wave and prepares documentation PR (changelog/commits)- also decides that some of the providers are going to have breaking change

  2. The PR is there, and is going to be iterated for a few days (usualy happens) when others are rushing with some "last minute" merges. I think HERE is the right moment to automatically generate an issue with information "we are releasing new breaking providers : A, B, C and for each provider we have a list of things to be fixed (i.e. deprecations removed).

During iteration we are likely regenerate the info to track progress of what's still left to do from deprecations.

  1. RM merges the doc PR (after merging last fixes and regenerating the changelog/commits documentation including those last minutes changes.

  2. Send information to VOTE on the providers <-- CURRENT implementation only generates the "pending deprecations" here.

We cannot really put the burden of fixes on the shoulders of RM - we need to give RM some tooling to be able to identify and engage (ideally semi-automatically rather than individuallly finding and pinging the people) those who should fix the deprecations. RM is really supposed to the "mechanics" of release, generally speaking except committing the changelog, and maybe cherry-picking commits (in case of Airflow) RM should not "implement" anything - RM can at most coordinate others implementing PRs (as a general rule).

So IMHO we need to do something to a) generate b) track what's left between 2) and 3) and not as part of 4) as is currently. and seems that having a task to do that in breeze and adding it as part of release process

@dstandish dstandish force-pushed the automated-deprecatino-removal-process branch from 32feafb to ee1a483 Compare January 13, 2024 05:31
@dstandish
Copy link
Contributor Author

@potiuk i think the current PR generates adds the todos to the docs PR -- not the Vote message -- can you double check? from my reading of your message, it's seems that is where you want it too?

@potiuk
Copy link
Member

potiuk commented Jan 14, 2024

@potiuk i think the current PR generates adds the todos to the docs PR -- not the Vote message -- can you double check? from my reading of your message, it's seems that is where you want it too?

Nope. It adds the TODOs at the moment where "issue generation" happens. Issue generation happens when the Release candidates are already in SVN and they are about to be announced.

This is https://github.com/apache/airflow/blob/main/dev/README_RELEASE_PROVIDER_PACKAGES.md#prepare-issue-in-github-to-keep-status-of-testing

The process looks like follows:

  1. RM generates PR with CHANGELOG and version bump for all providers that had merged changes since their last release
  2. This PR can be rebased and updated as the RM makes a "last call" - hey are there any PRs that should be merged
  3. Once this PR is rebased one last time it gets merged (after checking that no new changes were merged)
  4. At this poing RM creates an RC tag and builds packages (from the merged changelog commit)
  5. The next step is that RM creates an issue that summarizes all the issues/prs/changes and ask people to test it
  6. Then RM sends VOTE email with link to that issue
  7. After few days when things get approved, all providers that were not -1-ed are relased from that commit from point 3)
  8. All the -1-ed providers are moved to next wave (which might be accelerated - only the -1ed providers with bugfixes, or regular one - where all providers that got changes after 3) are included -> this is something I added recently

So what you added is happening at point 5) -> TODOS generated here are far to late to do anything about it, because the issue is about TESTING already released RCs

What I think your change could do is to have a NEW issue (fix those deprecations) happening either after 1) or even before 1) is even attempted.

@potiuk
Copy link
Member

potiuk commented Jan 14, 2024

And just to clarify 1) is done using breeze release-management prepare-providers-documentation command :)

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Feb 29, 2024
@github-actions github-actions bot closed this Mar 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:dev-tools area:providers stale Stale PRs per the .github/workflows/stale.yml policy file

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants