New breeze command to publish docs #32495
    )
    from airflow_breeze.utils.shared_options import get_dry_run, get_verbose, set_forced_answer
    from airflow_breeze.utils.visuals import ASCIIART, ASCIIART_STYLE, CHEATSHEET, CHEATSHEET_STYLE
    from docs.exts.docs_build.docs_builder import AirflowDocsBuilder
Comment: The thing here is that whatever we have in "docs.exts" needs to be copied to "breeze". Breeze should be largely self-contained (except occasional retrieval of information from the "airflow" sources). Importing the code should always be done as
from airflow_breeze.....
and the imported code should live in the dev/breeze/src/airflow_breeze folder.
(that's part of the problem to solve - to get rid of the "docs" folder that needs to be added on PYTHONPATH)
Makes sense to me. Will rework this.
    )

    # set AIRFLOW_SITE_DIR env variable
    os.environ["AIRFLOW_SITE_DIR"] = airflow_site_directory
Another comment: you do not need to do that. The AIRFLOW_SITE_DIRECTORY variable should be defined in the click option. A click option has an envvar field where you can set the env variable. So we should revert the behaviour: instead of relying on AIRFLOW_SITE_DIRECTORY in the env, the directory can be set either by AIRFLOW_SITE_DIRECTORY or by the --airflow-site-directory flag, and then it should be passed directly to the (modified) AirflowDocsBuilder as a parameter.
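A minimal sketch of what this suggestion could look like - the stub `AirflowDocsBuilder` and the command body here are hypothetical stand-ins, only the option shape follows the comment:

```python
import click


class AirflowDocsBuilder:
    # Hypothetical stand-in for the (modified) builder that takes the
    # directory as a constructor parameter instead of reading os.environ.
    def __init__(self, airflow_site_directory: str):
        self.airflow_site_directory = airflow_site_directory


@click.command()
@click.option(
    "-s",
    "--airflow-site-directory",
    envvar="AIRFLOW_SITE_DIRECTORY",  # click reads the env variable itself
    required=True,
    help="Local directory path of cloned airflow-site repo.",
)
def publish_docs(airflow_site_directory: str) -> None:
    # The value flows straight into the builder - no os.environ mutation.
    builder = AirflowDocsBuilder(airflow_site_directory=airflow_site_directory)
    click.echo(builder.airflow_site_directory)
```

With `envvar` set, exporting `AIRFLOW_SITE_DIRECTORY` and passing `--airflow-site-directory` behave identically, and the builder no longer depends on process-global state.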
BTW, it's OK to partially copy the code for now and have a different "builder" in publish-docs and in "build_docs.py" - we can re-join them later when we also get rid of build_docs.py from the "docs" folder and move it to breeze (which will most likely happen eventually).
    os.environ["AIRFLOW_SITE_DIR"] = airflow_site_directory

    available_packages = get_available_packages()
    package_filters = package_filter
Seems like a redundant reassignment. Maybe we can have package_filters in the function signature itself?
Yeah, will make the changes here
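For illustration, a minimal sketch of the suggested signature change - `get_available_packages` is stubbed here, the real helper lives in airflow_breeze:

```python
def get_available_packages() -> list[str]:
    # Stub for illustration; the real helper comes from airflow_breeze.
    return [
        "apache-airflow",
        "apache-airflow-providers-amazon",
        "apache-airflow-providers-google",
    ]


def select_packages(package_filters: tuple[str, ...] = ()) -> list[str]:
    # Taking package_filters directly in the signature avoids the
    # `package_filters = package_filter` reassignment inside the body.
    available_packages = get_available_packages()
    if not package_filters:
        return available_packages
    return [p for p in available_packages if p in package_filters]
```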
The easiest way to achieve the task was to migrate the file structure as-is with the required tweaks. @potiuk can you take a look? I think the CI is definitely going to fail - I need some mentoring on fixing that too :)
For starters, I have been hitting this: Added it to requirements, but it probably needs to be added elsewhere too
My speculation on this task is that we send in a skeleton first and then continually improve/refactor it.
You need to add requirements to … The …
Quite the contrary - I think it will be OK to iterate until it's ready and we can use it. There is no big value in having a new command that cannot produce good, usable output. By doing that you can also learn about all the different pieces, so it makes sense to iterate and do it "right" rather than "fast". Also, Breeze is an integral part of the CI process, and there are many pre-commits that guard the right things being done there - for example, they will make sure that any command added to breeze is described in BREEZE.rst, including an automatically generated screenshot - so we cannot merge a "failing" PR that skips those steps (and this is a very deliberate choice). Any PR merged should contain:
This is the "complete" implementation of a new command, which I consider necessary for the PR to be mergeable.
Yep, looks right. I might want to take a closer look once the build is "green" :)
Thanks for the point. Now I agree with you - in this case, having a complete PR is more important than having parts and pieces going in. Working on it!
I did add the … I think I am still running into this issue:
Nope, it seems fixed. Now unit tests are failing.
    @@ -0,0 +1,117 @@
    # Licensed to the Apache Software Foundation (ASF) under one
Should be renamed to "docs_errors.py", I think.
Yeah makes sense. Renaming
    @click.option(
        "-s", "--airflow-site-directory", help="Local directory path of cloned airflow-site repo.", required=True
    )
    @click.option(
I think this can be moved to "common_options" - the same option is now used in build_docs, so this can be made DRY.
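A hedged sketch of what hoisting the option into a shared module could look like - the `option_airflow_site_directory` name and the command bodies are illustrative, not the actual breeze code:

```python
import click

# A click option defined once and reused as a decorator by both commands.
option_airflow_site_directory = click.option(
    "-s",
    "--airflow-site-directory",
    required=True,
    help="Local directory path of cloned airflow-site repo.",
)


@click.command()
@option_airflow_site_directory
def build_docs(airflow_site_directory: str) -> None:
    click.echo(f"building docs, site dir: {airflow_site_directory}")


@click.command()
@option_airflow_site_directory
def publish_docs(airflow_site_directory: str) -> None:
    click.echo(f"publishing docs, site dir: {airflow_site_directory}")
```

Each application of the shared decorator creates a fresh option on that command, so both commands stay independent while the option definition lives in one place.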
As part of this change, to make it really complete, we should also add a step publishing the docs to this CI job: airflow/.github/workflows/ci.yml, Line 632 in 3f6ac2f
This way the new command will be automatically tested on CI and we will at least have a "smoke test" against regressions. This is the approach we have for pretty much all other commands in breeze - they are executed as part of the CI, and this way we make sure they continue working. This CI job already builds all the docs and we have them locally, so it should be done in a few separate steps.
Awesome idea, let us integrate the CI to do this as well. I will try this portion out too!
You will get a full-circle dev env + CI :). They are really tightly integrated in our case, and one helps to keep the other "working".
@potiuk tried to handle all your comments. The CI passed for the first time in my dev env - hopefully it will here too!
I deleted the …
Just replace it with the new command - the script is referred to in dev/RELEASE_PROVIDERS* as a way to pass a list of providers and build.
This error is because the …
@potiuk just handled those comments 👍🏽
potiuk
left a comment
Looks PERFECT!
🎉 🎉 🎉 🎉 🎉 🎉

Thanks for closely following up on this one with me @potiuk. I will definitely raise some refactors on this one soon.
    cd "${AIRFLOW_REPO_ROOT}"

    - ./docs/publish_docs.py \
    + breeze release-management publish-docs \
Do we need the whole section above? Now that the command is in breeze, we probably don't need it:

**NOTE** In order to run the publish documentation you need to activate virtualenv where you installed
apache-airflow with doc extra:

* `pip install '.[doc_gen]'`

If you don't have virtual env set you can do:

```shell script
cd <path_you_want_to_save_your_virtual_env>
virtualenv providers
source venv/providers/bin/activate
pip install 'apache-airflow[doc_gen]'
```
Yeah I don't think we'd need that anymore. I'll raise a PR soon to get it fixed. Thanks for noticing
Cool. Please note that lines 432-434 also need to be edited - it's an alternative approach to run the publishing command.
Oh yeah, I missed that we had a bit more detail in the PROVIDERS docs :)
I do not think we need to edit lines 432-434. Those changes have been handled in this PR already
Yep. The bash script in question already uses the new command
But why do we need the bash script to begin with? It's executing the same command with a different kind of parameters - breeze should be able to handle that.
It's a helper script converting a list of ids which you already have (amazon google ...) to a list of flags (--package-filter apache-airflow-providers-amazon --package-filter apache-airflow-providers-google ...).
If you prefer to do it manually, fine, but I think having the script is useful. We could of course add it to the command, but --package-filter is traditionally there and serves other purposes too (not everything is a provider, for one).
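The helper's job can be sketched in a few lines of Python - the mapping from short ids to full package names is an assumption based on the example above, not the actual script:

```python
def ids_to_package_filters(provider_ids: list[str]) -> list[str]:
    # Expand short provider ids (amazon, google, ...) into the repeated
    # --package-filter flags that the breeze command expects.
    flags: list[str] = []
    for provider_id in provider_ids:
        package = "apache-airflow-providers-" + provider_id.replace(".", "-")
        flags.extend(["--package-filter", package])
    return flags
```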
#32495 changed the way docs are built - using the `--for-production` flag, which means that the docs are generated under the "stable" link rather than "latest" - in order to test document publishing. We could change it back to "latest", but the problem revealed that we do not actually have to make a distinction there. It's OK to keep it all as "stable": there is no distinction between latest and stable other than the different link, and we never mix the two (our production docs have only stable). As a follow-up, the `--for-production` flag will be removed and we will switch everything to stable for all documentation building. For now, fetching docs from the stable link should fix the problem.
We had two types of documentation building:

* latest
* for production

But in fact they never overlapped and we were never mixing the two. We used "latest" for all development work and "for production" for releasing to "airflow.apache.org". However, as we saw in #32562 (triggered by #32495 changing the build in main to run with the `--for-production` flag), we actually do not need the "latest" builds at all. Everything can be built "for production" by default. This change removes the `--for-production` flag entirely, leaving the "for production" build mode as the only one available.
#32495 mistakenly replaced the "build" script with a "release-management build-docs" command, where there is no such command - there is just "build-docs".
Building a basic framework for publishing docs as a breeze command. For discussions see: #32491.
This PR attempts to build a simple framework that does the package copying to the airflow-site repo which is cloned locally:
What is pending: