[BD-21] Document feature toggles #24815

regisb · 2020-08-19T20:39:07Z

In this PR, I progressively document feature toggles one application at a time, starting at the top of a list of high-priority applications.

To detect feature toggles, I grep the application code as follows:

git grep --show-function -E "FEATURES|Waffle|Config|Switch" -- ./path/to/the/application

Although I generally have not too much trouble figuring out how feature flags are used, there are some annotation fields for which I do not quite understand how they should be used:

.. toggle_use_cases: should be one or more of incremental_release, launch_date, monitored_rollout, graceful_degradation, beta_testing, vip, opt_out, opt_in, open_edx. I have no idea what these values mean, so I just blindly set this annotation to "open_edx".
.. toggle_category: what should we use for this field? Should this be the name of the django app in which the feature toggle is defined (this is what I have done until now).

~~cc @robrap: no need for a thorough review right now. Maybe just check that I'm headed in the right direction?~~

I'll list here remaining questions I have about the feature toggles:

videos.enable_devstack_video_uploads: I did not quite understand the point of this feature toggle. My interpretation is that short-lived authentication tokens are better suited for the devstack, but I have trouble understanding why.
lms/djangoapps/grades/config/waffle.py: what are the use cases for these flags? (see this comment)

openedx-webhooks · 2020-08-19T20:39:13Z

Thanks for the pull request, @regisb! I've created BLENDED-554 to keep track of it in Jira. More details are on the BD-21 project page.

When this pull request is ready, tag your edX technical lead.

robrap · 2020-08-19T22:07:25Z

@regisb: Some quick comments:

Seems you like the """ style better than #. Can you comment here on your preference so we can determine a best practice.
We still need to determine what the proper annotations are. I don't think we have it right quite yet.
For use cases, see: https://open-edx-proposals.readthedocs.io/en/latest/oep-0017-bp-feature-toggles.html#use-cases This is meant to be multi-value (as it currently stands). I'm not convinced it is required, and couldn't just be part of the description. What I think we do want to know is something like "permanent" vs "temporary", which might be much simpler. But, @nasthagiri made the original recommendation, so she may want to weigh in on this.
I would propose we drop category altogether. I think the path is good enough. As a future idea, we could potentially provide an additional index based on paths, or pull out the part of the path in the doc to make it stand out as a pseudo-category.

Other thoughts:

toggle_warnings should go away, and just be a part of description.
toggle_creation_date, toggle_expiration_date, toggle_tickets, and toggle_status (and some options of toggle_use_cases) are mostly to help us determine what toggles can be retired. Maybe we could propose something simpler?

regisb · 2020-08-21T07:44:17Z

Seems you like the """ style better than #. Can you comment here on your preference so we can determine a best practice.

""" annotations allow for multi-line comments, but single-line annotations (prefixed by #) don't. I prefer to use multi-line comments as an incentive to write longer descriptions.

EDIT: turns out that multi-line comments are not supported when added right in the middle of dict entries, so we'll have to revert to long, single-line comments.

For use cases, see: https://open-edx-proposals.readthedocs.io/en/latest/oep-0017-bp-feature-toggles.html#use-cases This is meant to be multi-value (as it currently stands). I'm not convinced it is required, and couldn't just be part of the description. What I think we do want to know is something like "permanent" vs "temporary", which might be much simpler. But, @nasthagiri made the original recommendation, so she may want to weigh in on this.

Right. When commenting existing feature toggles, I have no way of discovering for what use case they were designed, so I'll just set this field to "open edx".

I would propose we drop category altogether. I think the path is good enough. As a future idea, we could potentially provide an additional index based on paths, or pull out the part of the path in the doc to make it stand out as a pseudo-category.

Sounds reasonable.

toggle_warnings should go away, and just be a part of description.

I am working on a code-annotations pull request that uses the toggle_warnings field to add a "warning" field to the generated docs. It's a fairly simple change and the visual impact is quite good (IMHO):

toggle_creation_date, toggle_expiration_date, toggle_tickets, and toggle_status (and some options of toggle_use_cases) are mostly to help us determine what toggles can be retired. Maybe we could propose something simpler?

Sure. @nasthagiri are you open to modifications of the current feature toggle annotation schema? If yes, could we schedule a conversation with the product owners?

robrap · 2020-08-21T13:58:07Z

@regisb:

I know I did the review, but I am surprised to learn that # (or any style comments) don't support multi-line annotations. Can we fix that?
We should somehow keep track of any open questions you have per toggle. Maybe even include it as text in the description. Like use_cases, etc.
Let's drop category.
Let's leave Warning if you find it useful.

regisb · 2020-08-21T14:34:05Z

@robrap

I know I did the review, but I am surprised to learn that # (or any style comments) don't support multi-line annotations. Can we fix that?

It's difficult. In code-annotations, single line annotations are parsed line-by-line: https://github.com/edx/code-annotations/blob/97cd3397c29a492929fb8902eec40400206253b7/code_annotations/extensions/base.py#L60
I can look into this, but right now I don't see an elegant way to solve this. Sorry for not being clear about this earlier in the pull request :-/

robrap · 2020-08-21T15:20:44Z

@regisb:

Absolute minimum, we should update the docs to clarify this limitation.
Top-of-my head thought, is there any way to do the parsing in two steps where we strip the comments before doing a more in-depth processing? I guess I'd have to look at the code to understand where it breaks down when the comments are still there, and how that could be fixed. I can't get to that until later.
Just note that the most important part of your documenting might be the data gathering part, because I'd much prefer these got left as # comments.

nasthagiri · 2020-08-23T23:46:54Z

@regisb and @robrap Yes, I am open to modifications to the current scheme definition. Please write up your proposal for the change, to give me an idea of what you are thinking. The proposal could be a PR to amend the OEP.

If you'd like set up a 15-30mn meeting to hash it out, you can do so at https://calendly.com/nasthagiri

Thanks!

regisb · 2020-08-25T17:11:25Z

Please write up your proposal for the change, to give me an idea of what you are thinking.

I'll repost what I just wrote on Slack: https://app.slack.com/client/T02SNA1T6/threads/thread/C015XNCQDDG-1597777765.002700
What I have in mind is that the toggle_use_cases should not be used anymore, as the use case depends on which platform we are considering: some platforms may enable a feature in monitored rollout while others may enable it at a specific launch date https://open-edx-proposals.readthedocs.io/en/latest/oep-0017-bp-feature-toggles.html#use-cases Thus, it does not make sense to set in stone the use cases of a feature toggle.
I think I will also mostly stop using the toggle_category entries: this field is already considered optional by OEP-17. I think I'll want to use it to group together feature toggles that obviously belong to the same category, such as ORA2 for instance.

Thus, my PR on OEP 17 would get rid of the toggle_use_cases field. Would that make sense @nasthagiri? If you disagree, then I am not going to insist very much on pushing that change, as it's mostly edX who has interest in this particular annotation. If you think this topic mandates a discussion then I'll schedule a call.

nasthagiri · 2020-08-26T23:24:26Z

@regisb I spoke to @robrap about this today.

I agree with the removal of the toggle_category field.
Let's add a new field to specify whether the toggle is temporary or permanent. For forward compatibility, use a non-boolean field such as toggle_life_expectancy. For now, the values can be temporary and permanent.
If toggle_life_expectancy is present, then expect to have a toggle_use_case value to explain why this permanent toggle needs to continue to be in the codebase (adding to our long-term code complexity).

#3 is important for us since we need to hold our teams accountable for removing old toggles. Allowing them to just mark a toggle as permanent (if we only do #2) can be too easy and won't meet our objectives. Having them reference OEP-17 to look up (and update if needed) the permanent type of use cases helps them be intentional in their design process.

Also, to give you an idea of the "expected timelines" for the different use cases, here's what I had sent to @elhoyos several months ago:

Use Case 1: Incremental Release (expected timeline: 3 months)
Use Case 2: Launch Date (expected timeline: 3 months)
Use Case 3: Ops - Monitored Rollout (expected timeline: 3 months)
Use Case 4: Ops - Graceful Degradation (expected timeline: 5 years)
Use Case 5: Beta Testing (expected timeline: 6 months)
Use Case 6: VIP / White Label (expected timeline: 5 years)
Use Case 7: Opt-out (expected timeline: 2 years)
Use Case 8: Open edX option (expected timeline: 3 years)

lms/djangoapps/course_api/blocks/toggles.py

regisb · 2020-08-31T13:28:55Z

@nasthagiri:

I agree with the removal of the toggle_category field.
Let's add a new field to specify whether the toggle is temporary or permanent. For forward compatibility, use a non-boolean field such as toggle_life_expectancy. For now, the values can be temporary and permanent.

👌
Should I write an ADR for this?

If toggle_life_expectancy is present, then expect to have a toggle_use_case value to explain why this permanent toggle needs to continue to be in the codebase (adding to our long-term code complexity).

Should toggle_use_case be added also when toggle_life_expectancy=temporary, or only when toggle_life_expectancy=permanent?

Unfortunately there is currently no mechanism to enforce the presence of a field conditionally on another: https://code-annotations.readthedocs.io/en/latest/configuration.html
However, what I can do is patch the feature toggles Sphinx extension that generates the human-readable docs: that way, when someone adds a feature toggle with toggle_life_expectancy=permanent but without toggle_use_case then a warning will be added to the docs, and possibly in the console as well.

nasthagiri · 2020-08-31T13:36:47Z

Should I write an ADR for this?

Yes. Either an ADR or update the OEP if the OEP itself is now invalid.

Should toggle_use_case be added also when toggle_life_expectancy=temporary, or only when toggle_life_expectancy=permanent?

I'd like to require toggle_use_case when toggle_life_expectancy=permanent. Otherwise, it can be optional.

warning will be added to the docs, and possibly in the console as well

When would this issue be detected? Do we generate the doc within the CI process of the PR that introduced the toggle? Or would this be in a separate CI process?

I ask since I'm wondering how tight the feedback loop would be for the developers to correct their toggle annotations.

regisb · 2020-08-31T15:54:27Z

When would this issue be detected? Do we generate the doc within the CI process of the PR that introduced the toggle? Or would this be in a separate CI process?

If we want to have a very short feedback loop, I can write an ad-hoc linting script that will check the generated yaml feature toggle report. This script could even be moved to a dedicated unit test. But running this script/test would have to be done separately for each repository, assuming we add feature toggles to different repos. The alternative is that issues would only be detected once a human looks at the generated docs.

robrap · 2020-08-31T16:14:37Z

@regisb @nasthagiri: Some additional thoughts:

First, let me summarize what I think is temporary vs permanent based on the expected timeline and the OEP.

Temporary:

Use Case 1: Incremental Release (expected timeline: 3 months)
Use Case 2: Launch Date (expected timeline: 3 months)
Use Case 3: Ops - Monitored Rollout (expected timeline: 3 months)
Use Case 5: Beta Testing (expected timeline: 6 months)

(Semi-)Permanent:

Use Case 4: Ops - Graceful Degradation (expected timeline: 5 years)
Use Case 6: VIP / White Label (expected timeline: 5 years)
Use Case 7: Opt-out (expected timeline: 2 years)
Use Case 8: Open edX option (expected timeline: 3 years)

Also, based on reading the docs, I'd like to propose the following:

Proposed Renames:

"Ops - Graceful Degradation" => "Ops - Circuit Breaker": I think this name better matches the use case. I think of "graceful degradation" as something that is built in and probably wouldn't require a toggle, and feels more misleading.
"Opt-out" => "Opt-out/Opt-in": Just want to add "Opt-in" to the name to better match the options provided in the annotations.

I don't think code-annotations supports blanks for annotations with choices. Some possible options we have:

Add an allow_blank option to annotation configuration to allow people to skip toggle_use_cases for temporary use cases, and keep life expectancy as a new toggle.
Add temporary as an additional toggle_use_cases choice that someone can use in place of any temporary use-cases. The new list of choices would be temporary, incremental_release, launch_date, monitored_rollout, circuit_breaker, beta_testing, vip, opt_out, opt_in, open_edx.
Replace all temporary use_case choices with a single temporary choice. The new list of choices would be temporary, circuit_breaker, vip, opt_out, opt_in, open_edx

Note: the second and third options don't require separate linting to ensure permanent use cases are added, and don't require a code-annotations enhancement to support blanks with choices fields.

I'd vote for option 2, as long as our how_to documentation helps keep this simple. Otherwise, option 3 keeps the options simplest.

Also, for all options above, we could consider adding a prefix to the use cases, like temp_incremental_release, temp_launch_date, ..., permanent_circuit_breaker, permanent_vip, etc. The names are longer, but it helps more quickly categorize the use cases.

Additionally, we should make the following documentation updates:

The OEP could use the terms temporary and permanent or semi-permanent in the descriptions. The "Proposed Renames" above should also happen. Would it be helpful to add expected lifespans to the OEP?
We need a how_to for documenting toggles that provides example annotations, describes how to document temporary vs permanent toggles, and points to the OEP as needed for additional details of the use cases. This doc either needs to live in edx-toggles, or edx-toggles should point to this doc.
The feature_toggle_annotations.yaml should have a comment pointing to the how_to for usage.

nasthagiri · 2020-08-31T20:58:04Z

+1 to all of the above remarks.

robrap · 2020-08-31T21:15:22Z

@regisb: Do you feel comfortable starting a how_to PR in edx-toggles for annotating? We can hash out any remaining issues in the documentation, and then follow-up with changes to the yml and OEP (in parallel or serially as you see fit).

regisb · 2020-09-01T07:36:22Z

@robrap

Do you feel comfortable starting a how_to PR in edx-toggles for annotating? We can hash out any remaining issues in the documentation, and then follow-up with changes to the yml and OEP (in parallel or serially as you see fit).

Sure. I'll open the following PRs:

On edx-toggles to create a howto
On code-annotations to update the yaml configuration
On open-edx-proposals to amend OEP 17

regisb · 2020-09-01T12:46:02Z

@robrap I like your option 3:

Replace all temporary use_case choices with a single temporary choice. The new list of choices would be temporary, circuit_breaker, vip, opt_out, opt_in, open_edx

The reason why I like this approach is that:

developers who document a feature toggle might not have any idea how the toggle is going to be enabled/disabled by the ops team.
Also, the way it is deployed should not have an impact on the code.
Finally, non-edX contributors don't have a clue either how the toggle is going to be deployed in other enviironments.

If we go with this approach, I'd suggest getting rid of the toggle_life_expectancy entirely: after all, if toggle_use_case is not temporary, then it means that the life expectancy is permanent, right? It would spare us from adding allow_blank properties to the choice fields. @nasthagiri what do you think?

Sorry if my explanations are confusing -- maybe we can schedule a call to clarify things?

nasthagiri · 2020-09-01T13:29:01Z

@regisb Yes, completely agree. @robrap's option eliminates the need to have a separate toggle_life_expectancy.

This takes advantage of the new multiline annotation format with single-line comment prefix, from code-annotations.

Since code-annotations==0.7.0, this annotation is not used anymore.

Since code-annotations==0.7.0, incremental_release, launch_date, monitored_rollout, graceful_degradation, beta_testing are all considered as "temporary" use cases.

This annotation is deprecated since code-annotations==0.7.0

This is part of the changes brought by code-annotations==0.7.0

regisb · 2020-09-16T13:23:13Z

@feanil I rebased and removed the "WIP" label -- as far as I'm concerned, this is ready to merge.

feanil · 2020-09-17T15:30:31Z

Failing tests:

jenkins run python
jenkins run py38 js

edx-status-bot · 2020-09-17T15:51:25Z

Your PR has finished running tests. There were no failures.

feanil · 2020-09-17T17:45:02Z

PR for the flaky python test is here: https://github.com/edx/edx-platform/pull/25009

edx-pipeline-bot · 2020-09-17T18:42:24Z

EdX Release Notice: This PR has been deployed to the staging environment in preparation for a release to production.

edx-pipeline-bot · 2020-09-17T19:05:47Z

EdX Release Notice: This PR has been deployed to the production environment.

edx-pipeline-bot · 2020-09-17T19:14:05Z

EdX Release Notice: This PR has been deployed to the staging environment in preparation for a release to production.

edx-pipeline-bot · 2020-09-17T19:44:29Z

EdX Release Notice: This PR has been deployed to the production environment.

openedx-webhooks added blended PR is managed through 2U's blended developmnt program needs triage labels Aug 19, 2020

regisb force-pushed the regisb/document-feature-toggles branch from b0fe52e to 9e2cf7d Compare August 21, 2020 22:36

regisb mentioned this pull request Aug 27, 2020

[BD-21] Multiline annotations with single-line comment prefix ("#") openedx/code-annotations#48

Merged

10 tasks

robrap reviewed Aug 27, 2020

View reviewed changes

lms/djangoapps/course_api/blocks/toggles.py Outdated Show resolved Hide resolved

regisb force-pushed the regisb/document-feature-toggles branch 6 times, most recently from 41a744d to f7fd5d7 Compare August 27, 2020 16:02

regisb mentioned this pull request Sep 2, 2020

[BD-21] Simplify feature toggle annotation configuration openedx/code-annotations#49

Merged

11 tasks

regisb added 15 commits September 16, 2020 15:14

Document openedx/core/djangoapps/user_authn feature toggles

c8892d3

Document openedx/core/djangoapps/video_config feature toggles

8aa049b

Document openedx/core/djangoapps/video_pipeline feature toggles

39a3770

Document lms/djangoapps/course_api feature toggles

c8911b6

Document lms/djangoapps/courseware feature toggles

d34aa1c

Wrap toggle annotation lines with multiline comments

7dc460d

This takes advantage of the new multiline annotation format with single-line comment prefix, from code-annotations.

Document lms/djangoapps/grades feature toggles

9445bbe

Fix deprecated toggle annoation format

0c3bc12

Get rid of the toggle_category annotation, now deprecated

ab0e214

Since code-annotations==0.7.0, this annotation is not used anymore.

Simplify the toggle_use_case annotation

d1f9e76

Since code-annotations==0.7.0, incremental_release, launch_date, monitored_rollout, graceful_degradation, beta_testing are all considered as "temporary" use cases.

Remove deprecated toggle_status annotation

98a13d6

This annotation is deprecated since code-annotations==0.7.0

Rename toggle_expiration_date to toggle_target_removal_date

7d93715

This is part of the changes brought by code-annotations==0.7.0

Fix indentation of waffle flag annotation

e4fc7e8

Fix unused imports in video pipeline

fee9279

Clarify many feature toggle annotations across all applications

a4ba4ae

regisb force-pushed the regisb/document-feature-toggles branch from c8f77ca to a4ba4ae Compare September 16, 2020 13:20

regisb changed the title ~~[BD-21] Document feature toggles (WIP)~~ [BD-21] Document feature toggles Sep 16, 2020

feanil approved these changes Sep 17, 2020

View reviewed changes

feanil merged commit 90bc2ff into openedx:master Sep 17, 2020

openedx-webhooks added merged and removed needs triage labels Sep 17, 2020

regisb deleted the regisb/document-feature-toggles branch October 5, 2020 07:36

[BD-21] Document feature toggles #24815

[BD-21] Document feature toggles #24815

Uh oh!

Conversation

regisb commented Aug 19, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

openedx-webhooks commented Aug 19, 2020

Uh oh!

robrap commented Aug 19, 2020

Uh oh!

regisb commented Aug 21, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

robrap commented Aug 21, 2020

Uh oh!

regisb commented Aug 21, 2020

Uh oh!

robrap commented Aug 21, 2020

Uh oh!

nasthagiri commented Aug 23, 2020

Uh oh!

regisb commented Aug 25, 2020

Uh oh!

nasthagiri commented Aug 26, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

regisb commented Aug 31, 2020

Uh oh!

nasthagiri commented Aug 31, 2020

Uh oh!

regisb commented Aug 31, 2020

Uh oh!

robrap commented Aug 31, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nasthagiri commented Aug 31, 2020

Uh oh!

robrap commented Aug 31, 2020

Uh oh!

regisb commented Sep 1, 2020

Uh oh!

regisb commented Sep 1, 2020

Uh oh!

nasthagiri commented Sep 1, 2020

Uh oh!

regisb commented Sep 16, 2020

Uh oh!

feanil commented Sep 17, 2020

Uh oh!

edx-status-bot commented Sep 17, 2020

Uh oh!

feanil commented Sep 17, 2020

Uh oh!

edx-pipeline-bot commented Sep 17, 2020

Uh oh!

edx-pipeline-bot commented Sep 17, 2020

Uh oh!

edx-pipeline-bot commented Sep 17, 2020

Uh oh!

edx-pipeline-bot commented Sep 17, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

8 participants

regisb commented Aug 19, 2020 •

edited

Loading

regisb commented Aug 21, 2020 •

edited

Loading

nasthagiri commented Aug 26, 2020 •

edited

Loading

robrap commented Aug 31, 2020 •

edited

Loading