Skip to content
This repository was archived by the owner on Nov 8, 2024. It is now read-only.

[UFC] Add bandit to SDK#38

Merged
schmit merged 10 commits intomainfrom
sps/bandit
May 30, 2024
Merged

[UFC] Add bandit to SDK#38
schmit merged 10 commits intomainfrom
sps/bandit

Conversation

@schmit
Copy link

@schmit schmit commented May 15, 2024

Motivation and Context

Adds contextual bandits to the Python SDK

Description

How has this been tested?

  • Added unit tests
  • Added generic tests that pass for this SDK

@schmit schmit changed the title [UFC] Add bandit to SDK (wip) [UFC] Add bandit to SDK May 15, 2024
return BanditResult(variation, None)

# for now, assume that the variation is equal to the bandit key
bandit_data = self.__config_requestor.get_bandit_model(variation)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: here we implicitly assume that the variation is also the bandit key, and will look up the bandit without verifying that this bandit is attached to this particular feature flag.

This can lead to a weird edge case:

  • the variation is not a bandit for this flag
  • but it is a bandit for another flag
    then it will evaluate the bandit for the other flag

There are some workarounds, but haven't settled on something I feel good about yet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is an ok assumption for now. I look forward to the day customers have so many bandits at play this becomes an issue 📈

Sven Schmit added 2 commits May 28, 2024 16:15
Comment on lines +224 to +231
def get_bandit_action(
self,
flag_key: str,
subject_key: str,
subject_attributes: Attributes,
actions_with_contexts: List[ActionContext],
default: str,
) -> BanditResult:
Copy link
Author

@schmit schmit May 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Main function here, please make sure you like the choices I have made here:

  • adding a default argument (think of this as variation, not action!)
  • returning a BanditResult object that contains both variation and action (to cleanly separate whether the bandit has taken an action, or the choice is up to user)
  • The types Attributes and ActionContext

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine to try, and see what customers think.

As Mike Tyson says, "Everybody has a plan until they get punched in the face"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the to_string method so people can have both options

@schmit schmit marked this pull request as ready for review May 28, 2024 23:22
Comment on lines +12 to +13
def log_bandit_action(self, bandit_event: Dict):
pass
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aarsilv in particular: added the log_bandit_action method to the assignment logger for now, it seems like the simplest solution but happy to adjust

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's fine! I don't know python well but the main goal will be that only customers using bandits will need to implement / worry about this

Copy link

@giorgiomartini0 giorgiomartini0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Monumental effort! First pass; left a few comments and questions.

Copy link

@giorgiomartini0 giorgiomartini0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated logic looks good!

Copy link
Contributor

@aarsilv aarsilv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent work bringing bandits to the Python SDK 💪 💪 💪

I left some comments--all minor--highlighting slight differences from the Java SDK for consideration. I imagine the SDKs will be a bit fluid as we spread bandits out and as customers start to use them (or, ideally, we dogfood them harder).

Comment on lines +12 to +13
def log_bandit_action(self, bandit_event: Dict):
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's fine! I don't know python well but the main goal will be that only customers using bandits will need to implement / worry about this

bandit_model.coefficients[action_context.action_key],
)
if action_context.action_key in bandit_model.coefficients
else bandit_model.default_action_score
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

best_action, best_score = max(action_scores, key=lambda t: t[1])

# adjust probability floor for number of actions to control the sum
min_probability = probability_floor / number_of_actions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way it's coded in Java is the probability floor is an absolute floor irrespective of the number of actions (src). Either way, it has its issues, but dividing it by the number of actions seems safer if you'd like that to become the standard.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup I think this is safer; otherwise there is no good way to set the probability floor generally. Suppose 1 bandit gets called with 1000s of actions, and the other one with 5; what probability floor should we be using if we don't normalize?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯

min_probability,
1.0 / (number_of_actions + gamma * (best_score - score)),
),
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In java, we'd do the extra step of rounding to the shard space (e.g., the closest ten thousandth) to keep weights consistent across programming languages that may have different decimal number implementations under the hood. (src)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I figure they all use the same standard for doubles right? and even if not the precision should be much better than 1/10000 -- going to leave as is for now but happy to adjust in a future PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah you're right, all our core languages use IEEE 754 so should be good 🤞

]

# remaining weight goes to best action
remaining_weight = 1.0 - sum(weight for _, weight in weights)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of the minimum probability floor, we may want to defensively bound this to 0.0 so we don't end up with a negative remaining weight (src).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great call!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note though that the weight of all non-optimal actions should be at most 1/K -- so there should always be 1/K weight left for the optimal action (as long as probability floor < 1)

Comment on lines +1 to +2
# Note: contains tests for client.py related to bandits to avoid
# making client_test.py too long.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this!

Comment on lines +38 to +42
def log_assignment(self, assignment_event: Dict):
print(f"Assignment Event: {assignment_event}")

def log_bandit_action(self, bandit_event: Dict):
print(f"Bandit Event: {bandit_event}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor but we may want one test that verifies the correct stuff gets passed as the vent

assignment_logger=AssignmentLogger(),
)
)
sleep(0.1) # wait for initialization
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤨

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise the test for whether the client is initialized is run before initialization

Comment on lines +72 to +73
print(client.get_flag_keys())
print(client.get_bandit_keys())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still want these prints?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope 🙇

Comment on lines +120 to +121
@pytest.mark.parametrize("test_case", test_data)
def test_bandit_generic_test_cases(test_case):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💪 💪 💪

@schmit schmit assigned aarsilv and unassigned schmit May 30, 2024
@aarsilv aarsilv assigned schmit and unassigned aarsilv May 30, 2024
@schmit schmit merged commit 8b79f1f into main May 30, 2024
@schmit schmit deleted the sps/bandit branch May 30, 2024 05:11
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants