Conversation
| return BanditResult(variation, None) | ||
|
|
||
| # for now, assume that the variation is equal to the bandit key | ||
| bandit_data = self.__config_requestor.get_bandit_model(variation) |
There was a problem hiding this comment.
Note: here we implicitly assume that the variation is also the bandit key, and will look up the bandit without verifying that this bandit is attached to this particular feature flag.
This can lead to a weird edge case:
- the variation is not a bandit for this flag
- but it is a bandit for another flag
then it will evaluate the bandit for the other flag
There are some workarounds, but haven't settled on something I feel good about yet.
There was a problem hiding this comment.
I think this is an ok assumption for now. I look forward to the day customers have so many bandits at play this becomes an issue 📈
| def get_bandit_action( | ||
| self, | ||
| flag_key: str, | ||
| subject_key: str, | ||
| subject_attributes: Attributes, | ||
| actions_with_contexts: List[ActionContext], | ||
| default: str, | ||
| ) -> BanditResult: |
There was a problem hiding this comment.
Main function here, please make sure you like the choices I have made here:
- adding a default argument (think of this as variation, not action!)
- returning a BanditResult object that contains both variation and action (to cleanly separate whether the bandit has taken an action, or the choice is up to user)
- The types
AttributesandActionContext
There was a problem hiding this comment.
I think this is fine to try, and see what customers think.
As Mike Tyson says, "Everybody has a plan until they get punched in the face"
There was a problem hiding this comment.
Added the to_string method so people can have both options
| def log_bandit_action(self, bandit_event: Dict): | ||
| pass |
There was a problem hiding this comment.
@aarsilv in particular: added the log_bandit_action method to the assignment logger for now, it seems like the simplest solution but happy to adjust
There was a problem hiding this comment.
I think that's fine! I don't know python well but the main goal will be that only customers using bandits will need to implement / worry about this
giorgiomartini0
left a comment
There was a problem hiding this comment.
Monumental effort! First pass; left a few comments and questions.
aarsilv
left a comment
There was a problem hiding this comment.
Excellent work bringing bandits to the Python SDK 💪 💪 💪
I left some comments--all minor--highlighting slight differences from the Java SDK for consideration. I imagine the SDKs will be a bit fluid as we spread bandits out and as customers start to use them (or, ideally, we dogfood them harder).
| def log_bandit_action(self, bandit_event: Dict): | ||
| pass |
There was a problem hiding this comment.
I think that's fine! I don't know python well but the main goal will be that only customers using bandits will need to implement / worry about this
| bandit_model.coefficients[action_context.action_key], | ||
| ) | ||
| if action_context.action_key in bandit_model.coefficients | ||
| else bandit_model.default_action_score |
| best_action, best_score = max(action_scores, key=lambda t: t[1]) | ||
|
|
||
| # adjust probability floor for number of actions to control the sum | ||
| min_probability = probability_floor / number_of_actions |
There was a problem hiding this comment.
The way it's coded in Java is the probability floor is an absolute floor irrespective of the number of actions (src). Either way, it has its issues, but dividing it by the number of actions seems safer if you'd like that to become the standard.
There was a problem hiding this comment.
Yup I think this is safer; otherwise there is no good way to set the probability floor generally. Suppose 1 bandit gets called with 1000s of actions, and the other one with 5; what probability floor should we be using if we don't normalize?
| min_probability, | ||
| 1.0 / (number_of_actions + gamma * (best_score - score)), | ||
| ), | ||
| ) |
There was a problem hiding this comment.
In java, we'd do the extra step of rounding to the shard space (e.g., the closest ten thousandth) to keep weights consistent across programming languages that may have different decimal number implementations under the hood. (src)
There was a problem hiding this comment.
Hmm I figure they all use the same standard for doubles right? and even if not the precision should be much better than 1/10000 -- going to leave as is for now but happy to adjust in a future PR
There was a problem hiding this comment.
Yeah you're right, all our core languages use IEEE 754 so should be good 🤞
eppo_client/bandit.py
Outdated
| ] | ||
|
|
||
| # remaining weight goes to best action | ||
| remaining_weight = 1.0 - sum(weight for _, weight in weights) |
There was a problem hiding this comment.
Because of the minimum probability floor, we may want to defensively bound this to 0.0 so we don't end up with a negative remaining weight (src).
There was a problem hiding this comment.
Note though that the weight of all non-optimal actions should be at most 1/K -- so there should always be 1/K weight left for the optimal action (as long as probability floor < 1)
| # Note: contains tests for client.py related to bandits to avoid | ||
| # making client_test.py too long. |
| def log_assignment(self, assignment_event: Dict): | ||
| print(f"Assignment Event: {assignment_event}") | ||
|
|
||
| def log_bandit_action(self, bandit_event: Dict): | ||
| print(f"Bandit Event: {bandit_event}") |
There was a problem hiding this comment.
Minor but we may want one test that verifies the correct stuff gets passed as the vent
| assignment_logger=AssignmentLogger(), | ||
| ) | ||
| ) | ||
| sleep(0.1) # wait for initialization |
There was a problem hiding this comment.
Otherwise the test for whether the client is initialized is run before initialization
test/client_bandit_test.py
Outdated
| print(client.get_flag_keys()) | ||
| print(client.get_bandit_keys()) |
| @pytest.mark.parametrize("test_case", test_data) | ||
| def test_bandit_generic_test_cases(test_case): |
Motivation and Context
Adds contextual bandits to the Python SDK
Description
How has this been tested?