[UFC] Add bandit to SDK by schmit · Pull Request #38 · Eppo-exp/python-sdk

schmit · 2024-05-15T23:52:00Z

Motivation and Context

Adds contextual bandits to the Python SDK

Description

How has this been tested?

Added unit tests
Added generic tests that pass for this SDK

schmit · 2024-05-28T23:04:38Z

eppo_client/client.py

+            return BanditResult(variation, None)
+
+        # for now, assume that the variation is equal to the bandit key
+        bandit_data = self.__config_requestor.get_bandit_model(variation)


Note: here we implicitly assume that the variation is also the bandit key, and we look up the bandit without verifying that it is attached to this particular feature flag.

This can lead to a weird edge case: if the variation is not a bandit for this flag but is a bandit key for another flag, the client will evaluate the other flag's bandit.

There are some workarounds, but I haven't settled on one I feel good about yet.

I think this is an OK assumption for now. I look forward to the day customers have so many bandits at play that this becomes an issue 📈
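One possible workaround, sketched here under assumptions rather than taken from the PR, is to keep a mapping from each flag to the bandit keys attached to it and check it before the lookup. The names `FLAG_BANDITS` and `resolve_bandit_key` are hypothetical and do not exist in the SDK.

```python
from typing import Dict, Optional, Set

# Hypothetical mapping from flag key to the bandit keys attached to that flag.
# The PR does not define such a structure; it is assumed here for illustration.
FLAG_BANDITS: Dict[str, Set[str]] = {
    "flag-a": {"bandit-a"},
    "flag-b": {"bandit-b"},
}


def resolve_bandit_key(flag_key: str, variation: str) -> Optional[str]:
    """Treat the variation as a bandit key only if it is attached to this flag."""
    if variation in FLAG_BANDITS.get(flag_key, set()):
        return variation
    # The variation may still be a bandit key for another flag; ignore it here.
    return None
```

With this check, a variation named `bandit-b` returned for `flag-a` would be handled as a plain variation instead of triggering the bandit that belongs to `flag-b`.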

schmit · 2024-05-28T23:17:30Z

eppo_client/client.py

+    def get_bandit_action(
+        self,
+        flag_key: str,
+        subject_key: str,
+        subject_attributes: Attributes,
+        actions_with_contexts: List[ActionContext],
+        default: str,
+    ) -> BanditResult:


This is the main function; please make sure you agree with the choices I have made:

adding a default argument (think of this as a variation, not an action!)

returning a BanditResult object that contains both variation and action (to cleanly separate whether the bandit has taken an action, or the choice is up to the user)

using the types Attributes and ActionContext

I think this is fine to try, and see what customers think.

As Mike Tyson says, "Everybody has a plan until they get punched in the face"

Added the to_string method so people can have both options
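A minimal sketch of what such a result type could look like. The two positional fields follow the `BanditResult(variation, None)` call in the diff; the behavior of `to_string` (prefer the action, fall back to the variation) is an assumption, not confirmed by this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BanditResult:
    # Variation assigned by the flag (or the caller's default).
    variation: str
    # Action chosen by the bandit; None when no bandit was evaluated.
    action: Optional[str]

    def to_string(self) -> str:
        # Assumed semantics: return the action if the bandit chose one,
        # otherwise fall back to the variation.
        return self.action if self.action is not None else self.variation
```

Callers that need to know whether the bandit acted can check `result.action is None`; callers that only want a single string can use `result.to_string()`.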

schmit · 2024-05-28T23:25:03Z

eppo_client/assignment_logger.py

+    def log_bandit_action(self, bandit_event: Dict):
+        pass


@aarsilv in particular: I added the log_bandit_action method to the assignment logger for now. It seems like the simplest solution, but I'm happy to adjust.

I think that's fine! I don't know Python well, but the main goal is that only customers using bandits will need to implement or worry about this.
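A small sketch of how the no-op default supports that goal. The `AssignmentLogger` class below is a local stand-in that mirrors the two methods shown in the diff, not the SDK's actual class; `FlagOnlyLogger` and `BanditAwareLogger` are hypothetical customer loggers.

```python
from typing import Dict, List


class AssignmentLogger:
    """Stand-in for the SDK base class: both hooks default to no-ops."""

    def log_assignment(self, assignment_event: Dict):
        pass

    def log_bandit_action(self, bandit_event: Dict):
        pass


class FlagOnlyLogger(AssignmentLogger):
    """A customer that does not use bandits only overrides log_assignment."""

    def __init__(self) -> None:
        self.assignments: List[Dict] = []

    def log_assignment(self, assignment_event: Dict):
        self.assignments.append(assignment_event)


class BanditAwareLogger(FlagOnlyLogger):
    """A customer that uses bandits additionally overrides log_bandit_action."""

    def __init__(self) -> None:
        super().__init__()
        self.bandit_events: List[Dict] = []

    def log_bandit_action(self, bandit_event: Dict):
        self.bandit_events.append(bandit_event)
```

Because the base method does nothing, existing loggers keep working unchanged when the SDK starts emitting bandit events.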

giorgiomartini0

Monumental effort! First pass; left a few comments and questions.

eppo_client/bandit.py

eppo_client/configuration_requestor.py

test/bandit_test.py

eppo_client/bandit.py

giorgiomartini0

Updated logic looks good!

aarsilv

Excellent work bringing bandits to the Python SDK 💪 💪 💪

I left some comments--all minor--highlighting slight differences from the Java SDK for consideration. I imagine the SDKs will be a bit fluid as we spread bandits out and as customers start to use them (or, ideally, we dogfood them harder).

aarsilv · 2024-05-30T03:02:01Z

eppo_client/assignment_logger.py

+    def log_bandit_action(self, bandit_event: Dict):
+        pass


I think that's fine! I don't know Python well, but the main goal is that only customers using bandits will need to implement or worry about this.

aarsilv · 2024-05-30T03:10:16Z

eppo_client/bandit.py

+                        bandit_model.coefficients[action_context.action_key],
+                    )
+                    if action_context.action_key in bandit_model.coefficients
+                    else bandit_model.default_action_score


aarsilv · 2024-05-30T03:17:33Z

eppo_client/bandit.py

+        best_action, best_score = max(action_scores, key=lambda t: t[1])
+
+        # adjust probability floor for number of actions to control the sum
+        min_probability = probability_floor / number_of_actions


In Java, the probability floor is coded as an absolute floor, irrespective of the number of actions (src). Either way, it has its issues, but dividing it by the number of actions seems safer if you'd like that to become the standard.

Yup, I think this is safer; otherwise there is no good way to set the probability floor in general. Suppose one bandit gets called with thousands of actions and another with 5; what probability floor should we use if we don't normalize?
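The arithmetic behind this concern can be sketched as follows. The helper `total_floor_mass` is hypothetical and only illustrates how much probability the floor alone would claim across all actions under each convention.

```python
def total_floor_mass(
    probability_floor: float, number_of_actions: int, normalize: bool
) -> float:
    """Total probability claimed if every action received only its floor."""
    if normalize:
        # Python SDK convention in this PR: the floor is split across actions.
        per_action = probability_floor / number_of_actions
    else:
        # Absolute convention: every action gets the full floor.
        per_action = probability_floor
    return per_action * number_of_actions


# With an absolute floor of 0.01, 5 actions are fine, but 1000 actions would
# need more than the total available probability of 1.
absolute_small = total_floor_mass(0.01, 5, normalize=False)
absolute_large = total_floor_mass(0.01, 1000, normalize=False)
# With normalization, the total claimed mass stays at the floor itself.
normalized_large = total_floor_mass(0.01, 1000, normalize=True)
```

Under normalization the floor can be read as the total share of traffic reserved for exploration, independent of how many actions a caller passes in.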

aarsilv · 2024-05-30T03:21:08Z

eppo_client/bandit.py

+                    min_probability,
+                    1.0 / (number_of_actions + gamma * (best_score - score)),
+                ),
+            )


In Java, we'd do the extra step of rounding to the shard space (e.g., the closest ten-thousandth) to keep weights consistent across programming languages that may have different decimal number implementations under the hood. (src)

Hmm, I figure they all use the same standard for doubles, right? And even if not, the precision should be much better than 1/10000. Going to leave as is for now, but happy to adjust in a future PR.

Yeah you're right, all our core languages use IEEE 754 so should be good 🤞
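For reference, the rounding step described for Java could look roughly like this in Python. It is not part of this PR; the constant `TOTAL_SHARDS` and the function name are assumptions based on the "closest ten-thousandth" wording.

```python
# Assumed shard count, taken from the "closest ten-thousandth" description.
TOTAL_SHARDS = 10_000


def round_to_shard_space(weight: float, total_shards: int = TOTAL_SHARDS) -> float:
    """Snap a weight to the nearest multiple of 1 / total_shards."""
    return round(weight * total_shards) / total_shards
```

Snapping weights to a fixed grid would make the result depend only on an integer shard count, which is the cross-language consistency the Java implementation is after.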

aarsilv · 2024-05-30T03:22:30Z

eppo_client/bandit.py

+        ]
+
+        # remaining weight goes to best action
+        remaining_weight = 1.0 - sum(weight for _, weight in weights)


Because of the minimum probability floor, we may want to defensively bound this to 0.0 so we don't end up with a negative remaining weight (src).

Great call!

Note though that the weight of each non-optimal action is at most 1/K, so the K-1 non-optimal actions take at most (K-1)/K in total and there is always at least 1/K left for the optimal action (as long as the probability floor is < 1).
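Putting the fragments from this review together, the weighting step can be sketched as below. This is a reconstruction from the diff hunks shown in the thread, not the merged code: the function name, signature, and dictionary return type are assumptions, and it assumes unique action keys, at least one action, and `gamma >= 0`.

```python
from typing import Dict, List, Tuple


def weigh_actions(
    action_scores: List[Tuple[str, float]],
    gamma: float,
    probability_floor: float,
) -> Dict[str, float]:
    number_of_actions = len(action_scores)
    best_action, best_score = max(action_scores, key=lambda t: t[1])

    # adjust probability floor for number of actions to control the sum
    min_probability = probability_floor / number_of_actions

    # every non-best action gets at most 1 / number_of_actions
    weights = {
        action: max(
            min_probability,
            1.0 / (number_of_actions + gamma * (best_score - score)),
        )
        for action, score in action_scores
        if action != best_action
    }

    # remaining weight goes to best action, defensively bounded at 0.0
    weights[best_action] = max(0.0, 1.0 - sum(weights.values()))
    return weights
```

With three actions, for example, each non-best weight is at most 1/3, so the best action always keeps at least 1/3 and the `max(0.0, ...)` bound only acts as a safeguard.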

aarsilv · 2024-05-30T03:53:54Z

test/client_bandit_test.py

+# Note: contains tests for client.py related to bandits to avoid
+# making client_test.py too long.


I like this!

aarsilv · 2024-05-30T03:55:48Z

test/client_bandit_test.py

+    def log_assignment(self, assignment_event: Dict):
+        print(f"Assignment Event: {assignment_event}")
+
+    def log_bandit_action(self, bandit_event: Dict):
+        print(f"Bandit Event: {bandit_event}")


Minor, but we may want one test that verifies the correct data gets passed as the event.
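Such a test could be sketched with `unittest.mock`. The function `assign_and_log` is a hypothetical stand-in for the client call, and the event field names (`flagKey`, `subject`, `action`) are assumptions, since the thread does not show the event schema.

```python
from unittest.mock import MagicMock


def assign_and_log(logger, flag_key: str, subject_key: str, action: str) -> None:
    """Hypothetical stand-in for the client call that emits a bandit event."""
    logger.log_bandit_action(
        {"flagKey": flag_key, "subject": subject_key, "action": action}
    )


def test_bandit_event_contents() -> None:
    logger = MagicMock()
    assign_and_log(logger, "banner", "user-1", "nike")

    # The logger should be called exactly once with a single event dict.
    logger.log_bandit_action.assert_called_once()
    event = logger.log_bandit_action.call_args.args[0]
    assert event["flagKey"] == "banner"
    assert event["subject"] == "user-1"
    assert event["action"] == "nike"
```

In the real test suite the mock would be passed as the `assignment_logger` in the client configuration instead of calling a stand-in function.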

aarsilv · 2024-05-30T03:56:18Z

test/client_bandit_test.py

+            assignment_logger=AssignmentLogger(),
+        )
+    )
+    sleep(0.1)  # wait for initialization


Otherwise the test for whether the client is initialized is run before initialization
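A fixed sleep works, but a polling helper is one alternative that avoids depending on timing. The helper `wait_until` below is hypothetical, and the `client.is_initialized()` call in the comment is an assumed method name for the initialization check.

```python
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool], timeout: float = 2.0, interval: float = 0.01
) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    # One final check so a condition that turns True at the deadline counts.
    return condition()


# Assumed usage in a test fixture (method name not confirmed by this thread):
# assert wait_until(lambda: client.is_initialized())
```

This returns as soon as the condition holds, so tests stay fast on quick machines and still pass on slow ones.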

aarsilv · 2024-05-30T03:56:31Z

test/client_bandit_test.py

+    print(client.get_flag_keys())
+    print(client.get_bandit_keys())


still want these prints?

aarsilv · 2024-05-30T03:57:11Z

test/client_bandit_test.py

+@pytest.mark.parametrize("test_case", test_data)
+def test_bandit_generic_test_cases(test_case):


💪 💪 💪

wip

abfec66

schmit changed the title ~~[UFC] Add bandit to SDK (wip)~~ [UFC] Add bandit to SDK May 15, 2024

Sven Schmit added 3 commits May 22, 2024 17:07

wip

edaa513

wip

a5837eb

more tests

d4b9d98

schmit commented May 28, 2024

schmit requested review from aarsilv, giorgiomartini0 and leoromanovsky May 28, 2024 23:05

Sven Schmit added 2 commits May 28, 2024 16:15

🧹

c7be8fc

bump version

11c18f9

schmit commented May 28, 2024

schmit marked this pull request as ready for review May 28, 2024 23:22

schmit commented May 28, 2024

giorgiomartini0 reviewed May 29, 2024

Sven Schmit added 2 commits May 29, 2024 13:43

address Giorgio's comments

ccc4671

add comment to probability floor test

064a018

giorgiomartini0 approved these changes May 29, 2024

🧹

9930798

aarsilv approved these changes May 30, 2024

aarsilv assigned schmit May 30, 2024

schmit assigned aarsilv and unassigned schmit May 30, 2024

aarsilv assigned schmit and unassigned aarsilv May 30, 2024

address Aaron's comments

0e3d6b5

schmit merged commit 8b79f1f into main May 30, 2024

schmit deleted the sps/bandit branch May 30, 2024 05:11
