Add Thompson Sampling to ReAgent MAB and refactor the UCB classes and methods to unify #565

Closed
alexnikulkov wants to merge 1 commit into facebookresearch:main from alexnikulkov:export-D31642370

@alexnikulkov
Contributor

Summary:

  1. Add 2 Thompson sampling MAB algorithms: 1 for Bernoulli rewards, 1 for Normal rewards
  2. Refactor UCB code so that Thompson sampling could reuse as much as possible

Differential Revision: D31642370
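
The two items in the summary above can be illustrated with a minimal sketch: a shared base class that keeps per-arm statistics, with UCB and the two Thompson sampling variants differing only in how they score arms. Everything here is an assumption made for illustration. The class names (`MABBase`, `UCB1`, `BernoulliThompson`, `NormalThompson`), the method names, the Beta(1, 1) prior, and the plug-in normal approximation are hypothetical and are not taken from the ReAgent code in this PR, which may use different priors and a different interface.

```python
import math
import random
from typing import List, Optional, Sequence


class MABBase:
    """Hypothetical shared base: stores per-arm statistics, picks the top-scoring arm."""

    def __init__(self, n_arms: int, seed: Optional[int] = None) -> None:
        self.n_arms = n_arms
        self.counts = [0] * n_arms      # number of pulls per arm
        self.sums = [0.0] * n_arms      # sum of rewards per arm
        self.sq_sums = [0.0] * n_arms   # sum of squared rewards per arm
        self.rng = random.Random(seed)  # private RNG for posterior sampling

    def add_observation(self, arm: int, reward: float) -> None:
        self.counts[arm] += 1
        self.sums[arm] += reward
        self.sq_sums[arm] += reward * reward

    def get_scores(self) -> List[float]:
        raise NotImplementedError  # each algorithm only overrides this

    def get_action(self) -> int:
        scores = self.get_scores()
        return max(range(self.n_arms), key=scores.__getitem__)


class UCB1(MABBase):
    """Classic UCB1 score: empirical mean plus an exploration bonus."""

    def get_scores(self) -> List[float]:
        total = sum(self.counts)
        return [
            math.inf if n == 0 else s / n + math.sqrt(2.0 * math.log(total) / n)
            for n, s in zip(self.counts, self.sums)
        ]


class BernoulliThompson(MABBase):
    """Thompson sampling for 0/1 rewards with an assumed Beta(1, 1) prior."""

    def get_scores(self) -> List[float]:
        # Posterior is Beta(1 + successes, 1 + failures); draw one sample per arm.
        return [
            self.rng.betavariate(1.0 + s, 1.0 + (n - s))
            for n, s in zip(self.counts, self.sums)
        ]


class NormalThompson(MABBase):
    """Simplified Thompson sampling for real-valued rewards.

    Assumption: samples the mean from Normal(sample_mean, sample_var / n),
    a plug-in approximation rather than a full conjugate posterior.
    """

    def get_scores(self) -> List[float]:
        scores = []
        for n, s, sq in zip(self.counts, self.sums, self.sq_sums):
            if n < 2:
                scores.append(math.inf)  # pull each arm twice before sampling
                continue
            mean = s / n
            var = max((sq - n * mean * mean) / (n - 1), 1e-12)
            scores.append(self.rng.gauss(mean, math.sqrt(var / n)))
        return scores


def simulate(bandit: MABBase, true_means: Sequence[float], steps: int, seed: int = 0) -> List[int]:
    """Run the bandit against Bernoulli arms and return the pull counts."""
    env = random.Random(seed)
    for _ in range(steps):
        arm = bandit.get_action()
        reward = 1.0 if env.random() < true_means[arm] else 0.0
        bandit.add_observation(arm, reward)
    return bandit.counts


if __name__ == "__main__":
    for cls in (UCB1, BernoulliThompson, NormalThompson):
        print(cls.__name__, simulate(cls(2, seed=1), [0.1, 0.9], steps=2000))
```

Because only `get_scores` differs between the classes, bookkeeping and arm selection are written once, which is the kind of reuse the second summary item describes.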

@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D31642370

@codecov-commenter

codecov-commenter commented Oct 15, 2021

Codecov Report

Merging #565 (70eea9b) into main (57b58a8) will increase coverage by 0.03%.
The diff coverage is 97.95%.

@@            Coverage Diff             @@
##             main     #565      +/-   ##
==========================================
+ Coverage   86.65%   86.69%   +0.03%     
==========================================
  Files         337      339       +2     
  Lines       20955    21004      +49     
  Branches       44       44              
==========================================
+ Hits        18159    18209      +50     
+ Misses       2770     2769       -1     
  Partials       26       26              
| Impacted Files | Coverage Δ |
|---|---|
| reagent/mab/mab_algorithm.py | 93.93% <93.93%> (ø) |
| reagent/mab/thompson_sampling.py | 97.87% <97.87%> (ø) |
| reagent/mab/ucb.py | 82.05% <100.00%> (-6.75%) ⬇️ |
| reagent/test/mab/test_mab.py | 100.00% <100.00%> (ø) |
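
The percentages in the coverage diff can be cross-checked from the Hits and Lines rows. The sketch below assumes that coverage is hits divided by lines and that the displayed figures are truncated (not rounded) to two decimals. That truncation rule is an inference from the numbers in this report, not something the report states, and the helper name `truncated_percent` is hypothetical.

```python
def truncated_percent(numerator: int, denominator: int) -> str:
    """Format numerator/denominator as a percentage truncated to two decimals."""
    basis_points = numerator * 10000 // denominator  # integer math avoids float error
    return f"{basis_points // 100}.{basis_points % 100:02d}%"


# Values copied from the Hits, Misses, Partials, and Lines rows of the coverage diff.
base_hits, base_misses, base_partials, base_lines = 18159, 2770, 26, 20955
head_hits, head_misses, head_partials, head_lines = 18209, 2769, 26, 21004

# Hits + misses + partials should account for every line.
base_consistent = base_hits + base_misses + base_partials == base_lines
head_consistent = head_hits + head_misses + head_partials == head_lines

base_cov = truncated_percent(base_hits, base_lines)
head_cov = truncated_percent(head_hits, head_lines)

# Change in coverage computed from the exact ratios, then truncated to 0.01%.
delta_bp = (
    (head_hits * base_lines - base_hits * head_lines) * 10000
    // (head_lines * base_lines)
)

print(base_cov, head_cov, f"+{delta_bp / 100:.2f}%", base_consistent, head_consistent)
```

Under these assumptions the computed values agree with the 86.65%, 86.69%, and +0.03% shown in the report. Note that subtracting the two truncated percentages would give 0.04%, so the reported change appears to be derived from the unrounded ratios.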

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 57b58a8...70eea9b.

… methods to unify (facebookresearch#565)

Summary:
Pull Request resolved: facebookresearch#565

1. Add 2 Thompson sampling MAB algorithms: 1 for Bernoulli rewards, 1 for Normal rewards
2. Refactor UCB code so that Thompson sampling could reuse as much as possible

Differential Revision: D31642370

fbshipit-source-id: f5f9e2227bef9b5caafee9c3894f494be0e9e1a5
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D31642370

@facebook-github-bot

This pull request has been merged in 471defa.

xuruiyang pushed a commit that referenced this pull request Sep 20, 2025
… methods to unify (#565)

Summary:
Pull Request resolved: #565

1. Add 2 Thompson sampling MAB algorithms: 1 for Bernoulli rewards, 1 for Normal rewards
2. Refactor UCB code so that Thompson sampling could reuse as much as possible

Reviewed By: czxttkl

Differential Revision: D31642370

fbshipit-source-id: c4447a22ad11e1bb9696cf269ea9f45523d22f28