pyssed

bandit.Bandit

bandit.Bandit()

An abstract class for bandit algorithms used in the MAD algorithm.

Each bandit algorithm that inherits from this class must implement all the abstract methods defined in this class.

Notes

See the detailed method documentation for in-depth explanations.
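The requirement that subclasses implement every abstract method can be illustrated with a small sketch. It assumes the class relies on Python's standard `abc` machinery; `BanditSketch`, `IncompleteBandit`, and `can_instantiate` are hypothetical stand-ins written for this example and are not part of the library.

```python
from abc import ABC, abstractmethod
from typing import Dict


class BanditSketch(ABC):
    """Hypothetical stand-in mirroring the five documented methods."""

    @abstractmethod
    def control(self) -> int: ...

    @abstractmethod
    def k(self) -> int: ...

    @abstractmethod
    def probabilities(self) -> Dict[int, float]: ...

    @abstractmethod
    def reward(self, arm: int): ...

    @abstractmethod
    def t(self) -> int: ...


class IncompleteBandit(BanditSketch):
    """Implements only two of the five methods."""

    def control(self) -> int:
        return 0

    def k(self) -> int:
        return 3


def can_instantiate(cls) -> bool:
    """Return True if cls() succeeds, False if abstract methods remain."""
    try:
        cls()
    except TypeError:
        # Python refuses to instantiate a class with unimplemented
        # abstract methods and raises TypeError instead.
        return False
    return True


# Names of the methods IncompleteBandit still has to implement.
missing = sorted(IncompleteBandit.__abstractmethods__)
```

Under these assumptions, `can_instantiate(IncompleteBandit)` is `False` until `probabilities`, `reward`, and `t` are also defined.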

Methods

Name Description
control Get the index of the bandit control arm.
k Get the number of bandit arms.
probabilities Calculate bandit arm assignment probabilities.
reward Calculate the reward for a selected bandit arm.
t Get the current time step of the bandit.

control

bandit.Bandit.control()

Get the index of the bandit control arm.

Returns

Name Type Description
int The index of the arm that is the control arm. E.g. if the bandit is a 3-arm bandit with the first arm being the control arm, this should return the value 0.

k

bandit.Bandit.k()

Get the number of bandit arms.

Returns

Name Type Description
int The number of arms in the bandit.

probabilities

bandit.Bandit.probabilities()

Calculate bandit arm assignment probabilities.

Returns

Name Type Description
Dict[int, float] A dictionary where keys are arm indices and values are the corresponding probabilities. For example, if the bandit algorithm is UCB with three arms, and the third arm has the maximum confidence bound, then this should return the following dictionary: {0: 0., 1: 0., 2: 1.}, since UCB is deterministic.
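The UCB case described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the function name `ucb_probabilities`, its arguments, and the UCB1-style bonus term are assumptions chosen for the example.

```python
import math
from typing import Dict, List


def ucb_probabilities(means: List[float], pulls: List[int], t: int) -> Dict[int, float]:
    """One-hot assignment probabilities for a UCB1-style rule.

    Assumes every arm has been pulled at least once and t >= 1.
    """
    # Upper confidence bound per arm: sample mean plus exploration bonus.
    bounds = [
        mean + math.sqrt(2.0 * math.log(t) / n)
        for mean, n in zip(means, pulls)
    ]
    # The arm with the largest bound is selected with probability 1
    # (ties go to the lowest index).
    best = max(range(len(bounds)), key=bounds.__getitem__)
    return {arm: (1.0 if arm == best else 0.0) for arm in range(len(bounds))}


# Three arms with equal pull counts, so the highest mean (arm 2) wins.
probs = ucb_probabilities(means=[0.2, 0.3, 0.6], pulls=[10, 10, 10], t=31)
# probs == {0: 0.0, 1: 0.0, 2: 1.0}
```

A randomized algorithm such as Thompson sampling would instead return non-degenerate probabilities that sum to 1 across the arms.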

reward

bandit.Bandit.reward(arm)

Calculate the reward for a selected bandit arm.

Parameters

Name Type Description Default
arm int The index of the selected bandit arm. required

Returns

Name Type Description
Reward The resulting Reward containing any individual-level covariates and the observed reward.
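A reward function of this shape might look like the sketch below. `RewardSketch` is a hypothetical stand-in for the library's Reward container, and its `outcome` and `covariates` fields, the success rates, and the `age` covariate are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class RewardSketch:
    """Hypothetical stand-in for the library's Reward container."""

    outcome: float
    covariates: Optional[Dict[str, Any]] = None


# Hypothetical true success rates for a 3-arm Bernoulli bandit.
SUCCESS_RATES = {0: 0.5, 1: 0.6, 2: 0.7}


def reward(arm: int, rng: random.Random) -> RewardSketch:
    """Draw a Bernoulli outcome for `arm` and attach one example covariate."""
    outcome = 1.0 if rng.random() < SUCCESS_RATES[arm] else 0.0
    covariates = {"age": rng.randint(18, 65)}  # individual-level covariate
    return RewardSketch(outcome=outcome, covariates=covariates)


result = reward(arm=2, rng=random.Random(0))
```

Passing the random generator in explicitly keeps the draw reproducible, which can help when testing a bandit implementation.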

t

bandit.Bandit.t()

Get the current time step of the bandit.

This method returns the current time step of the bandit and then increments the internal time step by 1. For example, if the bandit has completed 9 iterations, the next call should return 10. Time steps start at 1, not 0. Because each call advances the counter, it should be called once per iteration.

Returns

Name Type Description
int The current time step.
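The return-then-increment behavior can be sketched with a small counter. `StepCounter` is a hypothetical helper that isolates only the time-step bookkeeping; a real subclass would keep this state alongside its other attributes.

```python
class StepCounter:
    """Hypothetical helper showing only the time-step bookkeeping."""

    def __init__(self) -> None:
        self._t = 1  # time steps start at 1, not 0

    def t(self) -> int:
        step = self._t  # value to report for this iteration
        self._t += 1    # advance so the next call reports the next step
        return step


counter = StepCounter()
steps = [counter.t() for _ in range(10)]
# steps == [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

After nine calls (nine completed iterations), the tenth call returns 10, matching the description above.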
 

Built by Daniel Molitor