Reinforcement Learning for Online Ad Serving with Multi-Armed Bandits

Section 1: Business and Problem Understanding

  1. Introduction to online advertising
  2. Download Resources
  3. Ad Channels
  4. Overview of Multi-Armed Bandits (MAB)
  5. Regret (revision)
  6. Exploitation vs Exploration
  7. Problem Statement

Section 2: Implement Strategies

  1. Random selection approach
  2. Upper Confidence Bound (UCB1) Strategy

Section 3: Thompson Sampling

  1. Beta Distribution
  2. Understand and implement
  3. Greedy and Epsilon-Greedy

Section 4: Practice and Self Assessment

  1. Practice and Self Assessment