Reinforcement Learning for Online Ad Serving with Multi-Armed Bandits
Section 1: Business and Problem Understanding
- Introduction to online advertising
- Download Resources
- Ad Channels
- Overview of Multi-Armed Bandits (MAB)
- Regret (revision)
- Exploitation vs Exploration
- Problem Statement
Section 2: Implement Strategies
- Random selection approach
- Upper Confidence Bound (UCB1) Strategy
Section 3: Thompson Sampling
- Beta Distribution
- Understand and implement
- Greedy and Epsilon-Greedy
Section 4: Practice and Self-Assessment
- Practice and Self-Assessment