Reinforcement Learning for Online Ad Serving with Multi-Armed Bandits

Section 1: Business and Problem Understanding

Introduction to online advertising
Download Resources
Ad Channels
Overview of Multi-Armed Bandits (MAB)
Regret (revision)
Exploitation vs Exploration
Problem Statement

Section 2: Implement Strategies

Random selection approach
Upper Confidence Bound (UCB1) Strategy

Section 3: Thompson Sampling

Beta Distribution
Understand and implement
Greedy and Epsilon-Greedy

Section 4: Practice and Self Assessment

Practice and Self Assessment
