Multi-armed bandit solver using Thompson Sampling for ad click-through rate (CTR) optimization.
Simulates an online ad-selection scenario where an algorithm must decide which of 10 ads to show in each round, learning from binary rewards (click / no click) to maximise total clicks over 10,000 rounds.
Thompson Sampling is a Bayesian approach to the exploration–exploitation trade-off:
- Model: each ad's unknown CTR is modelled with a Beta distribution, `Beta(α, β)`, where `α = successes + 1` and `β = failures + 1`.
- Sample: draw a random value from every ad's current Beta distribution.
- Select: pick the ad whose sample is highest.
- Update: observe the reward and increment `α` (click) or `β` (no click).
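The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the contents of `thompson_sampling.py`: the function name `thompson_sampling`, the toy reward table `toy_rewards`, and the made-up click rates in `true_ctrs` are all hypothetical. It uses only the standard library (`random.betavariate`) so it runs without the CSV file.

```python
import random


def thompson_sampling(rewards, seed=0):
    """Run Thompson Sampling over a table of pre-recorded binary rewards.

    rewards[t][i] is 1 if ad i would have been clicked in round t, else 0.
    Returns (total_reward, ads_selected).
    """
    rng = random.Random(seed)
    n_ads = len(rewards[0])
    successes = [0] * n_ads  # clicks observed per ad
    failures = [0] * n_ads   # non-clicks observed per ad
    ads_selected = []
    total_reward = 0

    for row in rewards:
        # Sample: one draw from each ad's Beta(successes + 1, failures + 1).
        samples = [
            rng.betavariate(successes[i] + 1, failures[i] + 1)
            for i in range(n_ads)
        ]
        # Select: the ad with the highest sampled CTR.
        ad = max(range(n_ads), key=samples.__getitem__)
        # Update: increment alpha on a click, beta otherwise.
        reward = row[ad]
        if reward == 1:
            successes[ad] += 1
        else:
            failures[ad] += 1
        ads_selected.append(ad)
        total_reward += reward

    return total_reward, ads_selected


# Hypothetical toy data: 3 ads with made-up CTRs, 2,000 rounds.
data_rng = random.Random(42)
true_ctrs = [0.05, 0.30, 0.10]
toy_rewards = [
    [1 if data_rng.random() < p else 0 for p in true_ctrs]
    for _ in range(2000)
]
total, chosen = thompson_sampling(toy_rewards)
print("total reward:", total)
print("times each ad was shown:", [chosen.count(i) for i in range(3)])
```

Only the reward of the selected ad is read in each round, which mirrors the bandit setting: the algorithm never sees what the ads it did not show would have earned.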
Because the Beta distribution naturally narrows as evidence accumulates, the algorithm explores uncertain ads early on and gradually exploits the best performer.
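To make the narrowing concrete, the sketch below computes the mean and standard deviation of `Beta(successes + 1, failures + 1)` using the standard closed-form expressions for a Beta distribution. The helper name `beta_mean_sd` and the click counts are hypothetical, chosen only to show the spread shrinking as the number of observed rounds grows.

```python
import math


def beta_mean_sd(successes, failures):
    """Mean and standard deviation of Beta(successes + 1, failures + 1)."""
    a = successes + 1
    b = failures + 1
    mean = a / (a + b)
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(variance)


# Hypothetical click counts, all at an observed rate of 20% once data exists.
for clicks, misses in [(0, 0), (2, 8), (20, 80), (200, 800)]:
    mean, sd = beta_mean_sd(clicks, misses)
    print(f"{clicks + misses:>5} rounds: mean={mean:.3f}, sd={sd:.3f}")
```

With no data the distribution is uniform on [0, 1], so any ad can produce a high sample. After many rounds the samples cluster around the observed rate, and an ad with a lower rate rarely wins the draw.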
Ads_CTR_Optimisation.csv — 10,000 rows × 10 columns. Each cell is a binary reward (1 = click, 0 = no click) from a simulated ad campaign.
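A file of this shape can be read and sanity-checked as follows. This is a sketch under assumptions: the README lists pandas for data loading, but the standard-library `csv` module is swapped in here to stay dependency-free. The function name `load_rewards` is hypothetical, and the presence of a header row is an assumption (pass `has_header=False` if the file has none).

```python
import csv


def load_rewards(path, has_header=True):
    """Read a CSV of 0/1 cells into a list of rows (one row per round)."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        if has_header:
            next(reader, None)  # skip the assumed header row
        rows = [[int(cell) for cell in row] for row in reader if row]

    # Validate: rectangular table, binary cells.
    width = len(rows[0]) if rows else 0
    for number, row in enumerate(rows, start=1):
        if len(row) != width:
            raise ValueError(f"row {number} has {len(row)} cells, expected {width}")
        if any(cell not in (0, 1) for cell in row):
            raise ValueError(f"row {number} contains a non-binary value")
    return rows
```

Calling `load_rewards("Ads_CTR_Optimisation.csv")` on the dataset described above should yield 10,000 rows of 10 values each, if the header assumption holds.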
| Language | Packages |
|---|---|
| Python 3 | matplotlib, pandas |
| R | base (no extra packages) |
Python:

    pip install -r requirements.txt
    python thompson_sampling.py

R:

    Rscript thompson_sampling.R
    # or inside an R session:
    source("thompson_sampling.R")

Both scripts print the total reward and display a histogram showing which ad the algorithm converged on.
| Technology | Role |
|---|---|
| 🐍 Python 3 | Core implementation |
| 📊 R | Alternative implementation |
| 📈 Matplotlib | Visualisation |
| 🐼 pandas | Data loading |
| 📐 Beta Distribution | Bayesian prior/posterior |
- Dataset is fully simulated; real-world CTRs would require a live feedback loop.
- No command-line arguments yet — round count and ad count are constants at the top of each script.
- The R script's `sys.frame(1)$ofile` path resolution only works when the file is `source()`-d; when run with `Rscript` it falls back to `getwd()`.
MIT