Tweak the kernel, acquisition function, and noise to see how BO's assumptions shape its decisions.
Click on the plot to manually add a sample. Toggle "Show GP samples" to visualize the prior/posterior as actual functions.
What's happening: BO doesn't know the true curve. It fits a Gaussian process (GP) over what it has sampled.
The kernel encodes the GP's prior — what kind of functions it considers plausible before seeing data.
Try this: reset, click "Show GP samples" before stepping. Those squiggles are random functions drawn from the prior.
Now switch kernels: Matérn-3/2 produces rougher samples, RBF produces smoother ones, and periodic samples repeat.
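To see outside the widget what those prior squiggles are, here is a minimal sketch that draws random functions from a zero-mean GP prior under three kernels. The function names (`rbf`, `matern32`, `periodic`, `sample_prior`), the input range, the length scale, and the period are illustrative assumptions, not the demo's actual settings.

```python
import numpy as np

def rbf(x1, x2, length_scale=1.0):
    # Squared-exponential kernel: infinitely smooth samples.
    d = np.abs(x1[:, None] - x2[None, :])
    return np.exp(-0.5 * (d / length_scale) ** 2)

def matern32(x1, x2, length_scale=1.0):
    # Matern-3/2 kernel: samples are once differentiable, so visibly rougher.
    d = np.abs(x1[:, None] - x2[None, :])
    s = np.sqrt(3.0) * d / length_scale
    return (1.0 + s) * np.exp(-s)

def periodic(x1, x2, length_scale=1.0, period=2.0):
    # Periodic kernel: samples repeat every `period` units.
    d = np.abs(x1[:, None] - x2[None, :])
    return np.exp(-2.0 * np.sin(np.pi * d / period) ** 2 / length_scale ** 2)

def sample_prior(kernel, x, n_samples=3, seed=0, jitter=1e-6):
    # Draw functions f ~ N(0, K) via a Cholesky factor of the kernel matrix.
    # The jitter on the diagonal is only there for numerical stability.
    rng = np.random.default_rng(seed)
    K = kernel(x, x) + jitter * np.eye(len(x))
    L = np.linalg.cholesky(K)
    z = rng.standard_normal((len(x), n_samples))
    return (L @ z).T  # shape: (n_samples, len(x))

x = np.linspace(0.0, 10.0, 101)
samples = {k.__name__: sample_prior(k, x) for k in (rbf, matern32, periodic)}

for name, s in samples.items():
    # Mean squared second difference as a crude roughness score.
    roughness = float(np.mean(np.diff(s, n=2, axis=1) ** 2))
    print(f"{name:9s} roughness = {roughness:.5f}")
```

Passing a smaller `length_scale` to any of the kernels should make the samples wigglier, and a larger one should make them smoother.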
Curious how the explore/exploit knob actually works?
Click "Explore vs Exploit" at the top to see three runs racing on the same problem with different κ values.
Stats: BO evaluations: 0. Best found / true optimum: not yet available.
GP Kernel prior
Default for materials science. Smooth but not infinitely so.
Smaller = wigglier. Larger = smoother.
Acquisition decision
mean + κ·std. Simple. The κ parameter directly controls explore/exploit.
κ = 0 is pure exploitation; κ = 5 is aggressive exploration.
True Function
Explore vs Exploit — Three Runs, One Function
Three BO instances optimizing the same hidden function, starting from the same initial sample,
using the same kernel — differing only in their UCB κ. Watch how the explore/exploit balance shapes the entire optimization trajectory.
UCB acquisition: acquisition(x) = μ(x) + κ · σ(x)
The mean μ is the exploitation signal, the std σ is the exploration signal.
κ is the knob that weights one against the other.
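As a small illustration of the formula, the sketch below scores five candidate points with UCB for three κ values. The posterior means and standard deviations are made-up numbers chosen for the example, not output from the demo.

```python
import numpy as np

def ucb(mu, sigma, kappa):
    # Upper confidence bound: posterior mean plus kappa times posterior std.
    return mu + kappa * sigma

# Hypothetical GP posterior at five candidate points (illustrative values only).
mu = np.array([0.2, 0.9, 0.5, 0.1, 0.3])        # predicted value
sigma = np.array([0.05, 0.02, 0.30, 0.40, 0.10])  # predictive uncertainty

for kappa in (0.0, 2.0, 5.0):
    scores = ucb(mu, sigma, kappa)
    print(f"kappa={kappa}: next sample at candidate index {int(np.argmax(scores))}")
```

With these numbers, κ = 0 picks the candidate with the highest mean (index 1), κ = 2 picks a moderately promising but uncertain one (index 2), and κ = 5 picks the most uncertain one (index 3).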
κ = 0: pure exploitation (greedy)
κ = 2: balanced (default)
κ = 5: heavy exploration (explorer)
Each panel reports evaluations, best value found, and % of optimum.
Regret over time — who's catching up?
Distance from the true optimum after each evaluation. Lower = better.
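One common way to compute such a curve is simple regret: the gap between the true optimum and the best value observed so far. The sketch below assumes that definition and a maximization problem; the demo may measure distance differently, and `simple_regret` is a hypothetical helper name.

```python
def simple_regret(observed_values, true_optimum):
    # Regret after each evaluation = true optimum minus best value seen so far.
    # Assumes maximization, so regret is non-negative and never increases.
    best_so_far = float("-inf")
    regret = []
    for y in observed_values:
        best_so_far = max(best_so_far, y)
        regret.append(true_optimum - best_so_far)
    return regret

# Illustrative observations from four evaluations of a function whose maximum is 1.0.
curve = simple_regret([0.2, 0.5, 0.4, 0.9], true_optimum=1.0)
print(curve)
```

The third evaluation (0.4) is worse than the best seen so far, so the curve stays flat at that step instead of rising.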
κ = 0 — pure exploit
Stuck on the first peak
Greedy BO trusts the GP's mean prediction completely. It picks argmax(μ) every iteration.
Once it finds any hill, it climbs it and refuses to leave.
Failure mode: if the first sample lands near the local peak, BO can converge to the wrong answer and never escape, no matter how many evaluations you give it.
κ = 2 — balanced
Goldilocks zone
Samples where μ + 2σ is highest. The σ term forces some exploration, but the search mostly stays focused on promising areas.
Why this is the default: κ ≈ 2 typically lets BO escape local maxima within ~5-10 iterations on many low-dimensional problems while still converging quickly once it finds the global region. Values near 2 are a common choice in BO papers.
κ = 5 — heavy explore
Maps the function first, optimizes later
The σ term dominates. BO chases uncertainty, sampling far from previous points before exploiting.
Tradeoff: very likely to find the global peak given enough evaluations, but wastes early evaluations on regions that are obviously bad. Useful mainly when you have a large budget and a high-stakes problem where missing the global peak is unacceptable.
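The three behaviours above can be reproduced in a rough sketch: a zero-mean GP with a fixed RBF kernel, a grid search over the UCB score, and a made-up hidden function with a local peak near 0.25 and a global peak near 0.75. Every name and constant here (`true_f`, `run_bo`, the length scale, the start point) is an assumption for illustration, not the demo's implementation.

```python
import numpy as np

def rbf(a, b, length_scale=0.1):
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_grid, noise=1e-6):
    # Zero-mean GP posterior mean and std on a grid (unit prior variance).
    K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    Ks = rbf(x_obs, x_grid)
    mu = Ks.T @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mu, np.sqrt(np.clip(var, 0.0, None))

def true_f(x):
    # Hypothetical hidden function: local peak (0.6) at 0.25, global peak (1.0) at 0.75.
    return (0.6 * np.exp(-((x - 0.25) / 0.08) ** 2)
            + 1.0 * np.exp(-((x - 0.75) / 0.08) ** 2))

def run_bo(kappa, x0=0.2, n_iter=15):
    # Every run starts from the same single sample and differs only in kappa.
    grid = np.linspace(0.0, 1.0, 201)
    xs, ys = [x0], [float(true_f(x0))]
    for _ in range(n_iter):
        mu, sigma = gp_posterior(np.array(xs), np.array(ys), grid)
        x_next = float(grid[np.argmax(mu + kappa * sigma)])  # UCB argmax
        xs.append(x_next)
        ys.append(float(true_f(x_next)))
    return np.array(xs), np.array(ys)

results = {kappa: run_bo(kappa) for kappa in (0.0, 2.0, 5.0)}
for kappa, (xs, ys) in results.items():
    print(f"kappa={kappa}: best={ys.max():.3f}, "
          f"sampled range=[{xs.min():.2f}, {xs.max():.2f}]")
```

In this simplified setup the greedy run never leaves its starting point, because with one positive observation and a zero prior mean the posterior mean peaks exactly where it already sampled. That is an extreme version of the failure mode described for κ = 0; a GP with fitted hyperparameters or more initial samples would behave less rigidly.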
Key takeaway: the "right" κ depends on your budget and tolerance for missing the global optimum.
In practice, many BO frameworks (for example BoTorch and scikit-optimize) offer Expected Improvement (EI) as an alternative to UCB, and EI is a common choice in tutorials and papers.
EI balances exploration and exploitation through its own closed form rather than an explicit κ parameter, but the underlying tradeoff is the same.
Polaron's Matter 2024 paper does not specify its acquisition function in detail, but for a 4-dimensional manufacturing-parameter
problem with ~50 evaluations, EI or UCB with κ ≈ 2 would be a typical choice.
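For comparison with UCB, here is a minimal sketch of the textbook closed form of Expected Improvement for maximization under a Gaussian posterior. It takes the same μ and σ inputs plus the best value observed so far. Some libraries add an extra exploration offset (often written ξ), which this sketch omits; `expected_improvement` is an illustrative name, not a library call.

```python
import math

def expected_improvement(mu, sigma, best):
    # EI(x) = (mu - best) * Phi(z) + sigma * phi(z), with z = (mu - best) / sigma.
    # With no uncertainty left, EI reduces to the plain improvement (if any).
    if sigma <= 0.0:
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))             # standard normal CDF
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)      # standard normal PDF
    return (mu - best) * cdf + sigma * pdf

best = 0.9
# Same predicted mean, increasing uncertainty: EI grows with sigma.
for sigma in (0.0, 0.1, 0.5):
    print(f"sigma={sigma}: EI={expected_improvement(0.8, sigma, best):.4f}")
```

The first term rewards points whose mean already beats the incumbent (exploitation), and the second rewards uncertainty (exploration), so the balance shifts on its own as σ shrinks around sampled regions.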