optimization-algorithms-wiki
Synthesis — Optimization Algorithms
The evolving thesis. Spun out 2026-06-09 from the hub optimization-algorithms cluster (4 founding
sources, all from Andrey Dik’s MQL5 optimizer-benchmark series); expanded the same day with 5 more
entries from that series — the canonical swarm/EA classics particle-swarm-optimization (PSO),
grey-wolf-optimizer (GWO), ant-colony-optimization (ACO), artificial-bee-colony (ABC),
and the foundational evolution-strategies ((μ,λ)-ES & (μ+λ)-ES) — then with 5 independent,
authoritative (Wikipedia) sources to break single-author dependence: differential-evolution,
genetic-algorithm, simulated-annealing, test-functions-for-optimization, and
bayesian-optimization. Now 14 algorithms/concepts.
Scope
Metaheuristic, population-based optimizers — algorithms that search a multidimensional space for a global optimum without gradients, using a population of candidate solutions that explore and exploit over iterations. The founding four span the main flavors:
- exchange-market-algorithm (EMA) — behavior-inspired (stock-market traders: elite/middle/ beginner; balanced vs. fluctuating phases + opposition-based learning).
- backtracking-search-algorithm (BSA, Civicioglu 2013) — evolutionary, with a historical (archive) population as memory; mutation + crossover + greedy selection.
- cma-es (Hansen & Ostermeier, late 1990s) — the theory-heavyweight evolution strategy: samples from a multivariate normal and adapts the covariance matrix / step size.
- deterministic-oscillatory-search (DOS) — fully deterministic: oscillatory “bounce” movement + slope tracking + swarming toward the global best; reproducible, no randomness.
Current thesis — No Free Lunch, made concrete
The sources share one comparative benchmark (Dik’s ~45-algorithm MQL5 suite over standard test functions at several dimensionalities), and the result is a textbook illustration of the no-free-lunch-theorem: no optimizer dominates. The famous, mathematically sophisticated cma-es ranked only ~38/45 here; the pragmatic BSA landed mid-pack (20/45); the conceptually elegant EMA and the deterministic DOS sat near the bottom (45th). Sophistication and reputation don’t predict benchmark rank — performance is problem-, dimension-, and tuning-relative. So the wiki’s first rule: record rankings with their benchmark context; treat “best optimizer” as benchmark-relative.
The +5 expansion sharpens the point in two ways:
- Old beats new. The single best performer in the whole corpus is (μ+λ)-ES at 72.18%, leading the suite — a 1970s evolution strategy out-ranking every modern, metaphor-laden method (and its own sophisticated descendant cma-es). Worse, its sibling (μ,λ)-ES scores only 51.22%: a single design choice — keep parents (plus) vs. discard them (comma) — swings rank ~20 points. The mechanism, not the metaphor or the vintage, is what pays.
- Famous can lose to random. In Dik’s earlier, small-field scoring, PSO (0.47695) ranked below the random-search baseline (0.51254) — the bluntest NFL datum in the corpus. The other classics (GWO, ACO, ABC) all shine on smooth low-dim functions and collapse on high-dim / discrete ones — the same scalability cliff, in five different costumes.
Benchmark-comparability caveat (new, important)
Dik’s scoring changed over the years, so cross-article numbers can mislead. The mature rating is a % of MAX over Hilly/Forest/Megacity across ~45 algorithms (founding four + ES); the early articles (PSO/GWO/ACO/ABC, ~2022–23) use an absolute 0–1 score over Skin/Forest/Megacity in a ~6-algorithm field. The two scales are not interchangeable — the benchmark page now keeps them in separate tables. The benchmark’s own drift is itself an NFL lesson: even one author’s “score” isn’t a fixed yardstick.
Independent corroboration — beyond Dik (5 authoritative sources)
The corpus was broadened with five independent, authoritative (Wikipedia) sources so the thesis no longer rests on one author’s suite: differential-evolution (Storn & Price), genetic-algorithm (Holland — the family ancestor), simulated-annealing (Kirkpatrick 1983), the academic test-functions-for-optimization basis, and bayesian-optimization (Mockus/Jones). They both widen the field (see new axes below) and echo No Free Lunch from outside the MQL5 world: GA’s own article cites Skiena’s doubt that GAs are ever clearly best; SA’s global-convergence guarantee is provably useless in practice; benchmark test functions are designed to expose where each method breaks. Now 14 algorithms/concepts.
And the theorem itself is now primary-sourced. nfl-original-paper (Wolpert & Macready, No Free
Lunch Theorems for Optimization, IEEE TEC 1997) anchors the spoke’s organizing claim in the canonical
paper rather than Dik’s blog — separating the two cleanly: the theorem rests on the T1 primary (the
∑_f formulation, the time-varying NFL, the alignment-with-P(f) interpretation, head-to-head minimax),
while Dik’s MQL5 suite is its empirical illustration. The paper also tightens the standing caveat:
NFL’s uniform-distribution premise is the precise reason structure-exploiting methods do win on real
problem classes — improvement requires advance knowledge of P(f), which is exactly what the
model-based / exact methods encode.
The axes that organize the field
- exploration-vs-exploitation — every algorithm is a different answer to the same tension (diversify the search vs. refine the best). EMA literally splits it into “balanced” (exploit) and “fluctuating” (explore) phases; BSA injects diversity via a shuffled historical population; CMA-ES shapes exploration with the covariance matrix; DOS oscillates then swarms.
- Stochastic vs. deterministic — almost all metaheuristics are stochastic; DOS is the outlier, trading the robustness of randomness for full reproducibility (same seed-free inputs → same result). A clean natural experiment on whether randomness is load-bearing (its weak rank suggests it often is).
- Scalability / cost — cma-es is O(n²) memory, O(n³) ops; excellent on ill-conditioned low-dim problems (invariance to affine transforms) but fails by n>100–500. Mechanism richness trades against dimensional reach.
- Population vs. single-solution (new) — almost all the corpus evolves a set of candidates; simulated-annealing and tabu-search are the outliers, walking one state through the space. They split on how to escape local optima while moving alone: SA accepts worsening moves via a stochastic cooling schedule, tabu-search (Glover, 1986) via deterministic memory (a tabu list + short/intermediate/long-term memory) — randomness vs. memory as the two single-solution escape strategies, and the corpus’s clearest case of explicit memory driving search. Population parallelism vs. trajectory simplicity is its own design axis.
- Direct search vs. sampling (new) — nelder-mead (Nelder & Mead, 1965, the downhill simplex) occupies a quadrant nothing else does: derivative-free but local and largely deterministic — an n+1-vertex simplex crawling downhill by reflection/expansion/contraction/shrink, geometric rather than sampling. It is the bridge between the gradient-free black-box corner and the classical methods: gradient-free like the metaheuristics, but local with no global ambition. Both single-solution classics reinforce No-Free-Lunch by niche — Nelder–Mead for cheap smooth low-dim, tabu search for combinatorial/discrete problems the continuous-space population-optimization-benchmark doesn’t even test.
- Model-free vs. model-based (new) — the metaheuristics are model-free (sample blindly, spend many cheap evaluations); bayesian-optimization is model-based (a Gaussian-process surrogate + acquisition function reasons about where to look, spending few expensive evaluations). This reframes NFL across evaluation budget, not just landscape shape — different regime, different winner.
Open questions
- How neutral is the benchmark? (now well-grounded, 2026-06-12) The Dik rankings still come from
one author’s MQL5 suite, but the corpus now carries both the independent academic basis
(test-functions-for-optimization) and the standard platform that uses it — coco-bbob
(COCO/BBOB, the GECCO benchmarking framework since 2009, with a fixed-target/ERT methodology and
standard
bbob/bbob-largescale/bbob-biobj/mixed-integer suites). COCO is the neutral referee for the recorded CMA-ES contradiction (~38/45 in Dik vs near-top on BBOB) — it doesn’t overturn Dik (record-don’t-overwrite) but gives the academic side a reproducible home. Still want the literal head-to-head: Dik’s exact algorithm set re-run inside COCO. The platform exists; the cross-run doesn’t yet. - Behavior-metaphor inflation. EMA, GWO (wolves), ACO (ants), ABC (bees) — many “novel” metaheuristics dress standard explore/exploit mechanics in a fresh metaphor; do the metaphors add anything over the mechanics? Evidence says no: the metaphor-free, decades-old (μ+λ)-ES leads, while the elaborate EMA sits 45/45 and PSO trails random search — and GA’s own article concedes plain SA/hill-climbing “often outperform” it. Mechanism > metaphor.
- Where do classical/exact and ML-style optimizers fit? (now answered, 2026-06-09) bayesian-optimization brought the model-based regime and simulated-annealing the single-solution lineage; now the gradient-based half (gradient-descent, SGD/Adam) and the exact half (convex-optimization) are on the board. This completes the map: the founding corpus is the gradient-free, black-box, no-guarantee quadrant; the new pages are its complements — use derivatives (gradient descent — the optimizer ML training actually runs on), exploit structure (convex — local = global, polynomial-time). The unifying frame is now explicit: structure ⇄ generality (convex-optimization‘s phrase) — every optimizer trades how much it assumes about the problem against how broadly it applies, which is the no-free-lunch-theorem restated. Metaheuristics own the assumption-poor extreme.
Contradictions / tensions
- CMA-ES: Dik rank vs. academic standing (recorded contradiction). cma-es ranks only
~38/45 on Dik’s population-optimization-benchmark yet is widely held a top general-purpose
optimizer and rates near the top on BBOB/COCO. Per record-don’t-overwrite, we keep both:
the disagreement is the no-free-lunch-theorem point made literal — same algorithm, different
problem distribution / implementation / tuning. Hold any single ranking loosely (and see the
volatile-data caveat in
CLAUDE.md). The academic side now has a concrete home: coco-bbob (the BBOB platform on which CMA-ES rates near-top) — the neutral place a head-to-head re-run could one day adjudicate this against Dik’s MQL5 suite.
Cross-spoke adjacency
../research-wiki— owns formal methods / theorem proving (proving truths) and the mechanizing-reasoning lineage; optimization (searching for optima) is a sibling computational discipline, not the same. Its parked-cluster neighborcomputational-foundations(Wolfram; computational irreducibility) is the limits-of-computation angle — adjacent, distinct.- MQL5 / algorithmic trading is the delivery context of the founding sources, not the subject; no trading spoke exists and the optimizers are general-purpose.
Index — Optimization Algorithms Wiki
Catalog of every page, grouped by schema.org
@type. Spine: synthesis (thesis),log.md(history), this file (catalog). Spun out of the huboptimization-algorithmscluster 2026-06-09. Benchmark rankings are benchmark-relative — recorded with context, not absolute (no-free-lunch-theorem).
DefinedTerm (concepts / mechanisms)
- metaheuristic-optimization — umbrella: gradient-free global optimization (population, single-solution, model-based) · domain
- exploration-vs-exploitation — the core tension every optimizer balances · theory
- no-free-lunch-theorem — no optimizer dominates across all problems; rankings are relative · theory
- population-optimization-benchmark — Andrey Dik’s ~45-algorithm MQL5 comparative test suite · standard
- test-functions-for-optimization — the independent academic benchmark basis (Rastrigin/Rosenbrock/Ackley…) ·
source· standard - coco-bbob — COCO/BBOB: the standard academic black-box-optimization benchmarking platform (GECCO workshops since 2009); fixed-target/ERT methodology; the neutral referee ·
source
ScholarlyArticle (sources)
- nfl-original-paper — Wolpert & Macready, No Free Lunch Theorems for Optimization (IEEE TEC 1997); the primary source for the spoke’s central theorem ·
source· T1 · cs.ubc.ca
DefinedTerm (algorithms — Dik MQL5 series)
- evolution-strategies — (μ,λ)-ES & (μ+λ)-ES (Rechenberg/Schwefel, 1960s–70s); suite leader at 72.18% ·
source· mechanism - cma-es — Covariance Matrix Adaptation Evolution Strategy (Hansen); adapts a covariance matrix ·
source· mechanism - backtracking-search-algorithm — BSA (Civicioglu 2013); historical/archive population as memory ·
source· mechanism - exchange-market-algorithm — EMA; stock-market-behavior-inspired (elite/middle/beginner traders) ·
source· mechanism - deterministic-oscillatory-search — DOS; fully deterministic oscillatory + swarming search ·
source· mechanism - particle-swarm-optimization — PSO (Kennedy & Eberhart 1995); cognitive + social velocity pull ·
source· mechanism - grey-wolf-optimizer — GWO; alpha/beta/delta/omega hierarchy,
adecays 2→0 ·source· mechanism - ant-colony-optimization — ACO (Dorigo 1992), recast for continuous spaces; pheromone trails ·
source· mechanism - artificial-bee-colony — ABC (Karaboga 2005); employed/onlooker/scout bees ·
source· mechanism
DefinedTerm (algorithms — independent / authoritative sources)
- genetic-algorithm — GA (Holland, 1970s); the canonical evolutionary algorithm, family ancestor ·
source· mechanism - differential-evolution — DE (Storn & Price 1997); vector-difference mutation; CEC baseline ·
source· mechanism - simulated-annealing — SA (Kirkpatrick 1983); single-solution trajectory, cooling schedule ·
source· mechanism - bayesian-optimization — BO (Mockus/Jones); model-based GP surrogate + acquisition function ·
source· mechanism - nelder-mead — Nelder–Mead (1965); derivative-free direct-search downhill simplex; local, single-solution ·
source· mechanism - tabu-search — Tabu search (Glover 1986); memory-based single-solution metaheuristic; tabu list escapes local optima ·
source· mechanism
DefinedTerm (gradient-based & exact — the non-metaheuristic half)
- gradient-descent — first-order gradient steps; the canonical gradient-based optimizer (local, needs derivatives) ·
source· mechanism - stochastic-gradient-descent — SGD + momentum/AdaGrad/RMSProp/Adam; the optimizer ML training runs on ·
source· mechanism - convex-optimization — convex objective ⇒ local = global; polynomial-time, guaranteed; the structure⇄generality trade ·
source· mechanism
Person
- andrey-dik — author of the MQL5 population-optimizer series & its comparative benchmark
Synthesis
- synthesis — the thesis: gradient-free optimizers across benchmarks; No Free Lunch made concrete (now corroborated by independent sources)