COCO / BBOB (the neutral benchmarking platform)
The independent, academic-standard yardstick the wiki’s open question kept asking for (“want a head-to-head re-run … on BBOB against neutral ground”) to settle cma-es’s low placement in Dik’s ranking and (μ+λ)-ES’s lead. COCO (“COmparing Continuous Optimizers”) is the standard platform behind the BBOB (Black-Box Optimization Benchmarking) workshops, the community counterweight to any single author’s suite such as Dik’s MQL5 benchmark. URL-only ingest; source = the numbbo/COCO project.
What it is
- A reusable benchmarking framework (numbbo GitHub) that runs an optimizer over fixed test suites and post-processes the runs into comparable plots/tables/PDFs — automating the experiment→figure pipeline so results are reproducible and cross-comparable.
- Core C with bindings for C/C++, Python, Java, Matlab/Octave, so an algorithm in any of these can be measured on the same problems the same way.
- The platform behind the BBOB workshops at GECCO since 2009, i.e. a decade-plus of accumulated, peer-shared results, not a one-off scoring.
The methodological upgrade over Dik’s suite
This directly addresses the wiki’s benchmark-neutrality thread:
- Standard suites, not one author’s pick: bbob (single-objective) + bbob-largescale, bbob-biobj (multi-objective), and mixed-integer suites (bbob-mixint, bbob-biobj-mixint), including discrete/combinatorial-flavored regimes the continuous-only MQL5 suite doesn’t test (a gap the synthesis flagged for tabu-search etc.).
- Fixed-target methodology: COCO’s headline measure is runtime, counted in function evaluations, to reach target precisions (expected running time / ERT), averaged over many instances. This is an anytime, budget-aware lens, unlike a single “% of max” score, and a more faithful test of the structure⇄generality / evaluation-budget axes the synthesis uses.
- Built on the academic test-function basis (Rastrigin/Rosenbrock/Ackley-style landscapes, instanced and rotated to prevent overfitting to a fixed function), the same basis the wiki already cites as the independent grounding.
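The fixed-target measure can be sketched in a few lines. This is a minimal illustration, not COCO's API: the name `expected_running_time`, the logging format (one list of best-so-far precisions per run, one entry per function evaluation), and the sample numbers are all assumptions made for the example. It uses the usual ERT estimator: evaluations spent across all runs, with unsuccessful runs contributing everything they spent, divided by the number of runs that reached the target.

```python
import math


def expected_running_time(runs, target):
    """Estimate ERT for one target precision.

    runs   -- list of runs; each run is a list of best-so-far precisions
              (f minus the optimum), one entry per function evaluation
              (hypothetical logging format, assumed for this sketch)
    target -- precision that counts as "solved"
    """
    total_evals = 0
    successes = 0
    for history in runs:
        # 1-based index of the first evaluation that reaches the target, if any
        hit = next((i + 1 for i, f in enumerate(history) if f <= target), None)
        if hit is None:
            total_evals += len(history)  # failed run: its whole budget counts
        else:
            total_evals += hit
            successes += 1
    return total_evals / successes if successes else math.inf


# Made-up traces for illustration only, not real benchmark data.
runs = [
    [10.0, 1.0, 0.1, 0.001],
    [5.0, 2.0, 1.0, 0.8],
    [3.0, 0.2],
]

for target in (0.5, 1e-2, 1e-6):
    print(target, expected_running_time(runs, target))
```

Tightening the target raises the estimate, and a target that no run reaches yields infinity, which is what makes the measure budget-aware rather than a single end-of-run score.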
Why it matters here
COCO is the closest thing the field has to a neutral referee, so it is the natural place to adjudicate the wiki’s recorded CMA-ES contradiction (cma-es ranks ~38/45 in Dik’s suite yet near-top on BBOB/COCO). It doesn’t overturn Dik — per record-don’t-overwrite, both stand — but it gives the academic side of that contradiction a concrete, reproducible home and reframes “best optimizer” as suite-and-target-relative, the no-free-lunch-theorem operationalized in a shared platform.
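The "target-relative" point can be made concrete with a toy sketch. Everything here is hypothetical: the two traces, the names `fast_start` and `slow_start`, and the helper functions are invented for illustration and say nothing about any real optimizer. The sketch only shows that ranking by evaluations-to-target can flip when the target changes.

```python
def evals_to_target(history, target):
    """Return the 1-based evaluation count at which the best-so-far
    precision first reaches the target, or None if it never does."""
    for evals, precision in enumerate(history, start=1):
        if precision <= target:
            return evals
    return None


def rank_by_target(traces, target):
    """Order the optimizers that reach the target, fewest evaluations first.
    Optimizers that never reach it are left out of the ranking."""
    hits = {name: evals_to_target(h, target) for name, h in traces.items()}
    solved = sorted((e, name) for name, e in hits.items() if e is not None)
    return [name for _, name in solved]


# Hypothetical best-so-far precision traces (made up for illustration).
traces = {
    "fast_start": [1.0, 0.1, 0.05, 0.04, 0.03, 0.02],    # quick early, then stalls
    "slow_start": [5.0, 2.0, 1.0, 0.5, 0.001, 0.00001],  # slow early, then precise
}

print(rank_by_target(traces, 0.5))   # loose target
print(rank_by_target(traces, 0.01))  # tight target
```

At the loose target the early mover ranks first; at the tight target it never arrives and drops out, so "which optimizer is best" has no answer until the target is fixed.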
Caveat
A platform/methodology, not a verdict: results still depend on which suite, dimensionality, and budget you run; COCO standardizes the procedure, not the conclusion. Continuous/black-box focus (its native domain), with mixed-integer suites added.
Related
population-optimization-benchmark · test-functions-for-optimization · cma-es · no-free-lunch-theorem · metaheuristic-optimization · synthesis