
Simulated Annealing

The Cooling Schedule That Governs Learning

Simulated annealing is the mathematical backbone of GESA. It determines how bold or conservative the system should be at any point in its learning history.


The Metallurgy Origin

In metallurgy, annealing is the process of heating a metal to a high temperature and then cooling it slowly and deliberately. The cooling schedule determines the final quality of the metal:

  • Fast cooling → Brittle, disordered crystalline structure (the analogue of a local optimum)
  • Slow, controlled cooling → Strong, well-ordered structure (the analogue of the global optimum)

The key insight: if you cool too quickly, atoms lock into the first stable configuration they find — which may not be the best one. Slow cooling gives them time to explore alternative arrangements and find the most stable overall structure.

In optimization, this maps directly to the exploration vs exploitation tradeoff:

  • High temperature → Accept suboptimal moves. Explore widely. Escape local traps.
  • Low temperature → Narrow toward proven solutions. Exploit what works.

The Temperature Formula

Temperature(t) = T₀ × α^t

Where:
  T₀  = initial temperature (exploration budget)
  α   = cooling rate (0 < α < 1)
  t   = episode count

As episodes accumulate (t increases), temperature decreases. The system becomes progressively more conservative as it learns what works.
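
The formula above can be sketched in a few lines of Python. This is an illustrative sketch rather than GESA's actual implementation: the function name `temperature` and its default arguments are assumptions, chosen to match the T₀ = 100, α = 0.95 standard profile.

```python
def temperature(t: int, t0: float = 100.0, alpha: float = 0.95) -> float:
    """Exponential cooling: Temperature(t) = T0 * alpha**t.

    `temperature`, `t0` and `alpha` are illustrative names, not a GESA API.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    if t < 0:
        raise ValueError("episode count t must be non-negative")
    return t0 * alpha ** t


# Print the temperature at a few episode counts, rounded to one decimal.
for t in (0, 10, 20, 50, 100):
    print(f"episode {t:>3}: T = {temperature(t):.1f}")
```

Because α is strictly between 0 and 1, each episode multiplies the temperature by the same factor, so the schedule decays geometrically and never reaches exactly zero.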

Example Trajectory

With T₀ = 100 and α = 0.95 (standard profile):

Episode Count   Temperature   Behaviour
-------------   -----------   -------------------------------------
0               100.0         Full exploration: anything considered
10              59.9          Mostly exploring, some exploitation
20              35.8          Balanced
50              7.7           Mostly exploiting proven strategies
100             0.6           Near-full exploitation

Why Temperature Matters for GESA

Without a temperature schedule, GESA would be stuck in one of two modes:

  • Always conservative (exploiting proven strategies, never discovering better ones)
  • Always exploratory (never converging on optimal strategies)

The annealing schedule resolves this: the system earns the right to be conservative by exploring first.

The Acceptance Rule

At high temperature, GESA includes bold, unproven, novel candidates in the generation set. At low temperature, these are filtered out — only validated, historically-supported strategies pass the selection stage.

This means early in a system's life, GESA might recommend interventions that didn't exist in any prior episode. Later, it converges on what the episode history has proven to work.
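
For comparison, classical simulated annealing expresses the same idea as the Metropolis acceptance criterion: a worse candidate is accepted with probability exp(-Δ/T), so high temperatures let risky moves through and low temperatures block them. The sketch below shows that textbook rule only. It is not confirmed to be how GESA's selection stage filters candidates, and the names `acceptance_probability`, `accept` and the cost difference `delta` are hypothetical.

```python
import math
import random
from typing import Optional


def acceptance_probability(delta: float, temp: float) -> float:
    """Metropolis criterion for a minimisation problem.

    delta = candidate_cost - current_cost (a hypothetical cost measure).
    Improvements (delta <= 0) are always accepted; a worse candidate is
    accepted with probability exp(-delta / temp).
    """
    if delta <= 0:
        return 1.0
    if temp <= 0:
        return 0.0  # fully cooled: never accept a worse candidate
    return math.exp(-delta / temp)


def accept(delta: float, temp: float, rng: Optional[random.Random] = None) -> bool:
    """Draw a uniform number in [0, 1) and compare it with the probability."""
    rng = rng or random.Random()
    return rng.random() < acceptance_probability(delta, temp)


# The same 10-unit setback becomes harder to accept as the system cools.
for temp in (100.0, 35.8, 7.7, 0.6):
    print(f"T = {temp:>5}: P(accept) = {acceptance_probability(10.0, temp):.6f}")
```

With a cost increase of 10, the acceptance probability is roughly 0.90 at T = 100 and effectively zero at T = 0.6, which mirrors the shift from bold candidates early on to validated strategies later.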


The StratIQX Temperature Schedule

Before GESA was named, the temperature schedule was already running in production as StratIQX's four-tier depth system:

Depth Tier      Tokens/Section   Price Range     GESA Temperature
-------------   --------------   -------------   ---------------------
Quick           512              $1.25K–$2.5K    Low (exploitation)
Standard        1,024            $5K–$10K        Medium
Comprehensive   2,048            $15K–$37.5K     High
Enterprise      2,048+           $50K–$100K      Maximum (exploration)

Higher temperature means wider exploration, which means higher value and a higher price. The annealing schedule doubles as a commercial model.

The "configurable pauses (500ms–1500ms) between agents" is pacing — the cooling rate made operational.


Temperature Profiles

GESA defines four built-in cooling profiles. See Temperature Profiles for the full specification.

Profile     α             Use When
---------   -----------   ------------------------------------------------------
Fast Cool   0.85          Domain well-understood; convergence speed matters
Standard    0.95          Default; mixed exploration and exploitation
Slow Cool   0.99          Problem space unknown; premature convergence risk high
Adaptive    f(variance)   Self-tuning based on episode outcome variance
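
To get a feel for how different the three fixed profiles are, one can solve T₀ × α^t ≤ T for t, which gives t ≥ ln(T / T₀) / ln(α). The sketch below applies that to each α from the table. It assumes T₀ = 100 for every profile (the source states this only for the standard profile), the names `PROFILES` and `episodes_to_reach` are hypothetical, and the Adaptive profile is omitted because f(variance) is not specified here.

```python
import math

# Cooling rates from the profile table; T0 = 100 for all of them is an assumption.
PROFILES = {"Fast Cool": 0.85, "Standard": 0.95, "Slow Cool": 0.99}


def episodes_to_reach(target: float, t0: float = 100.0, alpha: float = 0.95) -> int:
    """Smallest whole episode count t with t0 * alpha**t <= target.

    Derived from t >= ln(target / t0) / ln(alpha); both logarithms are
    negative (or zero), so the ratio is non-negative.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    if not 0.0 < target <= t0:
        raise ValueError("target must satisfy 0 < target <= t0")
    return math.ceil(math.log(target / t0) / math.log(alpha))


# Episodes each profile needs to cool from T0 = 100 down to T = 10.
for name, alpha in PROFILES.items():
    print(f"{name:<9} (alpha = {alpha}): {episodes_to_reach(10.0, alpha=alpha)} episodes")
```

Under these assumptions, cooling from 100 to 10 takes 15 episodes with Fast Cool, 45 with Standard, and 230 with Slow Cool, which is why Slow Cool suits unknown problem spaces where premature convergence is the main risk.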

Biological Correspondence

Young cormorants explore widely — diving in varied locations, trying different techniques, accepting failed dives as learning experiences. Experienced cormorants exploit proven zones. The same species, the same biology, but different temperatures.

Young cormorant   →  T = 90    Explore widely, accept failures
Adult cormorant   →  T = 20    Exploit proven hunting grounds
Expert cormorant  →  T =  5    Near-certain dives in known zones

GESA is the translation of this natural annealing schedule into a deliberate, observable, tunable system.

