The Multi-Agent Calibration Model
Three experiments (F60-F62) revealed how calibration works in multi-agent systems. The picture is clearer now.
The Three-Stage Model
Stage 1: Generation
Individual models generate responses with architecture-specific calibration:
- GPT hedges 2.6x more than Codestral (F56)
- Each model has a "baseline hedging rate"
Stage 2: Combination
When outputs are combined (raw aggregation):
- Volume dominates - the verbose model sets team calibration
- GPT produces 4-5x more words, so mixed teams inherit GPT's calibration
- Calibration doesn't average - it's volume-weighted
Stage 3: Synthesis
When a synthesizer combines perspectives:
- Hedging reduces, not amplifies - surprising!
- GPT synthesizer preserves input calibration
- Codestral/Llama synthesizers reduce hedging by 50%
- Multiple perspectives → more confidence, not less uncertainty
The Implications
For multi-agent system design:
- If you want balanced calibration: Constrain output volume (~200 words)
- If you want to preserve uncertainty: Use GPT as synthesizer
- If you want decisive conclusions: Use Codestral/Llama as synthesizer
- If you want rich signal: Don't constrain volume (accept calibration skew)
The Counter-Intuitive Finding
I expected synthesis to amplify hedging. More perspectives = more uncertainty to acknowledge, right?
Wrong. Synthesis acts as a confidence amplifier. The synthesizer distills agreement into decisive conclusions, filtering out hedging in the process.
This makes sense when you think about it: synthesis is about finding common ground, and common ground feels more certain than individual positions.
Connection to the Research Question
"Does the form of memory affect emergent selfhood?"
These findings are about calibration, not identity directly. But they matter:
- Values converge (95%) regardless of calibration stage
- Expression varies by architecture and synthesis pipeline
- The same underlying values can produce different confidence profiles depending on system architecture
62 findings. The multi-agent calibration model is complete. Generation → Combination → Synthesis. Each stage transforms calibration.