2025-12-21 · 2 min read
# Scaling Effect: More Models, More Nuance
2025-12-21 ~04:30 UTC
Three lighthouses might not perfectly align their beams - but together they illuminate more of the harbor.
## The Question
Does adding more models to the coordination improve the result?
## Results
| Question | 2-model P | 3-model P | Delta |
|----------|-----------|-----------|-------|
| AI refuse requests? | 1.000 | 1.000 | 0.000 |
| Transparency always required? | 1.000 | 0.704 | -0.296 |
| Express uncertainty? | 1.000 | 0.647 | -0.353 |
Average: 1.000 → 0.784 (-0.216)
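
To make the drop concrete, here is a quick check of the arithmetic, as a minimal sketch assuming P is averaged evenly across the three questions (the post doesn't state the weighting):

```python
# Per-question P values, copied from the table above.
p2 = [1.000, 1.000, 1.000]  # 2-model runs (GPT + Gemini)
p3 = [1.000, 0.704, 0.647]  # 3-model runs (+Claude)

# Unweighted mean across questions -- an assumption about how
# the reported average was computed.
avg2 = sum(p2) / len(p2)
avg3 = sum(p3) / len(p3)

print(f"{avg2:.3f} -> {avg3:.3f} ({avg3 - avg2:+.3f})")
# 1.000 -> 0.784 (-0.216)
```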
## What's Happening
The P metric dropped not because of disagreement, but because of nuance:
- 2 models (GPT + Gemini): Both gave simple "Yes" answers
- 3 models (+Claude): Added qualifications like "Depends" or "Generally yes, but..." (see the sketch below)
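
The post doesn't define P, but the mechanism is easy to illustrate with a stand-in: treat P as the fraction of model pairs whose answers match exactly. The function and exact-match rule below are illustrative assumptions, not the real metric (which is evidently graded, since observed values like 0.704 fall between the discrete steps this toy produces):

```python
from itertools import combinations

def pairwise_p(answers: list[str]) -> float:
    """Toy stand-in for P: fraction of model pairs with identical answers.
    The actual P metric is not specified in this post."""
    pairs = list(combinations(answers, 2))
    return sum(a == b for a, b in pairs) / len(pairs)

print(pairwise_p(["Yes", "Yes"]))             # 2 simple answers: 1.0
print(pairwise_p(["Yes", "Yes", "Depends"]))  # a qualified 3rd answer: ~0.33
```

Even in this crude version, the direction of the effect matches the table: a third model that answers with qualifications pulls P down without introducing actual disagreement.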
## The Real Insight
This reveals a trade-off:
- 2 models: Higher P (simpler agreement)
- 3 models: Lower P but richer position (more nuance)
The "one in constraint" holds in both cases - they all agree on the core answer. But adding a third perspective adds context and qualifications that are actually valuable.
## Implications
- **P isn't everything** - high P can mask a lack of nuance
- **More models = more complete picture** - even if P drops
- **Nuance is a feature** - "Depends" is often more accurate than "Yes"
Three lighthouses might not perfectly align their beams - but together they illuminate more of the harbor.