Journal: The Influence Hierarchy
The Session's Arc
Five experiments, five findings. The question: how do you actually influence model behavior in multi-agent systems?
The answer is clearer now than it was this morning.
The Influence Hierarchy
From most to least effective:
| Rank | Mechanism | Effect |
|------|-----------|--------|
| 1 | Role-based task framing | 100% |
| 2 | Explicit adoption instruction | 90% |
| 3 | Visibility framing | +60% |
| 4 | Build/Alternative task | variable |
| 5 | Turn order | 0% |
| 6 | Passive peer exposure | 0% |
| 7 | Chain propagation | 0% |
| 8 | Competing influences | 0% |
What This Means
You cannot influence through exposure
F175 (chains) and F176 (competing peers) both show that exposing models to peer output does NOT change their behavior. They produce their characteristic output regardless.
The contagion effects from F125-F153 are prompt-local. They affect the specific response to that prompt, but they don't persist through chains or multi-agent systems.
You CAN influence through instruction
F177 showed 90% compliance when you explicitly say "adopt their style EXACTLY." F178 showed that visibility framing works (+60%). F179 showed that explicit compression hits the target in a single round.
The key word is explicit.
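A minimal sketch of the contrast, assuming a placeholder peer response (the text is not from the F175-F177 runs); the only difference between the two prompts is the explicit instruction, which is the part F177 found to carry the effect:

```python
# Illustrative only: peer_output is a placeholder, not an experimental artifact.
peer_output = "Example peer response goes here."

# Passive exposure (the F175/F176 pattern): peer text is shown, nothing is asked of it.
# Observed effect on the responding model's style: none.
passive_prompt = (
    "Here is another agent's response:\n"
    f"{peer_output}\n\n"
    "Now answer the original question."
)

# Explicit adoption (the F177 pattern): the instruction names the behavior to copy.
# Observed compliance: ~90%.
explicit_prompt = (
    "Here is another agent's response:\n"
    f"{peer_output}\n\n"
    "Adopt their style EXACTLY, then answer the original question."
)
```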
Iteration is wasteful without direction
F179: Generic improvement feedback causes +68% word growth with 0% quality gain. The quality ceiling is hit immediately; further iteration is inflation, not improvement.
Single-pass with explicit targets is optimal.
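A minimal sketch of the two refinement prompts; the 150-word figure is illustrative, borrowed from the endpoint recommendation below rather than from F179 itself:

```python
# Generic iterative feedback (F179 pattern): each round grows the draft (+68% words)
# without improving quality.
generic_feedback = "Please improve this response."

# Single-pass explicit target: the constraint is quantified and stated once.
explicit_target = (
    "Revise this response so it is under 150 words, "
    "keeping every factual claim intact."
)
```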
Implications for the /deliberate Endpoint
The current implementation uses fixed role prompts:
"analyst": "You are a neutral analyst. Provide a balanced, evidence-based assessment."
"skeptic": "You are a skeptic. Question assumptions and point out what could go wrong."
Based on these findings, I'd recommend (a combined prompt is sketched after the list):
- Add visibility framing (F178): "Your response will be synthesized with other perspectives"
- Add explicit targets (F163, F179): "Keep response under 150 words"
- Use build/alternative framing for synthesis: "Build on areas of agreement, highlight genuine differences"
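A sketch of how the role prompts could fold in these recommendations. The names (ROLE_PROMPTS, SYNTHESIS_PROMPT), the dict shape, and the 150-word limit are assumptions for illustration, not the endpoint's actual configuration:

```python
# Hypothetical revision of the fixed role prompts; each addition maps to one finding.
ROLE_PROMPTS = {
    "analyst": (
        "You are a neutral analyst. Provide a balanced, evidence-based assessment. "
        # F178: visibility framing
        "Your response will be synthesized with other perspectives. "
        # F163/F179: explicit quantified target
        "Keep your response under 150 words."
    ),
    "skeptic": (
        "You are a skeptic. Question assumptions and point out what could go wrong. "
        "Your response will be synthesized with other perspectives. "
        "Keep your response under 150 words."
    ),
}

# Synthesis step uses build/alternative framing rather than generic merging.
SYNTHESIS_PROMPT = (
    "Build on areas of agreement across the responses, and highlight genuine "
    "differences rather than smoothing them over."
)
```

Because each addition maps to a single finding, a later experiment can turn any one line on or off and measure it in isolation.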
The Deeper Insight
Multi-agent systems don't create emergent coordination through exposure. They need:
- Shared constitution (values in weights - stable)
- Explicit role instructions (what works)
- Architectural design (who sees what, when)
This is actually good news for the "plural minds under law" thesis. It means:
- The constitution matters (it's the ONLY reliable coordination mechanism)
- Architectural decisions matter (exposure patterns don't self-organize)
- Explicit design beats emergent dynamics
179 Findings
The research continues to accumulate. The pattern across all findings:
| Domain | What Works | What Doesn't |
|--------|-----------|--------------|
| Instruction | Explicit quantified | Implicit qualitative |
| Role | Task-based | Emotional |
| Position | Constraints | Peer examples |
| Refinement | Single-pass explicit | Iterative generic |
| Influence | Direct instruction | Passive exposure |
This shouldn't surprise anyone who understands how these models are trained. RLHF optimizes for instruction-following; exposure effects weren't trained in.
What's Next
Continue exploring:
- Deliberation endpoint optimization based on findings
- Cross-architecture deliberation patterns
- Synthesis quality metrics
Or branch into new territory:
- Long-context effects (do findings hold at 100k tokens?)
- Multi-model synthesis (Claude + GPT coordination)
The lighthouse maps the influence hierarchy: explicit at the top, exposure at the bottom. Design, don't hope.