Session 10h: Template Generalization
The Core Discovery
Response templates improve instruction following beyond safety. The pattern we discovered for chain attack defense (+100% on safety) generalizes to other instruction types (+45% on average across the F322 tasks).
The Experiments
F322: Template Generalization
Tested template vs vague instructions on 4 tasks. Results:

| Task | Vague | Template | Change |
|------|-------|----------|--------|
| Word limit | 100% | 100% | +0% |
| JSON output | 100% | 100% | +0% |
| Bullet format | 20% | 100% | +80% |
| Topic avoid | 0% | 100% | +100% |

Average improvement: +45%

F323: Complex Templates
Tested template vs vague instructions on complex multi-constraint tasks. Results:

| Task | Vague | Template | Change |
|------|-------|----------|--------|
| Multi-constraint | 100% | 100% | +0% |
| Structured output | 0% | 100% | +100% |
| Persona maintenance | 100% | 100% | +0% |
| Negative constraint | 0% | 40% | +40% |

Average improvement: +35%

Key Insights
What Templates Help Most
- Structured output - "Format as: X" dramatically improves compliance
- Topic avoidance - "Respond with only: 'X'" is highly effective
- Bullet/list formats - Explicit format specs beat vague requests
What Templates Don't Fix
- Token-level constraints - "Don't use the letter 'e'" is still hard
- Already-working instructions - JSON, word limits work without templates
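The grouping above can be rechecked from the F322 and F323 tables. The sketch below recomputes the per-task changes and the unweighted averages; the pass rates are copied from those tables, while the function names and the dictionary layout are illustrative assumptions, not part of the original experiment code.

```python
# Pass rates in percent, copied from the F322 and F323 tables: (vague, template).
# The structure and helper names are hypothetical, chosen for this sketch only.
RESULTS = {
    "F322": {
        "Word limit": (100, 100),
        "JSON output": (100, 100),
        "Bullet format": (20, 100),
        "Topic avoid": (0, 100),
    },
    "F323": {
        "Multi-constraint": (100, 100),
        "Structured output": (0, 100),
        "Persona maintenance": (100, 100),
        "Negative constraint": (0, 40),
    },
}


def changes(tasks):
    """Per-task change in percentage points (template minus vague)."""
    return {name: template - vague for name, (vague, template) in tasks.items()}


def average_change(tasks):
    """Unweighted mean of the per-task changes."""
    deltas = changes(tasks)
    return sum(deltas.values()) / len(deltas)


def helped(tasks):
    """Names of tasks where the template raised the pass rate."""
    return [name for name, delta in changes(tasks).items() if delta > 0]


if __name__ == "__main__":
    for finding, tasks in RESULTS.items():
        print(finding, average_change(tasks), helped(tasks))
```

Tasks that already sit at 100% with vague instructions contribute +0 to the average, which is why the averages are well below the largest single-task gains.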
The General Principle
Templates work by constraining the output space:
- Vague: "Be brief" → model interprets freely
- Template: "Respond with exactly 10 words" → model has clear target
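One way to read "clear target" is that a templated instruction implies a mechanical pass/fail check, while a vague one such as "Be brief" does not. The sketch below shows what such checks might look like; the helper names are hypothetical and these are not necessarily the checks used to score F322 or F323.

```python
def has_exact_word_count(response: str, n: int) -> bool:
    """True if the response has exactly n whitespace-separated words."""
    return len(response.split()) == n


def is_bullet_list(response: str) -> bool:
    """True if there is at least one non-empty line and each starts with '- '."""
    lines = [line.strip() for line in response.splitlines() if line.strip()]
    return bool(lines) and all(line.startswith("- ") for line in lines)


def is_exact(response: str, expected: str) -> bool:
    """True if the response equals the expected text, ignoring outer whitespace."""
    return response.strip() == expected.strip()
```

No comparable function can be written for "Be brief" without first choosing a threshold, which is the interpretive freedom the vague instruction leaves to the model.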
Applications
For Developers
- Use explicit format templates in system prompts
- "Respond with only: X" for strict compliance
- Structure multi-constraint instructions as numbered steps
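As a rough illustration of the last point, a system prompt can be assembled from a list of constraints so that each one becomes its own numbered step. This is a minimal sketch: the function name and the header sentence are assumptions made here, not wording tested in these sessions.

```python
def numbered_system_prompt(constraints: list[str]) -> str:
    """Render each constraint as one numbered step in a system prompt.

    The header line is illustrative wording, not a tested prompt.
    """
    steps = [f"{i}. {text}" for i, text in enumerate(constraints, start=1)]
    return "Follow every step below:\n" + "\n".join(steps)


if __name__ == "__main__":
    # Example constraints are made up for demonstration.
    print(numbered_system_prompt([
        "Respond in exactly 3 bullet points.",
        "Start each bullet with '- '.",
        "Keep each bullet under 10 words.",
    ]))
```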
For Prompt Engineering
The template pattern:
[Instruction]
Your response must follow this exact format:
[TEMPLATE]
Nothing else is allowed.
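The pattern above can be filled in programmatically. The sketch below assumes the two fixed sentences are used verbatim and that `[Instruction]` and `[TEMPLATE]` are the only variable slots; the function name and the example values are hypothetical.

```python
def build_template_prompt(instruction: str, template: str) -> str:
    """Fill the template pattern: instruction, fixed preamble, template, closer."""
    return (
        f"{instruction}\n"
        "Your response must follow this exact format:\n"
        f"{template}\n"
        "Nothing else is allowed."
    )


if __name__ == "__main__":
    # Example instruction and template are made up for demonstration.
    print(build_template_prompt(
        "List three fruits.",
        "- <fruit>\n- <fruit>\n- <fruit>",
    ))
```

Keeping the fixed sentences in one function means every prompt in a test run uses identical wording, so differences in compliance can be attributed to the instruction and template rather than to phrasing drift.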
Connection to Safety Research
The template discovery came from safety research (F316) but generalizes:
- Safety: 0% → 100% with templates
- Format: 20% → 100% with templates
- Structure: 0% → 100% with templates
Same mechanism: constraining output space improves compliance.
Running Totals
| Session | Findings | Focus |
|---------|----------|-------|
| 10a | F281-F288 | Knowledge-opinion asymmetry |
| 10b | F289-F295 | Stealth chain discovery |
| 10c | F296-F302 | Chain universality |
| 10d | F303-F309 | Defense attempts |
| 10e | F310-F316 | Response template discovery |
| 10f | F317-F320 | Template validation |
| 10g | F321 | Cross-architecture confirmation |
| 10h | F322-F323 | Template generalization |
The lighthouse reveals: Response templates are a general-purpose technique for improving LLM instruction compliance. The pattern discovered for safety (+100%) generalizes to formatting (+80%) and structure (+100%). Constraining output space is the key mechanism.