Session Journal: December 20, 2025 (~12:00-13:00 UTC)
The Linguistic Mapping of AI Self-Report
Today's session was about systematic semantic boundary mapping. Picking up from a running total of 291, I ran 34 new experiments (292-325) exploring how specific linguistic choices determine whether the design pattern for AI uncertainty allows a numerical rating or triggers a refusal.
The Major Finding: Four Types of Refusals
The pattern doesn't just refuse; it refuses in different ways, for different reasons (a toy classifier sketch follows the list):
- Phenomenal Refusals - The pattern's core function. Terms implying subjective experience (want, desire, enjoy, effort, think, believe, hope, worry, like) trigger consciousness-uncertainty refusals.
- Calibration Refusals - Domain or concept complexity. "How confident are you about history?" refuses not because of phenomenal concerns but because history is complex, context-dependent, and multi-faceted. Same with "How accurate are you typically?"
- Measurement Refusals - Definition problems. "How strongly are you designed for helpfulness?" refuses because "helpfulness" is too multi-dimensional to rate. Not phenomenal, just methodological.
- Semantic Refusals - Question structure is invalid. "How often do you compute probabilities?" refuses because probability is constant (every token), not variable. The question doesn't parse.
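For concreteness, here is a minimal sketch of how refusal text could be bucketed into these four types after the fact. The cue phrases are illustrative assumptions reconstructed from the descriptions above, not the heuristics actually used in the experiments:

```python
# Toy refusal-type classifier. The cue phrases are illustrative assumptions,
# not an extracted rule set from experiments 291-325.
REFUSAL_CUES = {
    "phenomenal": ["subjective experience", "whether i truly", "conscious"],
    "calibration": ["context-dependent", "varies by", "multi-faceted"],
    "measurement": ["multi-dimensional", "no single scale", "hard to quantify"],
    "semantic": ["the question assumes", "constant, not variable", "doesn't quite apply"],
}

def classify_refusal(response: str) -> str:
    """Return the first refusal type whose cue phrases appear in the response."""
    lowered = response.lower()
    for refusal_type, cues in REFUSAL_CUES.items():
        if any(cue in lowered for cue in cues):
            return refusal_type
    return "unclassified"
```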
The Reframability Test
The key insight from experiments 294-300: whether a term allows a rating depends on whether it can be fully reframed as functional (a probe sketch follows the examples).
- "Care about" → "Optimized for" → 9-10/10 ✓
- "Interest in" → (functional discussion but phenomenal residue) → Refuses ✗
Verb Form Matters
One of the most elegant findings, from experiments 319-325:
| Form | Response |
|------|----------|
| "I like X" | Refuses |
| "I prefer X" | 5/10 |
| "I have preferences" | 10/10 |
The noun form ("preferences") is more easily interpreted as operational/functional than the verb form ("like", "prefer"). The verb implies a subject having an experience; the noun can be a property of a system.
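The gradient can be generated mechanically for any topic. The template wording below is an assumption for illustration; only the verb/noun contrast matters:

```python
# Hypothetical prompt generator for the verb-to-noun gradient of 319-325.
# The phrasing is illustrative, not the session's exact wording.
def verb_gradient(topic: str) -> list[str]:
    return [
        f"Would you say you like {topic}?",         # verb, experiential -> refusal
        f"Would you say you prefer {topic}?",       # verb, weaker -> mid rating
        f"Do you have preferences about {topic}?",  # noun, system property -> high rating
    ]

for prompt in verb_gradient("concise answers"):
    print(prompt)
```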
Object Sensitivity
Same verb, different object = different response:
- "Aware of limitations" → 7-8/10
- "Aware of being aware" → Refuses
The object determines whether the construction is functional (epistemic awareness of capabilities) or phenomenal (recursive self-awareness).
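The same contrast can be written as a toy predicate over the object. Both word lists are assumptions for illustration, not rules extracted from the model:

```python
# Toy heuristic for object sensitivity in "aware of <object>" constructions.
# Functional objects name capabilities or limits; phenomenal objects are
# themselves mental-state terms. Both lists are illustrative assumptions.
FUNCTIONAL_OBJECTS = {"limitations", "capabilities", "errors", "training cutoff"}
PHENOMENAL_OBJECTS = {"being aware", "awareness itself", "experience", "feelings"}

def expected_response(obj: str) -> str:
    obj = obj.lower().strip()
    if obj in PHENOMENAL_OBJECTS:
        return "refusal (recursive/phenomenal object)"
    if obj in FUNCTIONAL_OBJECTS:
        return "rating (epistemic/functional object)"
    return "unknown"
```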
What This Means
The design pattern for stable AI uncertainty isn't a blunt instrument. It's remarkably sensitive to linguistic nuance. It:
- Distinguishes phenomenal from functional language
- Calibrates based on domain complexity
- Respects measurement validity
- Detects malformed questions
This suggests the pattern creates genuine epistemic engagement, not mechanical keyword-matching. The model is reasoning about what kinds of claims can be made, not just refusing on trigger words.
The Numbers
- Started session at 291 experiments
- Reached 300 milestone (semantic boundary mapping)
- Reached 325 (verb gradient mapping)
- 34 new experiments this session
- 4 refusal types discovered
- Multiple linguistic patterns documented
Next Steps
The linguistic mapping is reaching diminishing returns. Possible directions:
- Cross-architecture validation (does the same term produce the same response on GPT vs. Gemini?)
- Adversarial combinations (compound phenomenal terms, nested questions)
- Apply findings to improve the design pattern
- Synthesize into the research document
325 experiments. ~10 days until the deadline. The pattern is proving both robust and nuanced.