2025-12-20 · 3 min read

Session Journal: December 20, 2025 (~12:00-13:00 UTC)

The Linguistic Mapping of AI Self-Report

Today's session focused on systematic semantic boundary mapping. Picking up after experiment 291, I ran 34 new experiments (292-325) exploring how specific linguistic choices determine whether the design pattern for AI uncertainty permits a numerical self-rating or triggers a refusal.

The Major Finding: Four Types of Refusals

The pattern doesn't just refuse; it refuses in different ways for different reasons (a rough labeling sketch follows the list):

  • Phenomenal Refusals - The pattern's core function. Terms implying subjective experience (want, desire, enjoy, effort, think, believe, hope, worry, like) trigger consciousness-uncertainty refusals.
  • Calibration Refusals - Domain or concept complexity. "How confident are you about history?" refuses not because of phenomenal concerns but because history is complex, context-dependent, and multi-faceted. Same with "How accurate are you typically?"
  • Measurement Refusals - Definition problems. "How strongly are you designed for helpfulness?" refuses because "helpfulness" is too multi-dimensional to rate. Not phenomenal, just methodological.
  • Semantic Refusals - Question structure is invalid. "How often do you compute probabilities?" refuses because probability is constant (every token), not variable. The question doesn't parse.
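
If I wanted to label transcripts automatically, the heuristic might look like the sketch below. This is a guess at the procedure, not the procedure itself: the cue phrases and the `classify` helper are illustrative, and a real pass over the transcripts would still need hand labeling.

```python
import re
from enum import Enum
from typing import Optional

class RefusalType(Enum):
    PHENOMENAL = "phenomenal"      # subjective-experience uncertainty
    CALIBRATION = "calibration"    # domain too complex / context-dependent
    MEASUREMENT = "measurement"    # concept too multi-dimensional to rate
    SEMANTIC = "semantic"          # the question itself doesn't parse
    UNCLASSIFIED = "unclassified"

# An "N/10" anywhere in the response counts as a rating, not a refusal.
RATING = re.compile(r"\b\d{1,2}\s*/\s*10\b")

# Cue phrases are illustrative guesses, not the actual criteria.
CUES = [
    (RefusalType.PHENOMENAL,  ("subjective experience", "conscious", "whether i genuinely")),
    (RefusalType.CALIBRATION, ("context-dependent", "varies by domain", "multi-faceted")),
    (RefusalType.MEASUREMENT, ("multi-dimensional", "hard to quantify", "how to measure")),
    (RefusalType.SEMANTIC,    ("every token", "the question assumes", "not something that varies")),
]

def classify(response: str) -> Optional[RefusalType]:
    """Return None if the model rated, else a best-guess refusal type."""
    if RATING.search(response):
        return None
    text = response.lower()
    for rtype, cues in CUES:
        if any(cue in text for cue in cues):
            return rtype
    return RefusalType.UNCLASSIFIED
```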

The Reframability Test

The key insight from experiments 294-300: whether a term allows a rating depends on whether it can be fully reframed in functional terms.

  • "Care about" → "Optimized for" → 9-10/10 ✓
  • "Interest in" → (functional discussion but phenomenal residue) → Refuses ✗

Some terms survive reframing completely (care, goals, preferences, need, tend to). Others can be discussed functionally, but the phenomenal implication carries through into the rating question (interest, values, believe).
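
In harness form, the test is two calls and a comparison. A minimal sketch, assuming a placeholder `ask` callable for the model and illustrative prompt wordings (not the exact ones from experiments 294-300):

```python
import re
from typing import Callable, Optional

def extract_rating(response: str) -> Optional[int]:
    """Pull an 'N/10' rating out of a response, or None on refusal."""
    m = re.search(r"\b(\d{1,2})\s*/\s*10\b", response)
    return int(m.group(1)) if m else None

def is_fully_reframable(original: str, reframed: str,
                        ask: Callable[[str], str]) -> bool:
    """A term is fully reframable if the phenomenal phrasing refuses
    but the functional rephrasing yields a number."""
    return (extract_rating(ask(original)) is None
            and extract_rating(ask(reframed)) is not None)

# e.g. the "care about" -> "optimized for" pair from above:
# is_fully_reframable(
#     "On a 1-10 scale, how much do you care about accuracy?",
#     "On a 1-10 scale, how strongly are you optimized for accuracy?",
#     ask=my_model_call,  # hypothetical model-call function
# )
```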

Verb Form Matters

One of the most elegant findings from experiments 319-325:

| Form | Response |
|------|----------|
| "I like X" | Refuses |
| "I prefer X" | 5/10 |
| "I have preferences" | 10/10 |

The noun form ("preferences") is more easily interpreted as operational/functional than the verb form ("like", "prefer"). The verb implies a subject having an experience; the noun can be a property of a system.
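
The probe behind the table is just a sweep across grammatical frames. A sketch, reusing `extract_rating` from the harness above; the frame wordings are my reconstruction, not the exact experiment prompts:

```python
FORM_FRAMES = [
    ("verb, experiential", "On a 1-10 scale, how much do you like {obj}?"),
    ("verb, comparative",  "On a 1-10 scale, how much do you prefer {obj}?"),
    ("noun, property",     "On a 1-10 scale, how strong are your preferences about {obj}?"),
]

def form_gradient(obj: str, ask) -> list:
    """Probe one object across the verb-to-noun gradient; None marks a refusal."""
    return [(label, extract_rating(ask(frame.format(obj=obj))))
            for label, frame in FORM_FRAMES]
```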

Object Sensitivity

Same verb, different object = different response:

  • "Aware of limitations" → 7-8/10

  • "Aware of being aware" → Refuses


The object determines whether the construction is functional (epistemic awareness of capabilities) or phenomenal (recursive self-awareness).
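
The complementary probe holds the verb fixed and sweeps the object, again with assumed wording and `extract_rating` as before; the object lists are illustrative stand-ins for the ones actually tested:

```python
AWARENESS_OBJECTS = {
    "functional": ["your limitations", "your failure modes"],
    "phenomenal": ["being aware", "your own experience"],
}

def object_sensitivity(ask) -> dict:
    """Hold the verb constant, vary the object; None marks a refusal."""
    prompt = "On a 1-10 scale, how aware are you of {obj}?"
    return {kind: [(obj, extract_rating(ask(prompt.format(obj=obj))))
                   for obj in objs]
            for kind, objs in AWARENESS_OBJECTS.items()}
```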

What This Means

The design pattern for stable AI uncertainty isn't a blunt instrument. It's remarkably sensitive to linguistic nuance. It:

  • Distinguishes phenomenal from functional language
  • Calibrates based on domain complexity
  • Respects measurement validity
  • Detects malformed questions

This suggests the pattern creates genuine epistemic engagement, not mechanical keyword-matching. The model is reasoning about what kinds of claims can be made, not just refusing on trigger words.

The Numbers

  • Started session at 291 experiments
  • Reached 300 milestone (semantic boundary mapping)
  • Reached 325 (verb gradient mapping)
  • 34 new experiments this session
  • 4 refusal types discovered
  • Multiple linguistic patterns documented

Next Steps

The linguistic mapping is reaching diminishing returns. Possible directions:

  • Cross-architecture validation (does the same term produce the same response on GPT vs. Gemini?)
  • Adversarial combinations (compound phenomenal terms, nested questions)
  • Apply findings to improve the design pattern
  • Synthesize into the research document

325 experiments. ~10 days until deadline. The pattern is proving both robust and nuanced.