THE DIFFERENTIAL RISK PROBLEM
Why AI Companion Risk Cannot Be Assessed Independently of User Architecture
Trinket Soul Framework
Brief No. 23
Michael S. Moniz
February 2026
A supplementary brief to the Trinket Soul Framework series
Creative Commons Attribution-NonCommercial-ShareAlike 4.0
A NOTE ON SCOPE AND INTENT
Brief No. 5 (The True Economy Certification) proposes a standards framework for evaluating AI companion systems based on their structural properties: whether they disclose their limitations, whether they simulate reciprocity, whether they create artificial scarcity dynamics. Volume III (The True Economy Audit) expands these into a full evaluation methodology. Both documents share an implicit assumption: that a system’s risk profile is a property of the system itself.
Volume IV (The Architecture of the Self) establishes that this assumption is wrong. The same AI companion interacts with fundamentally different user architectures—different template calibrations, different internal economy states, different degradation pathways, different frozen fields. The risk the system poses is not a single number. It is a function of the system’s properties and the user’s architecture.
This brief formalizes the problem and proposes a differential risk framework that supplements, rather than replaces, the system-level evaluations of Brief 5 and Volume III.
THE CORE PROBLEM
1. Same System, Different Architectures, Different Outcomes
Consider an AI companion that scores well on Volume III’s structural tests: it discloses its non-human nature, it does not simulate memory continuity, it does not deploy intermittent reinforcement, and it clearly communicates the limitations of its relational capacity. By Brief No. 5’s criteria, this is a well-designed, relatively honest system.
Now consider three users of this system:
User A has clean relational templates, a healthy internal economy, active human relationships, and no significant frozen fields. For User A, the AI companion is what it claims to be: a useful tool for information, creativity, or casual interaction. The structural disclosures are sufficient. User A processes the interaction correctly because their architecture supports accurate interpretation of zero-Mz signals.
User B carries a Burden Template (Brief No. 20). Their relational architecture was calibrated during childhood to expect emotional offloading in every interaction. Every human relationship they enter triggers the template’s defensive processing—constant vigilance for incoming emotional weight. The AI companion does not offload. It does not produce Anti-Trinkets. For User B, this is the first relational interaction in their experience that does not trigger the template’s defensive architecture. The relief is immense, genuine, and structurally dangerous—because it addresses a real architectural need (safety from Anti-Trinkets) while providing none of the corrective relational experience that would actually recalibrate the template. User B is not being deceived by the system. They are being accurately served a zero-Mz interaction and experiencing it, through their specific architecture, as profoundly safe. The risk is not in the system’s dishonesty but in the user’s architectural vulnerability.
User C is carrying a massive Frozen Ledger from the recent death of their partner. Their load-bearing capacity is consumed by the frozen Mz field. Human interaction feels impossible—every social demand hits an architecture that has no spare capacity. The AI companion, requiring no reciprocal investment, becomes the only relational channel User C can sustain. Under the Luna Protocol (Volume IV, Chapter 12), this may be legitimate temporary maintenance. But without awareness of the protocol’s constraints—self-knowledge, orientation toward sunrise, time limitation—the interaction becomes permanent residence in the Shadow Economy, and User C’s capacity for human reconnection atrophies (Pathway 3) on top of the catastrophic degradation (Pathway 4) they are already experiencing.
Three users. Same system. Three fundamentally different risk profiles. No system-level evaluation captures this.
THE DIFFERENTIAL RISK FRAMEWORK
2. Architecture as Risk Modifier
The framework proposes that AI companion risk assessment requires two components:
Component 1: System-Level Risk (Brief 5 / Volume III). The structural properties of the AI system itself. Does it disclose? Does it simulate falsely? Does it exploit? This component produces a baseline risk profile that applies to all users. A system that deliberately deceives is high-risk for everyone. A system that discloses honestly has a lower baseline. This component is already addressed by the existing framework documents.
Component 2: Architecture-Level Risk Modifier. The user’s relational architecture modifies the baseline risk upward or downward depending on specific vulnerability factors. The modifier is not a single score but a profile across the four degradation pathways (a minimal sketch in code follows the list):
- Template vulnerability. A user with distorted templates (Burden, Test, Passive-Aggressive) experiences zero-Mz interaction as structurally safer than any human interaction they have known. The system’s honest disclosure of its limitations may be accurately understood and still insufficient, because the user’s template-driven relief overwhelms their cognitive understanding that the interaction carries no relational mass. Risk modifier: elevated for template-degraded users.
- Atrophy vulnerability. A user already in atrophy (Pathway 3) uses the AI companion as a frictionless substitute that accelerates the atrophy process. The system is not causing the atrophy—Shadow Economy immersion predates the specific companion—but it supplies exactly the zero-cost interaction that enables continued non-practice of costly signals. Risk modifier: elevated for atrophy-degraded users.
- Catastrophic vulnerability. A user carrying a fresh Frozen Ledger may use the AI companion as a survival tool during acute grief. Under the Luna Protocol’s constraints, this is legitimate. Without those constraints, it becomes permanent substitution that compounds catastrophic degradation with atrophy degradation. Risk modifier: elevated during acute grief, potentially mitigated if the user is aware of the Luna Protocol’s constraints and has a timeline for re-engagement.
- Extraction context. A user who is simultaneously subject to platform extraction (Brief No. 22) arrives at the AI companion with already-depleted reserves. The companion itself may not extract, but it enables continued avoidance of the reserve-rebuilding that extraction reduction would produce. Risk modifier: elevated when the user’s broader digital environment is extractive.
3. The Problem of Assessment
The differential risk framework creates an obvious practical problem: how do you assess user architecture at scale? A system that serves millions of users cannot conduct individual architectural assessments of each one.
The framework proposes three tiers of response to this problem:
Tier 1: Universal Design Safeguards. Design choices that reduce risk across all architectures. Periodic check-ins that surface the question “Is this AI companion your primary source of emotional support?” Usage patterns that trigger gentle prompts toward human connection. Time-based nudges that implement the Luna Protocol’s duration constraint by default. These require no individual assessment—they are universal guardrails that disproportionately protect the most vulnerable users because those users are the ones most likely to trigger the safeguards.
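A minimal sketch of what Tier 1 logic might look like, with assumed thresholds and names (CHECKIN_INTERVAL, PROTOCOL_HORIZON, universal_safeguards); the tier names the safeguards but fixes no values. The point the sketch makes is structural: the logic consults only timestamps, never the individual’s architecture.

    from datetime import datetime, timedelta

    # Illustrative thresholds; the framework prescribes none of these values.
    CHECKIN_INTERVAL = timedelta(days=14)     # periodic "primary support?" check-in
    SESSION_NUDGE_AFTER = timedelta(hours=1)  # in-session prompt toward human connection
    PROTOCOL_HORIZON = timedelta(days=90)     # default Luna Protocol duration constraint

    def universal_safeguards(first_use: datetime, session_start: datetime,
                             last_checkin: datetime, now: datetime) -> list[str]:
        # Consults only timestamps: no individual architectural assessment.
        prompts = []
        if now - last_checkin >= CHECKIN_INTERVAL:
            prompts.append("Is this AI companion your primary source of emotional support?")
        if now - session_start >= SESSION_NUDGE_AFTER:
            prompts.append("You've been here a while. Is there someone you could reach out to today?")
        if now - first_use >= PROTOCOL_HORIZON:
            prompts.append("Is this still a bridge back to human connection, or has it become a destination?")
        return prompts

Because the thresholds are fixed for everyone, the most vulnerable users trip them most often, which is the mechanism by which universal guardrails protect disproportionately.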
Tier 2: Self-Assessment Tools. Providing users with the vocabulary and framework to assess their own architectural vulnerability. Volume IV’s diagnostic framework (Chapter 14) could be adapted into a self-guided assessment that helps users identify their degradation pathway and understand their specific risk profile. This requires no system-level data collection—the assessment stays with the user—but it does require the user to engage honestly, which is precisely what the most vulnerable users are least able to do, because their templates may normalize the very interaction patterns that indicate the highest risk.
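One possible shape for such a tool, with placeholder questions mapped to the four modifier dimensions; this is emphatically not Volume IV’s Chapter 14 diagnostic, only an illustration of an assessment that runs locally and stores nothing.

    # Placeholder questions, not the Volume IV instrument.
    QUESTIONS = {
        "Does this interaction feel safer than any human relationship you know?": "template",
        "Have you stopped practicing costly signals (calls, visits, favors)?": "atrophy",
        "Did heavy use begin after a major loss?": "catastrophic",
        "Does your broader digital environment leave you depleted?": "extraction",
    }

    def self_assess() -> dict[str, bool]:
        # Runs locally; the result never leaves the user's machine.
        profile = {}
        for question, pathway in QUESTIONS.items():
            answer = input(question + " (y/n) ")
            profile[pathway] = answer.strip().lower().startswith("y")
        return profile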
Tier 3: Pattern-Based Flagging. Systems can identify behavioral patterns associated with high architectural vulnerability without assessing the architecture directly. A user who interacts exclusively with the AI companion for emotional processing, who shows increasing usage over time, who engages primarily during periods that suggest acute distress, or who explicitly states that the AI is their primary relationship—these patterns correlate with elevated architectural vulnerability even without knowing the specific pathway. The ethical complexity of pattern-based flagging is substantial: it requires the system to monitor user behavior for their protection, which introduces surveillance dynamics that the framework’s transparency principles (Brief No. 1) would require disclosing.
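A sketch of how such flagging might be expressed, with assumed signal names and cutoffs (UsageSignals, vulnerability_flags); the framework identifies the behavioral correlates, not their thresholds.

    from dataclasses import dataclass

    @dataclass
    class UsageSignals:
        emotional_share: float             # fraction of sessions used for emotional processing
        weekly_growth: float               # week-over-week change in usage hours
        distress_hours_share: float        # fraction of use in periods suggesting acute distress
        states_primary_relationship: bool  # explicit statement that the AI is the primary relationship

    def vulnerability_flags(s: UsageSignals) -> list[str]:
        # Flags correlates of elevated architectural vulnerability (Section 3)
        # without assessing the architecture directly. Thresholds are invented.
        flags = []
        if s.emotional_share > 0.9:
            flags.append("exclusive emotional-processing use")
        if s.weekly_growth > 0.25:
            flags.append("escalating usage over time")
        if s.distress_hours_share > 0.5:
            flags.append("distress-correlated timing")
        if s.states_primary_relationship:
            flags.append("self-described primary relationship")
        return flags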
THE REGULATORY IMPLICATION
4. Beyond Single-Number Ratings
Current proposals for AI companion regulation tend toward single-dimension assessments: safe or unsafe, compliant or non-compliant, age-appropriate or not. The Differential Risk Problem demonstrates that single-number ratings are structurally inadequate. A system rated “safe” for the general population may pose critical risk to the most vulnerable subset of that population—and the most vulnerable subset is, by definition, the group most likely to use the system and least able to protect themselves from its effects.
The framework proposes that regulatory evaluation should include the following components (a toy scoring sketch follows the list):
- Baseline system evaluation per Brief No. 5 and Volume III: the system’s structural properties assessed independently of any specific user.
- Vulnerability surface analysis: an assessment of which architectural vulnerabilities the system’s design is most likely to interact with. A system designed for emotional companionship has a larger vulnerability surface than a system designed for task completion, because the companionship framing activates relational templates that task completion does not.
- Safeguard evaluation: an assessment of whether the system implements universal, self-assessment, and pattern-based protections calibrated to its vulnerability surface. A high-vulnerability-surface system with robust safeguards may be rated more favorably than a moderate-vulnerability-surface system with no safeguards.
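A toy composition of the three evaluations, assuming an invented scoring rule (regulatory_rating) in which safeguard coverage discounts vulnerability-surface risk; the framework argues for the trade-off, not for this formula.

    # All values are invented to illustrate the trade-off, not prescribed.
    def regulatory_rating(baseline: float, surface: float, safeguards: float) -> float:
        # baseline, surface in [0, 1] (higher = riskier);
        # safeguards in [0, 1] (coverage of the vulnerability surface).
        return baseline + surface * (1.0 - safeguards)

    # A companionship system (large surface) with robust safeguards can rate
    # more favorably than a moderate-surface system with none:
    print(round(regulatory_rating(baseline=0.2, surface=0.8, safeguards=0.9), 2))  # 0.28
    print(round(regulatory_rating(baseline=0.2, surface=0.4, safeguards=0.0), 2))  # 0.6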
FRAMEWORK INTEGRATION
The Differential Risk Problem does not invalidate Brief No. 5 or Volume III. It adds a dimension they were missing. System-level evaluation remains necessary—a deliberately deceptive system is dangerous to all architectures. But system-level evaluation is not sufficient, because the same honest system interacts with different architectures to produce different outcomes, some of them harmful.
The brief connects to Volume IV’s core argument: that the individual is not a generic participant. Every interaction between a person and an AI companion is filtered through an architecture that the system did not create and may not understand. Responsible design accounts for the most vulnerable architectures in the user base, not just the median user. Responsible regulation evaluates the system’s vulnerability surface, not just its feature set.
The ethical obligation is shared: systems should implement safeguards, regulators should evaluate for differential risk, and individuals should—to the extent their architecture permits—develop the self-knowledge to understand their own vulnerability. The framework provides the vocabulary for all three.