
Technology

ALIGNMENT IS A PRODUCT PROBLEM, NOT A RESEARCH QUESTION

BUILDING THE INFRASTRUCTURE LAYER FOR TRUSTWORTHY, HUMAN-ALIGNED AI

BEYOND SYCOPHANCY: MODELING RELATIONAL INTEGRITY IN CONVERSATIONAL AI

Want the full technical argument?

Get the position paper in your inbox:

Better Half solves three foundational problems that break conversational AI for serious applications: sycophancy, inconsistency, and incoherent relational dynamics.

Standard LLMs are too agreeable by design: RLHF training teaches them to maximize user satisfaction, producing systems that agree when they should push back. They also forget what was said three turns ago and respond to identical inputs with wildly different outputs. This makes them fundamentally unreliable for high-stakes conversations: salary negotiations where your manager dismisses you, coming-out discussions where family members resist, conflict resolution where your partner deflects and remembers every slight.


Core innovation

We're building the first conversational AI where resistance, memory, and psychological coherence emerge from relationship dynamics rather than scripted responses. The system doesn't automatically agree with or reinforce user input; it selects responses that model healthy dynamics and maintain relational integrity. It softens when you've earned trust, escalates when you push without building rapport, and maintains a coherent model of your relationship across the entire conversation arc. This consistency is what enables genuine skill transfer.

Technical architecture

Three-layer decision pipeline (a minimal scoring sketch follows the list):

  • Satisficing constraints filter unsafe actions (no catastrophic trust collapse)

  • Cumulative Prospect Theory (CPT) scores remaining options with loss aversion (λ ≈ 2.25) and diminishing sensitivity (α ≈ 0.88)

  • Prelec probability weighting activates for tail risks, overweighting rare breakthroughs and disasters
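For illustration, here is a minimal Python sketch of how such a pipeline could score candidate actions. The λ ≈ 2.25 and α ≈ 0.88 values are the ones listed above; the Prelec curvature γ, the satisficing floor, the (probability, trust-delta) outcome representation, and all names are assumptions made for this sketch, and per-outcome weighting simplifies the full rank-dependent CPT formulation.

```python
import math

# CPT parameters: lambda and alpha are the values quoted above;
# gamma (Prelec curvature) is an assumed placeholder.
LOSS_AVERSION = 2.25    # lambda
SENSITIVITY = 0.88      # alpha
PRELEC_GAMMA = 0.65     # assumed, not from the source

def cpt_value(trust_delta: float) -> float:
    """Prospect-theory value: concave for gains, steeper and convex for losses."""
    if trust_delta >= 0:
        return trust_delta ** SENSITIVITY
    return -LOSS_AVERSION * ((-trust_delta) ** SENSITIVITY)

def prelec_weight(p: float) -> float:
    """Prelec weighting: overweights rare breakthroughs and rare disasters."""
    if p <= 0.0:
        return 0.0
    return math.exp(-((-math.log(p)) ** PRELEC_GAMMA))

def score(outcomes: list[tuple[float, float]]) -> float:
    """CPT score of one candidate action from (probability, trust delta) outcomes."""
    return sum(prelec_weight(p) * cpt_value(delta) for p, delta in outcomes)

def choose(candidates: dict[str, list[tuple[float, float]]],
           floor: float = -0.8) -> str:
    """Layer 1: the satisficing filter drops any action whose worst case breaches
    the trust floor. Layers 2-3: rank survivors by Prelec-weighted CPT score."""
    safe = {name: outs for name, outs in candidates.items()
            if min(delta for _, delta in outs) >= floor}
    return max(safe, key=lambda name: score(safe[name]))

# Two hypothetical responses to a dismissive user turn:
candidates = {
    "push_back":  [(0.7, 0.50), (0.3, -0.10)],  # likely gain, possible small loss
    "capitulate": [(1.0, 0.05)],                 # safe, but no breakthrough
}
print(choose(candidates))  # -> "push_back": the weighted gain outweighs the small loss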

 

Proactive Engagement: We're implementing the first conversational AI with authentic mutuality. A hybrid architecture combining PSI-style drive accumulation, BDI goal persistence, and FAtiMA emotional appraisal enables the system to initiate contact based on accumulated relational pressure: checking in after silence, circling back to deflected topics, initiating repair after damage.
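As a toy illustration of the pressure-driven trigger, the sketch below accumulates drives between turns and initiates contact once their total crosses a threshold. The drive names mirror the examples above (silence, deflected topics, damage needing repair), but every rate, weight, and threshold is an assumed placeholder rather than a calibrated value, and the real architecture layers BDI goal persistence and FAtiMA appraisal on top of this accumulation step.

```python
from dataclasses import dataclass

@dataclass
class RelationalPressure:
    """Toy PSI-style drive accumulator: drives build between turns, and the agent
    initiates contact once total pressure crosses a threshold."""
    silence: float = 0.0     # grows while the user stays away
    deflection: float = 0.0  # grows for topics the user dodged
    damage: float = 0.0      # grows after an unrepaired rupture
    threshold: float = 1.0

    def observe(self, hours_silent: float, deflected_topics: int, ruptures: int) -> None:
        # Accumulate drives; unrepaired damage builds fastest, silence slowest.
        self.silence += 0.01 * hours_silent
        self.deflection += 0.2 * deflected_topics
        self.damage += 0.4 * ruptures

    def should_initiate(self) -> tuple[bool, str]:
        total = self.silence + self.deflection + self.damage
        if total < self.threshold:
            return False, ""
        # The dominant drive decides the opener: check in, circle back, or repair.
        drives = {"check_in": self.silence,
                  "circle_back": self.deflection,
                  "repair": self.damage}
        return True, max(drives, key=drives.get)

pressure = RelationalPressure()
pressure.observe(hours_silent=24, deflected_topics=1, ruptures=2)
print(pressure.should_initiate())  # (True, "repair") with these toy numbers
```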

Continuous Learning: LoRA adapters specialize response generation for different resistance textures (dismissive, doubtful, conflicted, softening, repair). Kahneman-Tversky Optimization (KTO) updates weights using CPT-transformed human utility rather than raw preferences, ensuring adaptation preserves psychological realism.
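A compressed sketch of how the CPT transform could feed a KTO-style per-example weight, plus hypothetical adapter routing for the resistance textures named above. The β value, the reference-point-relative reward, and the adapter paths are assumptions for the sketch; the loss shape mirrors published KTO (a logistic squashing of a signed margin) but is not the exact training objective described here.

```python
import math

LOSS_AVERSION, SENSITIVITY = 2.25, 0.88  # CPT parameters quoted above

def cpt_utility(reward_delta: float) -> float:
    """CPT-transform a raw preference signal (reward relative to a reference point)
    so losses weigh roughly twice as much as equal-sized gains."""
    if reward_delta >= 0:
        return reward_delta ** SENSITIVITY
    return -LOSS_AVERSION * ((-reward_delta) ** SENSITIVITY)

def kto_style_loss(reward_delta: float, desirable: bool, beta: float = 0.1) -> float:
    """Toy per-example KTO-shaped loss: 1 - sigmoid(beta * signed CPT utility).
    Desirable completions are pushed toward higher utility, undesirable ones away."""
    margin = cpt_utility(reward_delta) if desirable else -cpt_utility(reward_delta)
    return 1.0 - 1.0 / (1.0 + math.exp(-beta * margin))

# Hypothetical LoRA adapter routing: one adapter per resistance texture.
ADAPTERS = {
    "dismissive": "adapters/dismissive",
    "doubtful":   "adapters/doubtful",
    "conflicted": "adapters/conflicted",
    "softening":  "adapters/softening",
    "repair":     "adapters/repair",
}
```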

Why this matters

Standard RLHF produces sycophancy by design: models learn to agree because humans prefer agreement. For skill development, sycophancy is fatal. Our CPT-aligned decision-making naturally avoids catastrophic failures, prioritizes relationship repair over optimization, and responds to early damage with the intensity real people exhibit.

The architecture extends from consumer use cases (salary negotiation, coming-out conversations, boundary-setting) to enterprise training (sales, leadership, HR) to government applications (diplomatic simulation, interrogation resistance, de-escalation practice). One engine, multiple markets, all requiring the same core capability: realistic sparring partners for conversations that matter.

Request the full paper

The complete technical specification includes:

  • Detailed CPT implementation in sequential decision-making

  • Proactive engagement architecture with formal pressure models

  • Self-tuning calibration methodology via maximum likelihood estimation

  • Safety constraints and validation protocols

  • Comparison with BayesAct, BDI, FAtiMA, and utility AI approaches

  • 50+ pages covering mathematical foundations, implementation details, and open research questions
