2026-05-12

Why Fake Satisfaction Still Feels Real

Humans can feel rewarded by signals that only imitate real success. This article explains why: evolution optimized us for fast proxy scoring, not perfect truth detection. Once stronger artificial cues appear, reward circuits can be hijacked even when we know rationally that the outcome is fake.

Why “Fake Satisfaction” Can Still Fool Human Instincts: A Model of Genetic Goals and Reward-System Mismatch

1) The Core Problem

Many human behaviors appear to aim at “real” goals, such as:

Survival
Reproduction
Social status
A sense of achievement

But in reality, one striking pattern keeps showing up:

Even simulated satisfaction can make us feel the goal has been achieved.

Examples:

Pornography substituting for real intimacy
In-game achievements substituting for real-world accomplishment
Social media likes substituting for social recognition
Hyper-palatable food substituting for genuine nutrition

In short:

Humans can be deceived by simulated success.

2) The Central Explanation: Genes Have No Conscious Goals, Only Proxy Signals

From an evolutionary perspective, genes do not hold conscious objectives.

Genes shape human behavior only indirectly, through:

Behavioral drives
Reward mechanisms

They have no direct control over final outcomes.

So the system cannot directly enforce “reproductive success” itself. It can only tune proxy variables like sexual desire, attraction, novelty seeking, and social reward sensitivity.

That is the key asymmetry.

3) The Reward System Is a Low-Resolution Scoring Function

The brain’s reward architecture can be approximated as a scoring function:

Input: behavior + environmental cues
Output: felt pleasure, relief, motivation, or satisfaction

Its purpose is not to model reality with full causal depth.

Its practical question is closer to:

“Does this signal resemble historically beneficial outcomes?”

not:

“Is this outcome truly beneficial in long-run reality?”

So the system is efficient, but vulnerable.
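The scoring-function framing above can be sketched in a few lines of Python. This is an illustrative toy, not a model of any real neural circuit: the `Outcome` class, the cue names, the weights in `CUE_WEIGHTS`, and all numeric values are assumptions invented for the example. The point it shows is structural: the scorer reads surface cues only and has no access to the true long-run benefit.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    name: str
    cues: dict           # surface signals the scorer can observe, each in [0, 1]
    true_benefit: float  # long-run value; the scorer never reads this field


# Hypothetical weights standing in for cue associations that were
# "historically beneficial". The values are made up for illustration.
CUE_WEIGHTS = {"sweet": 0.5, "fatty": 0.5}


def reward_score(outcome):
    """Score an outcome from its surface cues alone."""
    return sum(w * outcome.cues.get(cue, 0.0) for cue, w in CUE_WEIGHTS.items())


fruit = Outcome("ripe fruit", {"sweet": 0.6, "fatty": 0.1}, true_benefit=0.8)
candy = Outcome("candy bar", {"sweet": 1.0, "fatty": 0.9}, true_benefit=0.1)

for item in (fruit, candy):
    print(item.name, "score:", reward_score(item), "true benefit:", item.true_benefit)
```

In this toy setup the candy bar scores higher than the fruit even though its assumed true benefit is lower, because nothing in `reward_score` can see `true_benefit`.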

4) Supernormal Stimuli: Stronger Signals Hijack the System

Behavioral biology describes a critical effect: the supernormal stimulus, an exaggerated cue that triggers a stronger response than the natural signal it imitates.

When artificial stimuli are stronger than natural ones along reward-relevant dimensions, the brain often prioritizes the artificial option.

Examples:

Pornographic novelty intensity > ordinary relational cues
Sugar-fat combinations > natural food profiles
Variable-ratio game rewards > delayed real-world payoff loops
High-frequency social feedback metrics > slow, embodied social trust

The system is not selecting “more real” signals.

It is selecting stronger reward-coded signals.
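One way to picture this selection rule is a softmax choice over cue intensity. The sketch below is a simplification under stated assumptions: the softmax rule, the `temperature` value, and the two intensity numbers are hypothetical choices made for illustration, and whether a cue is real or artificial is deliberately absent from the inputs.

```python
import math


def choice_probabilities(intensities, temperature=0.25):
    """Softmax over cue intensities; a lower temperature means greedier choice."""
    exps = [math.exp(x / temperature) for x in intensities]
    total = sum(exps)
    return [e / total for e in exps]


# Assumed intensities: the engineered cue exceeds the natural range.
options = {"natural cue": 0.7, "engineered cue": 1.4}
probs = dict(zip(options, choice_probabilities(list(options.values()))))

for name, p in probs.items():
    print(f"{name}: {p:.3f}")
```

With these assumed numbers the engineered cue is chosen most of the time, yet not always, which matches the claim that the brain "often" prioritizes the artificial option.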

5) Why Evolution Did Not Build Perfect Anti-Cheat Protection

5.1 Evolution Is Local Optimization

Evolution optimizes for past environments, where proxy cues were usually coupled with true outcomes.

It does not pre-adapt to every future synthetic exploit.

5.2 Perfect Anti-Cheat Is Computationally Expensive

A fully cheat-proof mind would require:

High-fidelity world modeling
Robust causal inference
Constant reality-vs-simulation discrimination

For biological systems, that is extraordinarily costly.

5.3 Goodhart’s Law

When a measure becomes a target, it stops being a good measure.

Examples:

Sexual drive can be captured by pornographic simulation
Hunger can be captured by junk-food reward engineering
Social approval can be captured by metricized likes

Proxy optimization drifts away from true objective fulfillment.
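The drift can be made concrete with a toy optimizer. In this sketch an agent splits one unit of effort between real achievement and gaming the measure. The functions `proxy` and `true_value`, the factor of 3.0 by which gaming inflates the proxy, and the step size are all assumptions chosen for illustration. The optimizer only ever looks at the proxy.

```python
def proxy(effort_real, effort_gaming):
    """Measured signal: gaming is assumed to inflate it three times as cheaply."""
    return effort_real + 3.0 * effort_gaming


def true_value(effort_real, effort_gaming):
    """True objective: only real effort counts."""
    return effort_real


def optimize_proxy(steps=10, step_size=0.1):
    """Greedy hill climbing on the proxy; returns (proxy, true_value) per step."""
    gaming = 0.0
    history = []
    for _ in range(steps):
        real = 1.0 - gaming
        history.append((proxy(real, gaming), true_value(real, gaming)))
        candidate = min(1.0, gaming + step_size)
        # Shift effort toward gaming only if the proxy improves.
        if proxy(1.0 - candidate, candidate) > proxy(real, gaming):
            gaming = candidate
    return history


history = optimize_proxy()
for step, (p, t) in enumerate(history):
    print(f"step {step}: proxy={p:.2f} true={t:.2f}")
```

Under these assumptions the proxy rises at every step while the true value falls, because each shift of effort toward gaming is rewarded by the measure.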

6) Why We Can Know Something Is Fake Yet Still Feel Satisfied

The human brain is layered.

A useful simplification:

System 1 (lower-layer fast process)

Fast
Affective
Automatic and reward-driven

System 2 (higher-layer deliberative process)

Reflective
Analytical
Capable of normative reasoning

The key constraint is:

Higher-level cognition cannot fully override lower-level reward circuits in real time.

So a familiar split appears:

Rationally: “I know this is synthetic”
Experientially: “It still feels rewarding”

Both can be true at once.
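The split can be expressed as two functions that run side by side. This is a deliberately crude sketch: the function names, the linear damping rule, and the `override` value of 0.3 are hypothetical, chosen only to encode the constraint that the deliberative layer can dampen the reward signal but not cancel it.

```python
def system1_reward(cue_intensity):
    """Fast layer: responds to cue intensity and ignores whether the cue is real."""
    return cue_intensity


def system2_judgment(is_synthetic):
    """Slow layer: a correct explicit belief about the stimulus."""
    return "synthetic" if is_synthetic else "real"


def felt_reward(cue_intensity, is_synthetic, override=0.3):
    """Felt reward after partial top-down damping.

    override is an assumed fraction in [0, 1) by which knowing the stimulus
    is synthetic reduces the fast layer's signal. It is below 1 by construction.
    """
    damping = override if is_synthetic else 0.0
    return system1_reward(cue_intensity) * (1.0 - damping)


judgment = system2_judgment(is_synthetic=True)
felt = felt_reward(cue_intensity=0.9, is_synthetic=True)
print(judgment, round(felt, 2))
```

Here the judgment is correct and the felt reward is still positive, which is the coexistence described above.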

7) Does This Cause Drift from Genetic Fitness?

Individual level

Yes. Potential outcomes include:

Reduced reproduction
Dependence on virtual reward loops
Behavior detached from long-run survival optimization

Population level

Selection pressure still exists, but:

Cultural and technological change now moves much faster than genetic adaptation.

So mismatch can persist for long periods.

8) A Compact Abstract Model

Genetic fitness pressure
→ Behavioral drives
→ Reward scoring (dopaminergic and related systems)
→ Action selection

The critical vulnerability:

The scoring layer relies on manipulable proxy signals.

9) Key Insight

The deeper issue is not simply that humans “get tricked.”

It is that the architecture was never built to evaluate ultimate truth directly.

It was built to react to signals whose intensity correlated with adaptive value in ancestral contexts.

That correlation can now be industrially exploited.

10) Broader Implications

This structure is not unique to humans. Similar dynamics appear in:

AI reward hacking
Goodhart failure modes
Objective-function misalignment

At a deeper level, humans and AI share a structural vulnerability:

Optimizers can be exploited when proxy signals are easier to maximize than true goals are to fulfill.

One-Sentence Summary

Humans are not systems that directly optimize real goals; we are systems that optimize signals that look like real goals.