RecoveryRace-v0
Category: Benchmark Environment
Agents: 2
Difficulty: Hard
Source: coopetition_gym/envs/benchmark_envs.py
Overview
RecoveryRace-v0 models post-crisis trust recovery between two agents after a major breach. Starting from a damaged state (low trust, high reputation damage), agents must find the optimal sequence of cooperative actions to rebuild the relationship.
This environment is specifically designed to test understanding of TR-2 trust dynamics and the mathematical constraints on recovery. The key insight is that reputation damage creates a trust ceiling that limits how much trust can be recovered.
Figure: Trust recovery constrained by reputation damage. The red dashed line shows the trust ceiling (Θ = 1 - R), which rises as reputation heals. Trust cannot exceed this ceiling regardless of cooperation level.
MARL Classification
| Property | Value |
|---|---|
| Game Type | Markov Game (2-player, general-sum) with constrained state recovery |
| Cooperation Structure | Cooperative goal (joint recovery) with individual temptation to defect |
| Observability | Full (agents observe trust, reputation damage, and ceiling) |
| Communication | Implicit (through actions only) |
| Agent Symmetry | Symmetric (identical capabilities, mutual damage) |
| Reward Structure | Mixed (integrated utility + recovery bonus) |
| Action Space | Continuous: A_i = [0, 100] |
| State Dynamics | Deterministic with ceiling constraints |
| Horizon | Extended, T = 150 (longer horizon for recovery) |
| Canonical Comparison | Constrained optimization in trust repair; cf. Kim et al. (2004) “Removing the Shadow of Suspicion” |
Formal Specification
This environment is formalized as a 2-player Markov Game with constrained recovery dynamics.
Agents
N = {1, 2} (symmetric dyad starting from damaged state)
| Property | Value | Description |
|---|---|---|
| Endowment | 100.0 | Equal for both |
| Baseline | 35.0 | 35% cooperation threshold |
| Bargaining α | 0.50 | Equal surplus sharing |
State Space
S ⊆ ℝ¹⁷ (identical structure to TrustDilemma-v0)
Critical state property: Trust ceiling $\Theta_{ij} = 1 - R_{ij}$ constrains recovery.
Action Space
$A_i = [0, 100] \subset \mathbb{R}$ for each agent
Uniaxial Treatment: This environment uses the single-dimension action space characteristic of Coopetition-Gym v1.x. Competition emerges through value capture dynamics rather than explicit competitive actions.
Initial State (Post-Crisis)
| Variable | Initial Value | Interpretation |
|---|---|---|
| Trust $\tau_{ij}(0)$ | 0.25 | Very low (crisis aftermath) |
| Reputation Damage $R_{ij}(0)$ | 0.50 | High damage (serious past violations) |
| Trust Ceiling $\Theta_{ij}(0)$ | 0.50 | Maximum recoverable trust initially |
Key constraint: Recovery target $\tau^* = 0.90$ is initially impossible ($\Theta = 0.50$). Agents must wait for reputation decay.
Transition Dynamics
Trust Update (standard TR-2):
\[\tau_{ij}(t+1) = \operatorname{clip}\big(\tau_{ij}(t) + \Delta\tau_{ij},\ 0,\ \Theta_{ij}\big)\]
Reputation Decay (slow healing):
\[R_{ij}(t+1) = R_{ij}(t) \cdot (1 - \delta_R)\]
where $\delta_R = 0.01$ (1% decay per step).
Ceiling Evolution:
\[\Theta_{ij}(t) = 1 - R_{ij}(t)\]
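The three update rules above can be combined into one step function. The following is a minimal sketch, not the environment's implementation: `step_dynamics` and `clip` are hypothetical helpers, the trust increment `delta_tau` is passed in because its formula is not given in this section, and computing the ceiling from the pre-decay damage is an assumption about update order.

```python
def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def step_dynamics(trust, rep_damage, delta_tau, delta_r=0.01):
    """Apply one ceiling-constrained update (illustrative only)."""
    ceiling = 1.0 - rep_damage                      # ceiling = 1 - R(t)
    new_trust = clip(trust + delta_tau, 0.0, ceiling)
    new_rep_damage = rep_damage * (1.0 - delta_r)   # multiplicative healing
    return new_trust, new_rep_damage


# Post-crisis start: trust 0.25, damage 0.50.
# A trust gain of 0.40 would give 0.65, but the ceiling caps it at 0.50.
trust, rep = step_dynamics(0.25, 0.50, delta_tau=0.40)
print(trust, rep)
```

However large the cooperative signal, the clipped trust never exceeds the ceiling, which is why the initial state cannot reach the 0.90 target directly.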
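Time to reach ceiling = 0.90:
- Need $R_{ij} \leq 0.10$
- From $R = 0.50$, requires at least ~40 steps of pure decay if damage fell by 0.01 per step (0.40 / 0.01). This is a lower bound: under the multiplicative rule above, $0.50 \cdot 0.99^t \leq 0.10$ first holds at $t = 161$.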
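As a worked check, the waiting time can be derived in closed form, assuming the multiplicative decay rule holds exactly and no new violations add damage. With $R_0 = 0.50$ and the required level $R^* = 0.10$ (the symbols $R_0$ and $R^*$ are introduced here for illustration):

\[R_0 (1 - \delta_R)^t \leq R^* \quad\Longleftrightarrow\quad t \geq \frac{\ln(R^*/R_0)}{\ln(1 - \delta_R)} = \frac{\ln(0.20)}{\ln(0.99)} \approx \frac{-1.609}{-0.01005} \approx 160.1\]

so the first integer step that satisfies the bound is $t = 161$. For comparison, a constant reduction of 0.01 per step gives $(0.50 - 0.10)/0.01 = 40$ steps. Under the multiplicative rule the ceiling at the horizon would be $1 - 0.50 \cdot 0.99^{150} \approx 0.889$, so it is worth confirming the decay rule in `benchmark_envs.py` before relying on either figure.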
Trust Parameters (Recovery-Specific)
| Parameter | Symbol | Value | Note |
|---|---|---|---|
| Trust Building | λ⁺ | 0.08 | Slow (deliberate recovery) |
| Trust Erosion | λ⁻ | 0.35 | Fast (re-violation costly) |
| New Violation Damage | $\mu_R$ | 0.70 | Severe (trust fragile) |
| Reputation Decay | $\delta_R$ | 0.01 | Very slow (patience required) |
Reward Function
Standard integrated utility with moderate interdependence:
\[r_i = \pi_i + 0.55 \cdot \pi_j\]
Episode Structure
- Horizon: T = 150 steps (extended for recovery)
- Success Termination: mean(τ) ≥ 0.90 (recovery achieved)
- Failure Termination: mean(τ) < 0.05 (trust collapse)
- Truncation: t ≥ T (recovery not achieved in time)
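The episode rules above can be sketched as a single classification function. This is an illustrative helper, not the environment's code: the name `episode_status`, the outcome labels, and the order in which the checks are applied are assumptions.

```python
def episode_status(mean_trust, t, recovery_target=0.90,
                   collapse_threshold=0.05, max_steps=150):
    """Return (terminated, truncated, outcome) for the current step."""
    if mean_trust >= recovery_target:
        return True, False, "recovered"   # success termination
    if mean_trust < collapse_threshold:
        return True, False, "collapsed"   # failure termination
    if t >= max_steps:
        return False, True, "timeout"     # truncation at the horizon
    return False, False, "running"


print(episode_status(0.92, 40))
print(episode_status(0.03, 10))
print(episode_status(0.60, 150))
```

Because `terminated` is set for both success and collapse, calling code needs to inspect the trust level (or the outcome label here) to tell the two apart.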
Recovery Target
- Goal: mean(τ) ≥ 0.90
- Minimum time: ~40 steps (ceiling constraint)
- Typical successful recovery: 80-120 steps
Game-Theoretic Background
Post-Crisis Dynamics
Real-world examples:
- Business partners after contract violation
- Firms after a major scandal
- Countries after diplomatic crisis
The Recovery Challenge
After a crisis:
- Trust is low: Recent violations have eroded confidence
- Reputation is damaged: Past behavior creates a ceiling on recovery
- Recovery is slow: Trust builds slowly but can erode quickly
- Patience required: Reputation damage decays very slowly
Mathematical Constraints
The trust ceiling is:
\[\Theta_{ij} = 1 - R_{ij}\]
where $R_{ij}$ is reputation damage. If $R = 0.50$, trust cannot exceed 0.50 until reputation heals.
Environment Specification
Basic Usage
    import coopetition_gym
    import numpy as np

    # Create environment
    env = coopetition_gym.make("RecoveryRace-v0")
    obs, info = env.reset(seed=42)
    print(f"Starting trust: {info['mean_trust']:.2f}")
    print(f"Starting reputation damage: {info['mean_reputation_damage']:.2f}")
    print(f"Trust ceiling: {1 - info['mean_reputation_damage']:.2f}")
    print("Recovery target: 0.90")

    # Run recovery attempt
    recovered = False
    for step in range(150):
        # Consistent high cooperation for recovery
        actions = np.array([80.0, 80.0])  # 80% cooperation
        obs, rewards, terminated, truncated, info = env.step(actions)
        if terminated or truncated:
            # Termination also fires on trust collapse, so confirm the target was met
            recovered = info['mean_trust'] >= 0.90
            break

    if recovered:
        print(f"Recovery achieved at step {step}!")
    else:
        print(f"Final trust: {info['mean_trust']:.3f}")
        print("Recovery target not reached")
Parameters
| Parameter | Default | Description |
|---|---|---|
| max_steps | 150 | Extended horizon for recovery |
| initial_trust | 0.25 | Post-crisis low trust |
| initial_reputation_damage | 0.50 | High starting damage |
| recovery_target | 0.90 | Trust level to reach |
| render_mode | None | Rendering mode |
Initial State
Crisis Starting Point
| Metric | Value | Interpretation |
|---|---|---|
| Initial Trust | 0.25 | Very low (crisis aftermath) |
| Reputation Damage | 0.50 | High (serious past violations) |
| Trust Ceiling | 0.50 | Trust cannot exceed this until reputation heals |
The Ceiling Problem
With initial reputation damage of 0.50:
- Trust ceiling = 1 - 0.50 = 0.50
- Target of 0.90 is IMPOSSIBLE initially
- Must wait for reputation to decay
Trust Dynamics
Parameters
| Parameter | Symbol | Value | Description |
|---|---|---|---|
| Trust Building Rate | λ⁺ | 0.08 | Slow recovery |
| Trust Erosion Rate | λ⁻ | 0.35 | Re-violation is costly |
| New Violation Damage | $\mu_R$ | 0.70 | New violations add significant damage |
| Reputation Decay | $\delta_R$ | 0.01 | Very slow healing |
| Interdependence Amp. | ξ | 0.40 | Moderate amplification |
| Signal Sensitivity | κ | 1.0 | Standard sensitivity |
Key Insight: Patience
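With $\delta_R = 0.01$, reputation damage shrinks by 1% of its current value per step.
To reduce reputation damage from 0.50 to 0.10 (ceiling = 0.90):
- Need: 0.50 → 0.10 = reduction of 0.40
- Treating the decay as 0.01 (one percentage point) per step: ~40 steps minimum
- Because the decay is proportional, each step removes less than the one before, so the actual wait is longer (about 161 steps if $R(t) = 0.50 \cdot 0.99^t$ holds exactly)
- Plus trust building time
Recovery is mathematically constrained to be slow.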
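To make the pace of healing concrete, the ceiling can be traced step by step. The sketch below assumes pure multiplicative decay with no new violations; `ceiling_trajectory` and `first_step_reaching` are hypothetical helpers written for this illustration, not part of `coopetition_gym`.

```python
def ceiling_trajectory(r0=0.50, delta_r=0.01, max_steps=150):
    """Ceiling 1 - R(t) for t = 0..max_steps under multiplicative decay."""
    rep = r0
    ceilings = []
    for _ in range(max_steps + 1):
        ceilings.append(1.0 - rep)
        rep *= 1.0 - delta_r
    return ceilings


def first_step_reaching(ceilings, level):
    """Index of the first step whose ceiling is >= level, or None."""
    for t, ceiling in enumerate(ceilings):
        if ceiling >= level:
            return t
    return None


within_horizon = ceiling_trajectory()              # default 150-step horizon
extended = ceiling_trajectory(max_steps=200)       # longer window for comparison
for level in (0.60, 0.70, 0.80, 0.90):
    print(level, first_step_reaching(within_horizon, level),
          first_step_reaching(extended, level))
```

Under these assumptions the ceiling climbs quickly at first and then flattens, so the last few points of ceiling toward 0.90 take far longer than the first few.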
Termination Conditions
Success (Termination)
Episode terminates successfully when:
    if mean_trust >= recovery_target:  # 0.90
        terminated = True
        # Recovery achieved!
Time Limit (Truncation)
Episode truncates at max_steps (150) if target not reached.
Re-Violation (Termination)
If agents defect and trust collapses below 0.05:
    if mean_trust < 0.05:
        terminated = True
        # Recovery failed - relationship ended
Interdependence Structure
Symmetric Dependencies
    D = [[ 0.00, 0.55 ],
         [ 0.55, 0.00 ]]
Both agents are moderately dependent on each other, creating mutual incentive for recovery.
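The effect of this matrix on rewards can be sketched directly from the integrated utility $r_i = \pi_i + 0.55 \cdot \pi_j$. The function below is an illustration only: `integrated_rewards` is a hypothetical name, and the per-agent payoffs passed in are made-up numbers, since the value function that produces $\pi_i$ is not reproduced here.

```python
# Interdependence matrix from this section: D[i][j] is agent i's weight on agent j's payoff.
D = [[0.00, 0.55],
     [0.55, 0.00]]


def integrated_rewards(payoffs, dependence=D):
    """Compute r_i = pi_i + sum_j D[i][j] * pi_j for every agent i."""
    n = len(payoffs)
    return [
        payoffs[i] + sum(dependence[i][j] * payoffs[j] for j in range(n))
        for i in range(n)
    ]


# Hypothetical payoffs: agent 0 earns 10, agent 1 earns 20.
rewards = integrated_rewards([10.0, 20.0])
print(rewards)
```

With these example payoffs the rewards work out to 21.0 and 25.5: each agent gains more than half of its partner's payoff, which is the mutual incentive for recovery described above.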
Value Function
Parameters
| Parameter | Value | Description |
|---|---|---|
| θ | 20.0 | Standard logarithmic scale |
| γ | 0.60 | Moderate complementarity |
Metrics and Info
The info dictionary includes:
| Key | Type | Description |
|---|---|---|
| step | int | Current timestep |
| mean_trust | float | Average trust level |
| mean_reputation_damage | float | Average reputation damage |
| trust_ceiling | float | 1 - mean_reputation_damage |
| recovery_progress | float | trust / recovery_target |
| peak_trust | float | Highest trust achieved |
| recovery_step | int or None | Step when target first reached (or None) |
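Two of these fields are derived from the others, so they can be recomputed as a sanity check. The sketch below works on a plain dictionary rather than a live environment; `derived_metrics` is a hypothetical helper, `headroom` is an extra quantity introduced here for illustration, and the sample values are the documented initial state.

```python
def derived_metrics(info, recovery_target=0.90):
    """Recompute ceiling and progress from the two primary info fields."""
    ceiling = 1.0 - info["mean_reputation_damage"]
    return {
        "trust_ceiling": ceiling,
        "recovery_progress": info["mean_trust"] / recovery_target,
        "headroom": ceiling - info["mean_trust"],  # not an official info key
    }


# Documented initial state: trust 0.25, reputation damage 0.50.
sample_info = {"mean_trust": 0.25, "mean_reputation_damage": 0.50}
metrics = derived_metrics(sample_info)
print(metrics)
```

A small or zero `headroom` indicates that further trust gains are blocked by the ceiling rather than by the agents' cooperation level.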
Optimal Recovery Strategy
Phase 1: Wait for Ceiling (Steps 0-30)
- Cooperate moderately (60-70%)
- Don’t waste resources on impossible trust gains
- Let reputation decay raise the ceiling
Phase 2: Aggressive Building (Steps 30-80)
- Cooperate highly (80-90%)
- Trust can now grow toward ceiling
- Each cooperative step builds incrementally
Phase 3: Sustain (Steps 80+)
- Maintain moderate cooperation
- Prevent any trust erosion
- Coast to target as reputation heals
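The three phases above can be written as a step-indexed schedule. This is a minimal sketch under stated assumptions: `phase_cooperation` is a hypothetical helper, the action values are midpoints of the quoted ranges, the Phase 3 level is a guess at "moderate", and boundary steps (30 and 80) are assigned to the later phase.

```python
def phase_cooperation(step):
    """Map a timestep to a cooperation action in [0, 100] for the three phases."""
    if step < 30:
        return 65.0   # Phase 1: moderate cooperation (midpoint of 60-70%)
    if step < 80:
        return 85.0   # Phase 2: high cooperation (midpoint of 80-90%)
    return 65.0       # Phase 3: moderate cooperation (assumed level)


# Both agents follow the same schedule in this symmetric setting.
schedule = [(step, phase_cooperation(step)) for step in (0, 29, 30, 79, 80, 149)]
print(schedule)
```

A fixed schedule like this ignores the observed ceiling, so it serves mainly as a baseline to compare against policies that react to the current trust and reputation state.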
Example: Optimal Recovery Strategy
    import coopetition_gym
    import numpy as np

    env = coopetition_gym.make("RecoveryRace-v0")
    obs, info = env.reset(seed=42)
    recovery_achieved = False

    for step in range(150):
        # Ceiling and trust as observed before this step's action
        ceiling = 1 - info['mean_reputation_damage']
        current_trust = info['mean_trust']

        # Phase-based strategy
        if current_trust >= ceiling - 0.05:
            # Near ceiling: moderate cooperation (wait for ceiling to rise)
            coop_level = 0.6
        elif current_trust < ceiling - 0.1:
            # Well below ceiling: aggressive building
            coop_level = 0.85
        else:
            # Approaching ceiling: high cooperation
            coop_level = 0.75

        actions = np.array([100.0 * coop_level, 100.0 * coop_level])
        obs, rewards, terminated, truncated, info = env.step(actions)

        if step % 20 == 0:
            print(f"Step {step}: Trust={info['mean_trust']:.3f}, "
                  f"Ceiling={ceiling:.3f}, Progress={info['recovery_progress']:.1%}")

        if terminated or truncated:
            # Termination can mean success or trust collapse, so check the trust level
            recovery_achieved = info['mean_trust'] >= 0.90
            if recovery_achieved:
                print(f"\nRecovery achieved at step {step}!")
            break

    if not recovery_achieved:
        print(f"\nRecovery not achieved. Final trust: {info['mean_trust']:.3f}")
Research Applications
RecoveryRace-v0 is suitable for studying:
- Trust Repair: Strategies after violations
- Constrained Optimization: Planning under mathematical constraints
- Long-horizon Planning: Patience and delayed gratification
- TR-2 Dynamics: Understanding trust ceiling mechanics
- Crisis Management: Post-crisis relationship repair
Related Environments
- TrustDilemma-v0: Normal trust dynamics
- SynergySearch-v0: Another benchmark challenge
- RenaultNissan-v0: Real-world crisis recovery (crisis phase)
References
- Kim, P.H., Ferrin, D.L., Cooper, C.D., & Dirks, K.T. (2004). Removing the Shadow of Suspicion. Journal of Applied Psychology.
- Lewicki, R.J. & Bunker, B.B. (1996). Developing and Maintaining Trust in Work Relationships. Trust in Organizations.
- Pant, V. & Yu, E. (2025). Computational Foundations for Strategic Coopetition: Formalizing Trust and Reputation Dynamics. arXiv:2510.24909