
RecoveryRace-v0

Category: Benchmark Environment
Agents: 2
Difficulty: Hard
Source: coopetition_gym/envs/benchmark_envs.py


Overview

RecoveryRace-v0 models post-crisis trust recovery between two agents after a major breach. Starting from a damaged state (low trust, high reputation damage), agents must find the optimal sequence of cooperative actions to rebuild the relationship.

This environment is specifically designed to test understanding of TR-2 trust dynamics and the mathematical constraints on recovery. The key insight is that reputation damage creates a trust ceiling that limits how much trust can be recovered.

Figure: Recovery Race Trust Dynamics. Trust recovery is constrained by reputation damage. The red dashed line shows the trust ceiling (Θ = 1 - R), which rises as reputation heals. Trust cannot exceed this ceiling regardless of cooperation level.


MARL Classification

| Property | Value |
|---|---|
| Game Type | Markov Game (2-player, general-sum) with constrained state recovery |
| Cooperation Structure | Cooperative goal (joint recovery) with individual temptation to defect |
| Observability | Full (agents observe trust, reputation damage, and ceiling) |
| Communication | Implicit (through actions only) |
| Agent Symmetry | Symmetric (identical capabilities, mutual damage) |
| Reward Structure | Mixed (integrated utility + recovery bonus) |
| Action Space | Continuous: A_i = [0, 100] |
| State Dynamics | Deterministic with ceiling constraints |
| Horizon | Extended, T = 150 (longer horizon for recovery) |
| Canonical Comparison | Constrained optimization in trust repair; cf. Kim et al. (2004) “Removing the Shadow of Suspicion” |

Formal Specification

This environment is formalized as a 2-player Markov Game with constrained recovery dynamics.

Agents

N = {1, 2} (symmetric dyad starting from damaged state)

| Property | Value | Description |
|---|---|---|
| Endowment | 100.0 | Equal for both |
| Baseline | 35.0 | 35% cooperation threshold |
| Bargaining α | 0.50 | Equal surplus sharing |

State Space

S ⊆ ℝ¹⁷ (identical structure to TrustDilemma-v0)

Critical state property: Trust ceiling $\Theta_{ij} = 1 - R_{ij}$ constrains recovery.

Action Space

$A_i = [0, 100] \subset \mathbb{R}$ for each agent

Uniaxial Treatment: This environment uses the single-dimension action space characteristic of Coopetition-Gym v1.x. Competition emerges through value capture dynamics rather than explicit competitive actions.

Initial State (Post-Crisis)

| Variable | Initial Value | Interpretation |
|---|---|---|
| Trust $\tau_{ij}(0)$ | 0.25 | Very low (crisis aftermath) |
| Reputation $R_{ij}(0)$ | 0.50 | High damage (serious past violations) |
| Trust Ceiling $\Theta_{ij}(0)$ | 0.50 | Maximum recoverable trust initially |

Key constraint: The recovery target $\tau^* = 0.90$ is unreachable at the start because $\Theta_{ij}(0) = 0.50$. Agents must wait for reputation damage to decay before the ceiling rises far enough.

Transition Dynamics

Trust Update (standard TR-2):

τ_ij(t+1) = clip(τ_ij(t) + Δτ_ij, 0, Θ_ij)

Reputation Decay (slow healing):

R_ij(t+1) = R_ij(t) · (1 - δ_R)

where $\delta_R = 0.01$ (1% decay per step).

Ceiling Evolution:

Θ_ij(t) = 1 - R_ij(t)
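
The three update rules above can be combined into a single step function. The following is a minimal sketch, not the environment's implementation: the name `tr2_step` and the argument `delta_tau` are hypothetical, and the trust increment Δτ_ij is taken as an input because its formula is not given in this section. Using the current (pre-decay) ceiling in the clip is also an assumption.

```python
def tr2_step(trust, rep_damage, delta_tau, delta_r=0.01):
    """One step of the ceiling-constrained update (illustrative sketch).

    delta_tau is supplied by the caller; how it is computed from the
    agents' actions is outside the scope of this sketch.
    """
    ceiling = 1.0 - rep_damage                             # Θ(t) = 1 - R(t)
    new_trust = min(max(trust + delta_tau, 0.0), ceiling)  # clip to [0, Θ]
    new_rep_damage = rep_damage * (1.0 - delta_r)          # geometric decay
    return new_trust, new_rep_damage


# From the post-crisis state, an oversized increment is cut off at the ceiling.
trust, rep = tr2_step(trust=0.25, rep_damage=0.50, delta_tau=0.40)
print(trust, rep)
```

Here the proposed trust of 0.65 is clipped to the ceiling of 0.50, which illustrates why cooperation alone cannot speed up recovery while reputation damage is still high.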

Time to reach ceiling = 0.90:
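
Assuming the decay rule above is applied once per step and no new violations add damage, the waiting time follows from solving the geometric decay for the damage level at which the ceiling equals 0.90. This is a worked derivation from the stated parameters, not a figure taken from the environment's source:

\[
R_{ij}(t) = R_{ij}(0)\,(1-\delta_R)^t \le 0.10
\quad\Longrightarrow\quad
t \ge \frac{\ln(0.10 / 0.50)}{\ln(1 - 0.01)} = \frac{\ln 0.2}{\ln 0.99} \approx 160.1
\]

Under these assumptions the ceiling reaches 0.90 only after about 160 steps, slightly beyond the default horizon of T = 150. The environment's actual decay schedule may differ.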

Trust Parameters (Recovery-Specific)

| Parameter | Symbol | Value | Note |
|---|---|---|---|
| Trust Building | λ⁺ | 0.08 | Slow (deliberate recovery) |
| Trust Erosion | λ⁻ | 0.35 | Fast (re-violation costly) |
| New Violation Damage | $\mu_R$ | 0.70 | Severe (trust fragile) |
| Reputation Decay | $\delta_R$ | 0.01 | Very slow (patience required) |

Reward Function

Standard integrated utility with moderate interdependence:

r_i = π_i + 0.55 · π_j

Episode Structure

Recovery Target


Game-Theoretic Background

Post-Crisis Dynamics

Real-world examples:

The Recovery Challenge

After a crisis:

  1. Trust is low: Recent violations have eroded confidence
  2. Reputation is damaged: Past behavior creates a ceiling on recovery
  3. Recovery is slow: Trust builds slowly but can erode quickly
  4. Patience required: Reputation damage decays very slowly

Mathematical Constraints

The trust ceiling is:

\[\Theta_{ij} = 1 - R_{ij}\]

Where $R_{ij}$ is reputation damage. If $R = 0.50$, trust cannot exceed 0.50 until reputation heals.


Environment Specification

Basic Usage

import coopetition_gym
import numpy as np

# Create the environment and reset to the post-crisis state
env = coopetition_gym.make("RecoveryRace-v0")
obs, info = env.reset(seed=42)

print(f"Starting trust: {info['mean_trust']:.2f}")
print(f"Starting reputation damage: {info['mean_reputation_damage']:.2f}")
print(f"Trust ceiling: {1 - info['mean_reputation_damage']:.2f}")
print("Recovery target: 0.90")

# Run recovery attempt
recovered = False
for step in range(150):
    # Consistent high cooperation for recovery
    actions = np.array([80.0, 80.0])  # 80% cooperation
    obs, rewards, terminated, truncated, info = env.step(actions)

    # terminated is also set when trust collapses, so check the trust level
    if terminated and info['mean_trust'] >= 0.90:
        print(f"Recovery achieved at step {step}!")
        recovered = True
        break
    if terminated or truncated:
        break

if not recovered:
    print(f"Final trust: {info['mean_trust']:.3f}")
    print("Recovery target not reached")

Parameters

| Parameter | Default | Description |
|---|---|---|
| max_steps | 150 | Extended horizon for recovery |
| initial_trust | 0.25 | Post-crisis low trust |
| initial_reputation_damage | 0.50 | High starting damage |
| recovery_target | 0.90 | Trust level to reach |
| render_mode | None | Rendering mode |

Initial State

Crisis Starting Point

| Metric | Value | Interpretation |
|---|---|---|
| Initial Trust | 0.25 | Very low (crisis aftermath) |
| Reputation Damage | 0.50 | High (serious past violations) |
| Trust Ceiling | 0.50 | Cannot be exceeded until reputation heals |

The Ceiling Problem

With initial reputation damage of 0.50:


Trust Dynamics

Parameters

| Parameter | Symbol | Value | Description |
|---|---|---|---|
| Trust Building Rate | λ⁺ | 0.08 | Slow recovery |
| Trust Erosion Rate | λ⁻ | 0.35 | Re-violation is costly |
| Reputation Damage | $\mu_R$ | 0.70 | New violations add significant damage |
| Reputation Decay | $\delta_R$ | 0.01 | Very slow healing |
| Interdependence Amp. | ξ | 0.40 | Moderate amplification |
| Signal Sensitivity | κ | 1.0 | Standard sensitivity |

Key Insight: Patience

With $\delta_R = 0.01$, reputation damage shrinks by 1% of its current value each step, so healing is geometric rather than linear.

To reduce reputation from 0.50 to 0.10 (ceiling = 0.90):

Recovery is mathematically constrained to be slow.
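
The slow pace can be checked numerically. The sketch below iterates only the reputation-decay rule with the parameters from the table above and reports the ceiling at a few steps. It assumes decay is applied once per step and that no new violations occur; the helper name `ceiling_after` is hypothetical and not part of coopetition_gym.

```python
def ceiling_after(steps, rep_damage=0.50, delta_r=0.01):
    """Trust ceiling after `steps` applications of R <- R * (1 - delta_r)."""
    for _ in range(steps):
        rep_damage *= 1.0 - delta_r
    return 1.0 - rep_damage


for t in (0, 30, 80, 150):
    print(f"step {t:3d}: ceiling = {ceiling_after(t):.3f}")
```

With these defaults the ceiling rises from 0.50 to roughly 0.63 by step 30 and is still just below 0.90 at step 150, so under pure decay the ceiling itself is the binding constraint for most of the episode.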


Termination Conditions

Success (Termination)

Episode terminates successfully when:

if mean_trust >= recovery_target:  # 0.90
    terminated = True
    # Recovery achieved!

Time Limit (Truncation)

The episode is truncated at max_steps (150) if the target has not been reached.

Re-Violation (Termination)

If agents defect and trust collapses below 0.05:

if mean_trust < 0.05:
    terminated = True
    # Recovery failed - relationship ended
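
Because `terminated` is set both on success and on collapse, calling code needs to inspect the trust level to tell the two apart. The three conditions above can be sketched as one function. The name `episode_status`, the reason strings, the order of the checks, and the reading of `step` as the number of completed steps are all assumptions made for illustration.

```python
def episode_status(mean_trust, step, recovery_target=0.90,
                   collapse_threshold=0.05, max_steps=150):
    """Classify the episode using the conditions described above.

    Returns a (terminated, truncated, reason) tuple.
    """
    if mean_trust >= recovery_target:
        return True, False, "recovered"    # success termination
    if mean_trust < collapse_threshold:
        return True, False, "collapsed"    # re-violation termination
    if step >= max_steps:
        return False, True, "time_limit"   # truncation at the horizon
    return False, False, "running"


print(episode_status(mean_trust=0.92, step=40))
print(episode_status(mean_trust=0.03, step=40))
```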

Interdependence Structure

Symmetric Dependencies

D = [[ 0.00,  0.55 ],
     [ 0.55,  0.00 ]]

Both agents are moderately dependent on each other, creating mutual incentive for recovery.
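
With this matrix, the integrated-utility reward r_i = π_i + 0.55 · π_j is each agent's own payoff plus the dependency-weighted payoff of the other agent. A minimal sketch, assuming the per-agent payoffs π are already computed; the function name and the payoff values below are placeholders, not outputs of the environment:

```python
# Interdependence matrix from the section above.
D = [[0.00, 0.55],
     [0.55, 0.00]]


def integrated_utility(payoffs, dependence=D):
    """Return r_i = pi_i + sum_j D[i][j] * pi_j for every agent i."""
    n = len(payoffs)
    return [
        payoffs[i] + sum(dependence[i][j] * payoffs[j] for j in range(n))
        for i in range(n)
    ]


rewards = integrated_utility([10.0, 20.0])  # placeholder payoffs
print(rewards)
```

The zero diagonal keeps each agent's own payoff from being counted twice.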


Value Function

Parameters

| Parameter | Value | Description |
|---|---|---|
| θ | 20.0 | Standard logarithmic scale |
| γ | 0.60 | Moderate complementarity |

Metrics and Info

The info dictionary includes:

| Key | Type | Description |
|---|---|---|
| step | int | Current timestep |
| mean_trust | float | Average trust level |
| mean_reputation_damage | float | Average reputation damage |
| trust_ceiling | float | 1 - mean_reputation_damage |
| recovery_progress | float | trust / recovery_target |
| peak_trust | float | Highest trust achieved |
| recovery_step | int or None | Step when target first reached (or None) |
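
Two of these entries are simple functions of the others, which is useful when checking logged values. The sketch below recomputes them from the definitions in the table. The function name `derived_metrics` and the extra `headroom` field (ceiling minus trust) are illustrative additions, not keys of the info dictionary.

```python
def derived_metrics(mean_trust, mean_reputation_damage, recovery_target=0.90):
    """Recompute the derived info entries from their table definitions."""
    trust_ceiling = 1.0 - mean_reputation_damage
    return {
        "trust_ceiling": trust_ceiling,
        "recovery_progress": mean_trust / recovery_target,
        # Hypothetical extra: how far trust can still rise right now.
        "headroom": trust_ceiling - mean_trust,
    }


metrics = derived_metrics(mean_trust=0.25, mean_reputation_damage=0.50)
print(metrics)
```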

Optimal Recovery Strategy

Phase 1: Wait for Ceiling (Steps 0-30)

Phase 2: Aggressive Building (Steps 30-80)

Phase 3: Sustain (Steps 80+)


Example: Optimal Recovery Strategy

import coopetition_gym
import numpy as np

env = coopetition_gym.make("RecoveryRace-v0")
obs, info = env.reset(seed=42)

recovery_achieved = False

for step in range(150):
    # Ceiling and trust as observed before this step's action
    ceiling = 1 - info['mean_reputation_damage']
    current_trust = info['mean_trust']

    # Phase-based strategy
    if current_trust >= ceiling - 0.05:
        # Near ceiling: moderate cooperation (wait for ceiling to rise)
        coop_level = 0.6
    elif current_trust < ceiling - 0.1:
        # Below ceiling: aggressive building
        coop_level = 0.85
    else:
        # Approaching ceiling: high cooperation
        coop_level = 0.75

    actions = np.array([100.0 * coop_level, 100.0 * coop_level])
    obs, rewards, terminated, truncated, info = env.step(actions)

    if step % 20 == 0:
        print(f"Step {step}: Trust={info['mean_trust']:.3f}, "
              f"Ceiling={ceiling:.3f}, Progress={info['recovery_progress']:.1%}")

    if terminated and info['mean_trust'] >= 0.90:
        print(f"\nRecovery achieved at step {step}!")
        recovery_achieved = True
        break
    if terminated or truncated:
        # Trust collapsed or the time limit was hit: stop stepping
        break

if not recovery_achieved:
    print(f"\nRecovery not achieved. Final trust: {info['mean_trust']:.3f}")
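
The phase rule used in the loop above can be factored into a pure function, which makes the thresholds easy to unit-test without running the environment. This is a sketch: the name `choose_cooperation` and the keyword arguments are hypothetical, and the default thresholds are simply those of the example.

```python
def choose_cooperation(trust, ceiling, near_margin=0.05, far_margin=0.10):
    """Return a cooperation level in [0, 1] based on distance to the ceiling."""
    if trust >= ceiling - near_margin:
        return 0.60   # near ceiling: hold steady while the ceiling rises
    if trust < ceiling - far_margin:
        return 0.85   # well below ceiling: build trust aggressively
    return 0.75       # approaching ceiling: high cooperation


# Post-crisis start: trust 0.25 under a ceiling of 0.50.
level = choose_cooperation(trust=0.25, ceiling=0.50)
print(f"action per agent: {100.0 * level:.1f}")
```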

Research Applications

RecoveryRace-v0 is suitable for studying:



References

  1. Kim, P.H., Ferrin, D.L., Cooper, C.D., & Dirks, K.T. (2004). Removing the Shadow of Suspicion. Journal of Applied Psychology.
  2. Lewicki, R.J. & Bunker, B.B. (1996). Developing and Maintaining Trust in Work Relationships. Trust in Organizations.
  3. Pant, V. & Yu, E. (2025). Computational Foundations for Strategic Coopetition: Formalizing Trust and Reputation Dynamics. arXiv:2510.24909