Guidance on num_simulations, max_depth, and large-branching setups for MAPF in MCTX #108

@DuoZhangRobotics

Hi—thanks for the fantastic library!

I’m using MCTX (Gumbel MuZero search) for multi-agent path finding (MAPF) on grids. Each agent has 5 actions (UP/DOWN/LEFT/RIGHT/STAY), so with $N$ agents the joint action space has $5^N$ actions:

  • 2 agents → 25 actions
  • 3 agents → 125 actions
  • 4 agents → 625 actions
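For concreteness, here is a minimal sketch of how a flat joint action index can be mapped to per-agent actions and back. The base-5 encoding, the ordering of `ACTIONS`, and the helper names are assumptions made for illustration; any bijection between indices and per-agent action tuples would work equally well.

```python
# Illustrative encoding (an assumption, not part of MCTX): a joint action is a
# base-5 number whose least-significant digit belongs to agent 0.
ACTIONS = ("UP", "DOWN", "LEFT", "RIGHT", "STAY")


def decode_joint_action(index, num_agents):
    """Split a joint action index into one action name per agent."""
    per_agent = []
    for _ in range(num_agents):
        index, digit = divmod(index, len(ACTIONS))
        per_agent.append(ACTIONS[digit])
    return tuple(per_agent)


def encode_joint_action(per_agent):
    """Inverse of decode_joint_action."""
    index = 0
    for name in reversed(per_agent):
        index = index * len(ACTIONS) + ACTIONS.index(name)
    return index


# Joint action space sizes for 2, 3, and 4 agents.
joint_sizes = {n: len(ACTIONS) ** n for n in (2, 3, 4)}
print(joint_sizes)
print(decode_joint_action(7, num_agents=2))
```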

I don’t have a policy-value network yet; I’m using Gumbel MuZero (GMZ) purely as a planner, with uniform priors and either value=0 or a light heuristic value. Horizons can be long on large maps.

Current settings

  • num_simulations: 10k–20k
  • max_depth: 15–30
  • max_num_considered_actions: 125

Observation
Despite the large simulation budget, plans are often suboptimal compared to a human baseline.
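A rough back-of-the-envelope calculation puts the budget in perspective. The sketch below assumes each simulation adds exactly one node and that the tree is expanded uniformly, level by level. That is a worst-case simplification, not how Gumbel MuZero actually allocates visits, and `full_width_depth` is a hypothetical helper written for this illustration.

```python
def full_width_depth(branching, num_simulations):
    """Deepest level that can be fully expanded, at one new node per simulation."""
    depth = 0
    nodes_at_level = 1
    total = 0
    while True:
        nodes_at_level *= branching  # nodes needed to fill the next level
        if total + nodes_at_level > num_simulations:
            return depth
        total += nodes_at_level
        depth += 1


for branching in (25, 125, 625):
    for sims in (10_000, 20_000):
        depth = full_width_depth(branching, sims)
        print(f"branching={branching:4d} sims={sims:6d} full-width depth={depth}")
```

Under these assumptions, 125 joint actions and 10k to 20k simulations cover only one to two complete levels, far short of a max_depth of 15 to 30. With uniform priors and zero values there is also little signal to focus the search more narrowly.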

Questions

  1. Any recommended rules of thumb for choosing num_simulations vs. max_depth as the branching factor explodes?
  2. For joint action spaces, is there guidance on max_num_considered_actions (consider all joint actions vs. subsample)?
  3. Suggested qtransform settings (e.g., value_scale, maxvisit_init, use_mixed_value, rescale_values) when values are zero/heuristic rather than learned?
  4. With uniform priors, should I keep a nonzero gumbel_scale to break ties, or is a deterministic setting preferable here?
