Add camera to record grpo training #156

Merged
yangchen73 merged 11 commits into main from yc/grpo
Mar 2, 2026

@yangchen73
Collaborator

Description

Add camera to record grpo training

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist

  • I have run the black . command to format the code base.
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • Dependencies have been updated, if applicable.

Copilot AI review requested due to automatic review settings March 2, 2026 05:36
@yangchen73 yangchen73 merged commit 451dffc into main Mar 2, 2026
4 checks passed
@yangchen73 yangchen73 deleted the yc/grpo branch March 2, 2026 05:37
@yangchen73 yangchen73 changed the title from "Yc/grpo" to "Add camera to record grpo training" Mar 2, 2026

Copilot AI left a comment

Pull request overview

Adds camera recording configuration to the CartPole GRPO training config so evaluation runs can generate videos (via the trainer event system).

Changes:

  • Switch CartPole GRPO training config device to cuda:0.
  • Add trainer.events.eval.record_camera using record_camera_data_async to save videos to ./outputs/videos/eval.
  • Adjust GRPO ent_coef from 0.001 to 0.01.
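
A scope check like the one in this overview can be approximated by diffing the old and new configs key by key. The following is only a sketch: the two dictionaries are trimmed, hypothetical stand-ins for the real file, the previous device string "cpu" and the placement of "device" under "trainer" are assumptions, and flatten and diff_configs are illustrative helper names, not part of the repository.

```python
from typing import Any, Dict, Tuple


def flatten(cfg: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten a nested dict into dotted-path -> leaf-value pairs."""
    flat: Dict[str, Any] = {}
    for key, value in cfg.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_configs(old: Dict[str, Any], new: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {dotted_path: (old_value, new_value)} for every changed leaf.

    A leaf that exists on only one side is reported with None on the other.
    """
    old_flat, new_flat = flatten(old), flatten(new)
    changes: Dict[str, Tuple[Any, Any]] = {}
    for path in sorted(set(old_flat) | set(new_flat)):
        before, after = old_flat.get(path), new_flat.get(path)
        if before != after:
            changes[path] = (before, after)
    return changes


# Hypothetical, trimmed stand-ins for the config before and after the PR.
OLD = {
    "trainer": {"device": "cpu", "events": {"eval": {}}},
    "algorithm": {"cfg": {"ent_coef": 0.001}},
}
NEW = {
    "trainer": {
        "device": "cuda:0",
        "events": {"eval": {"record_camera": {"func": "record_camera_data_async"}}},
    },
    "algorithm": {"cfg": {"ent_coef": 0.01}},
}

changes = diff_configs(OLD, NEW)
for path, (before, after) in changes.items():
    print(f"{path}: {before!r} -> {after!r}")
```

Any path in the output that is unrelated to camera recording is a candidate for a separate PR or an explicit mention in the description.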
Comments suppressed due to low confidence (3)

configs/agents/rl/basic/cart_pole/train_config_grpo.json:7

  • PR description says this change is about adding camera recording, but this config also changes the default device from CPU to CUDA. If this is intentional, please update the PR description (or split into a separate PR); otherwise revert to the previous device to keep the PR scoped to camera recording.
        "seed": 42,
        "device": "cuda:0",
        "headless": true,
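
If the CUDA default is kept, one way to avoid breaking CPU-only machines is to fall back when no GPU is present. This is a minimal sketch, assuming the project runs on PyTorch; resolve_device and detect_cuda are hypothetical helpers and do not reflect how the trainer actually reads the "device" field.

```python
def resolve_device(requested: str, cuda_available: bool) -> str:
    """Return the requested device, falling back to "cpu" when CUDA is missing."""
    if requested.startswith("cuda") and not cuda_available:
        return "cpu"
    return requested


def detect_cuda() -> bool:
    """Report CUDA availability; treat a missing PyTorch install as no CUDA."""
    try:
        import torch  # assumption: the training stack is PyTorch-based
    except ImportError:
        return False
    return torch.cuda.is_available()


# Example: a config that asks for "cuda:0" on a machine without a GPU.
print(resolve_device("cuda:0", cuda_available=False))
```

Passing the availability flag explicitly keeps the decision testable without a GPU; in real use it would come from detect_cuda().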

configs/agents/rl/basic/cart_pole/train_config_grpo.json:23

  • The new camera recording is configured under trainer.events.eval, so it will only run during evaluation episodes (per Trainer._eval_once), not during training rollouts. If the intent is to "record grpo training", consider moving/duplicating this under trainer.events.train; otherwise please clarify the PR description to explicitly say it records evaluation.
        "events": {
            "eval": {
                "record_camera": {
                    "func": "record_camera_data_async",
                    "mode": "interval",
                    "interval_step": 1,
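
If recording during training rollouts is the intent, the eval event could be duplicated under trainer.events.train as suggested above. The sketch below does this on an in-memory copy of the config; the embedded JSON is a trimmed stand-in for the real file, mirror_eval_event is a hypothetical helper, and whether the trainer accepts the same event definition under "train" is an assumption that would need checking.

```python
import copy
import json
from typing import Any, Dict


def mirror_eval_event(config: Dict[str, Any], event_name: str) -> Dict[str, Any]:
    """Return a copy of config with one eval event duplicated under events.train."""
    updated = copy.deepcopy(config)
    events = updated["trainer"]["events"]
    # Create the "train" section if it does not exist yet, then copy the event.
    events.setdefault("train", {})[event_name] = copy.deepcopy(events["eval"][event_name])
    return updated


# Trimmed stand-in for train_config_grpo.json, using only fields shown in the review.
config = json.loads(
    """
    {
        "trainer": {
            "events": {
                "eval": {
                    "record_camera": {
                        "func": "record_camera_data_async",
                        "mode": "interval",
                        "interval_step": 1
                    }
                }
            }
        }
    }
    """
)

updated = mirror_eval_event(config, "record_camera")
print(json.dumps(updated["trainer"]["events"], indent=4))
```

The original dictionary is left untouched, so the eval-only behavior remains available for comparison.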

configs/agents/rl/basic/cart_pole/train_config_grpo.json:56

  • PR description focuses on camera recording, but this change also increases algorithm.cfg.ent_coef (0.001 → 0.01), which will materially change GRPO behavior. If this tuning is intended, please document it in the PR description (or split it into a separate PR); otherwise revert so the PR stays scoped to the camera feature.
            "gamma": 0.99,
            "clip_coef": 0.2,
            "ent_coef": 0.01,
            "kl_coef": 0.0,
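
To make the size of this change concrete, the sketch below computes the entropy bonus for a toy policy at both coefficient values. It assumes the common PPO-style convention in which the entropy term is subtracted from the policy loss after scaling by ent_coef; the exact loss used by this repository's GRPO implementation is not shown in the PR, and the two-action distribution and the policy loss of 1.0 are placeholders chosen only for illustration.

```python
import math
from typing import Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def total_loss(policy_loss: float, probs: Sequence[float], ent_coef: float) -> float:
    """PPO-style objective: policy loss minus the scaled entropy bonus."""
    return policy_loss - ent_coef * entropy(probs)


probs = [0.5, 0.5]   # toy two-action policy, illustration only
policy_loss = 1.0    # placeholder value

loss_low = total_loss(policy_loss, probs, ent_coef=0.001)
loss_high = total_loss(policy_loss, probs, ent_coef=0.01)

print(f"entropy bonus at 0.001: {policy_loss - loss_low:.6f}")
print(f"entropy bonus at 0.01:  {policy_loss - loss_high:.6f}")
```

Under this assumed loss form, the tenfold larger coefficient gives the entropy term ten times the weight in the gradient, which pushes the policy toward more exploration.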
