Official implementation of the paper "Skeleton2Stage: Reward-Guided Fine-Tuning for Physically Plausible Dance Generation."
Existing dance generation models typically operate on sparse skeletons and often overlook the geometric constraints of the human body, leading to artifacts such as body interpenetration and unstable foot-ground contact when visualized with full-body meshes.
In this work, we identify this critical gap between skeleton-level motion generation and mesh-level body visualization, which we refer to as the skeleton-to-mesh gap. To bridge it, we propose Skeleton2Stage, a reward-guided fine-tuning framework that distills physics-based motion priors from a physics simulator, together with heuristic constraints, into the generative model.
Experiments show that Skeleton2Stage improves the physical plausibility of generated dances and reduces common artifacts when visualized with full-body meshes.
- [February 2, 2026] Training code for the imitation policy released.
- [January 30, 2026] Training and evaluation code for EDGE released.
- Support training on GPUs newer than A100.
- Installation guidance.
- Release training imitation policy code.
- Release training code.
- Release evaluation code.
- Release rendering code.
- Release guidance for custom models and rewards.
Skeleton2Stage is a framework that improves the physical plausibility of diffusion-based dance generation.
Most existing dance generation models operate purely in skeleton space. However, when the generated motions are visualized with full-body meshes, they often violate geometric constraints of the human body, resulting in artifacts such as body interpenetration and unstable foot-ground contact.
To address this problem, Skeleton2Stage leverages a physics-based humanoid controller as a physical plausibility evaluator. The evaluator provides feedback on whether generated motions satisfy physical constraints, especially those arising from human body geometry. Together with complementary reward signals, this feedback encourages the generative model to internalize physics-aware motion priors via reinforcement-learning fine-tuning (RLFT), producing motions that remain physically plausible when visualized with a human body mesh.
All evaluation is done using the mean SMPL body shape.
To create the environment, follow these instructions:
- Clone the project:

  ```bash
  git clone https://github.com/jjd1123/Skeleton2Stage.git
  ```

- Create a new conda environment and install PyTorch:

  ```bash
  conda create -n isaac python=3.8
  conda activate isaac
  pip install -r requirements.txt
  ```
- Download and set up Isaac Gym.

- Download MuJoCo version 2.1 for Linux.

- Install torch-mesh-isect for body penetration rate evaluation.

- Configure your paths in `environment.sh` (a sketch is given after this list). For a cleaner project layout, you can place Isaac Gym, MuJoCo, and torch-mesh-isect under the `environment/` directory.
- This repository additionally depends on the following libraries, which may require special installation procedures:
  - jukemirlib
  - pytorch3d
  - accelerate
  - Note: after installation, don't forget to run `accelerate config`. We use fp16.
- Place the SMPL files under `body_models/` as follows:

  ```
  body_models/
  ├── README.md                 # This guide file
  ├── smpl/
  │   ├── J_regressor_extra.npy
  │   ├── kintree_table.pkl
  │   ├── smplfaces.npy
  │   ├── SMPL_FEMALE.pkl
  │   ├── SMPL_MALE.pkl
  │   └── SMPL_NEUTRAL.pkl
  ├── smplh/
  │   ├── female/
  │   │   └── model.npz
  │   ├── male/
  │   │   └── model.npz
  │   └── neutral/
  │       └── model.npz
  └── smplx/
      ├── female/
      │   └── model.npz
      ├── male/
      │   └── model.npz
      └── neutral/
          └── model.npz
  ```
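A minimal sketch of what `environment.sh` could export, assuming the repository's scripts read dependency locations from environment variables. `ISAAC_GYM_PATH` is a placeholder name, while `MUJOCO_PY_MUJOCO_PATH` and `LD_LIBRARY_PATH` follow common mujoco-py conventions; match the names and paths to what the scripts actually reference:

```bash
# Hypothetical environment.sh -- adjust variable names and paths to your setup.
export ISAAC_GYM_PATH=$PWD/environment/isaacgym          # Isaac Gym location (placeholder name)
export MUJOCO_PY_MUJOCO_PATH=$PWD/environment/mujoco210  # MuJoCo 2.1 location (mujoco-py convention)
export LD_LIBRARY_PATH=$MUJOCO_PY_MUJOCO_PATH/bin:$LD_LIBRARY_PATH
```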
Before evaluation, make sure you have:
(1) the correct settings in the metric computation scripts, and
(2) the correct model in Line 56 of EDGE.py.

```bash
cd code/rl_finetune
bash eval.sh exp_name epoch_num motion_save_root ckpt_root cached_music_features
```
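A usage sketch with purely illustrative arguments (the experiment name, epoch number, and paths below are placeholders, not values from the repository):

```bash
# Evaluate a hypothetical experiment "edge_finetune" at epoch 2000 (all names and paths are illustrative).
bash eval.sh edge_finetune 2000 ./outputs/motions ./outputs/checkpoints ./data/cached_music_features
```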
To fine-tune the generative model, you need the following:
- A base checkpoint of the generative model;
- Conditioning data for training-time sampling;
- A checkpoint of the trained imitation policy.
In this section, we provide data (preprocessed data and a pretrained imitation policy) for a minimal example: fine-tuning EDGE on AIST++. You can directly run the following scripts:
```bash
cd code/rl_finetune
# Download the EDGE checkpoint.
bash download_model.sh
# Download preprocessed data of AIST++ and the pretrained imitation policy.
bash download_data.sh
```

- Note: We use DVC for dataset version control. Please follow the README in the downloaded data directory for a quick setup.
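If the downloaded data directory is set up as a standard DVC repository (an assumption based on the note above; the actual remote configuration may differ, so defer to that README), fetching the tracked files typically looks like:

```bash
pip install dvc                  # install DVC if it is not already available
cd <downloaded_data_directory>   # placeholder path for the downloaded data
dvc pull                         # fetch the files tracked by DVC from the configured remote
```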
We will also explain how to prepare your own datasets and pre-trained models below.
- You can follow the instructions from:
- Follow the instructions in Training Imitation Policy.
- Coming soon!
- Prepare the Expert Dataset

  First, prepare the expert dataset for imitation policy training by following the guide in the Vid2player3d README.

- Run the Training Script

  Once the dataset is ready, start the training by executing the `train.sh` script.

  ```bash
  cd code/pretrain/vid2player3d
  bash train.sh PATH_TO_VID2PLAYER3D
  ```

  - Customization: You can modify the training strategy by changing the configuration file and the execution order within `train.sh`.
Before fine-tuning:
(1) Change the weights of the different rewards in reward.yaml.
(2) Set the correct model for fine-tuning in Line 56 of EDGE.py.

```bash
cd code/rl_finetune
bash run.sh exp_name gpu_parallel_num epoch_num batch_size
```
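As with evaluation, a usage sketch with purely illustrative values (the experiment name, GPU count, epoch count, and batch size below are placeholders, not recommended settings):

```bash
# Fine-tune a hypothetical experiment "edge_finetune" on 4 GPUs for 2000 epochs with batch size 128.
bash run.sh edge_finetune 4 2000 128
```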
If you find this work useful for your research, please cite our paper:
```bibtex
@misc{jia2026skeleton2stagerewardguidedfinetuningphysically,
      title={Skeleton2Stage: Reward-Guided Fine-Tuning for Physically Plausible Dance Generation},
      author={Jidong Jia and Youjian Zhang and Huan Fu and Dacheng Tao},
      year={2026},
      eprint={2602.13778},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2602.13778},
}
```
This repository is built on top of the following amazing repositories:
- Main code framework is from: EDGE
- Imitation policy is from: vid2player3d
- SMPL models and layer are from: SMPL-X model
- README template is from: PHC
Please follow the licenses of the above repositories for usage.

