Question about baselines of RLVMR (ReAct implementation and evaluation code)

Hello, thank you for the great work on RLVMR. I'm particularly interested in the ReAct results reported in the experiments (Qwen-1.5B/7B ReAct).

I noticed that your experimental setup and evaluation environment likely differ from the official ReAct implementation. Could you please clarify whether the code for the prompting-only ReAct baseline (without any fine-tuning) and its evaluation pipeline are included in this repository?

Question about baselines of RLVMR (ReAct implementation and evaluation code) #24