This repository contains the training dataset for the paper "VeriRole: Verifiable Role-Awareness through Hint-Guided Reinforcement Learning".
The dataset focuses on enhancing Role-Playing Conversational Agents by introducing a structured "Hint-Guided" reasoning process through reinforcement learning.
The dataset is provided in a JSON list format. Each entry represents a dialogue turn with the following structure:
| Field | Type | Description |
|---|---|---|
| problem | Object | Contains the input prompt configuration. |
| problem.system_prompt | String | The detailed instruction set, including the character profile, constraints, and the requirement to use `<hint>` and `<think>` tags. |
| problem.history | List | The dialogue history (previous turns between user and assistant). |
| hint | Dict | The ground-truth hints; each key identifies the source of the hint. |
| data_type | String | The category of the sample (e.g., haiguitang, raiden). |
| extra_info | Dict | Additional metadata, including the keywords used for the accuracy reward in Raiden samples and the verification prompt used for the accuracy reward in situation puzzle samples. |
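
The structure above can be checked with a short script before training. The following is a minimal sketch, not part of the official tooling: the helper names (`load_entries`, `missing_fields`, `count_by_type`) are hypothetical, the placeholder entry is illustrative rather than real data, and the path passed to `load_entries` is whatever JSON file you downloaded from this repository.

```python
import json
from collections import Counter
from typing import Any

# Top-level and nested fields listed in the table above.
TOP_LEVEL_FIELDS = ("problem", "hint", "data_type", "extra_info")
PROBLEM_FIELDS = ("system_prompt", "history")


def load_entries(path: str) -> list[dict[str, Any]]:
    """Load the dataset file, which is expected to hold one JSON list."""
    with open(path, encoding="utf-8") as f:
        entries = json.load(f)
    if not isinstance(entries, list):
        raise ValueError("expected a top-level JSON list of entries")
    return entries


def missing_fields(entry: dict[str, Any]) -> list[str]:
    """Return the documented fields that are absent from one entry."""
    missing = [key for key in TOP_LEVEL_FIELDS if key not in entry]
    problem = entry.get("problem")
    if isinstance(problem, dict):
        missing += [f"problem.{key}" for key in PROBLEM_FIELDS if key not in problem]
    return missing


def count_by_type(entries: list[dict[str, Any]]) -> Counter:
    """Count entries per data_type (e.g. haiguitang, raiden)."""
    return Counter(entry.get("data_type") for entry in entries)


# Illustrative placeholder only; real entries carry full prompts and hints.
example_entry = {
    "problem": {"system_prompt": "<placeholder>", "history": []},
    "hint": {},
    "data_type": "raiden",
    "extra_info": {},
}
```

Calling `missing_fields` on each loaded entry and `count_by_type` on the full list gives a quick sanity check of the file. The sketch makes no assumption about the inner layout of `history`, `hint`, or `extra_info`, since the table does not specify it.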