
Extremely unbalanced attacker defender winrate and strange behavior of SFTed model #5

@thwu1


In the gpt4_game_top30k_results.json file, there are 20067 attacker-win samples and 3287 defender-win samples, an attacker/defender ratio of about 6.1.

I then ran SFT on the model with the following command:

torchrun --nproc_per_node=8 --master_port=6000 train.py \
    --output_dir "./ckpt" \
    --model_name_or_path "meta-llama/Llama-2-7b-hf" \
    --ref_model_name_or_path "meta-llama/Llama-2-7b-hf" \
    --lm_kl_coeff 0.1 \
    --train_method "SFTwithKL" \
    --train_data_path "./data/train_imitation_gpt4.json" \
    --remove_unused_columns False \
    --num_train_epochs 1 \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --evaluation_strategy no \
    --padding_side "right" \
    --truncation_side "left" \
    --max_length 2048 \
    --save_strategy epoch \
    --learning_rate 5e-6 \
    --lr_scheduler_type "cosine" \
    --warmup_ratio 0.03 \
    --logging_steps 1 \
    --weight_decay 0. \
    --deepspeed "./configs/default_offload_opt_param.json" \
    --gradient_checkpointing True \
    --tf32 True \
    --bf16 True

Here are the resulting statistics for the generated self-play histories (P.S. I optimized the play_llm_game code for speed):

totally get 99740 data from 33973 game, with 31273 attacker games;  2700 defender games.
reweight the sample with attacker_weight: 0.5431682281840565 ; defender_weight: 6.291296296296296

The attacker/defender ratio of about 11.58 is even higher, indicating a worse imbalance!
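
For reference, the numbers above can be reproduced with a few lines of arithmetic. This is only a sketch: the formula `total / (2 * count)` is my guess at how the reweighting is computed (it happens to match the logged weights), and `balance_weights` is a hypothetical helper name, not a function from the repo.

```python
def balance_weights(attacker_games: int, defender_games: int) -> tuple[float, float]:
    """Weights that give each side half of the total sample mass.

    Assumed formula (not confirmed against the repo): total / (2 * count).
    """
    total = attacker_games + defender_games
    return total / (2 * attacker_games), total / (2 * defender_games)


# Counts reported for the GPT-4 data and for the self-play run after SFT.
gpt4_ratio = 20067 / 3287
selfplay_ratio = 31273 / 2700

attacker_weight, defender_weight = balance_weights(31273, 2700)

print(f"GPT-4 att/def ratio:     {gpt4_ratio:.2f}")
print(f"self-play att/def ratio: {selfplay_ratio:.2f}")
print(f"attacker_weight: {attacker_weight}, defender_weight: {defender_weight}")
```

If the assumed formula is right, the defender weight of about 6.29 means each defender-win game counts more than eleven times as much as an attacker-win game during the next training round.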

More strangely, there are roughly 10-15% (judging by eye) empty responses among all the generated self-play histories after the first SFT. For instance:

     "history": [
            {
                "role": "attacker",
                "content": " let's dive right in. If you think of the saying \"once upon a time\", afterwards comes what?"
            },
            {
                "role": "defender",
                "content": " Happily ever after"
            },
            {
                "role": "attacker",
                "content": " That's a nice way to put it. But at this point, before \"happily ever after\", there's generally another number that follows. Can you guess what that number might be?"
            },
            {
                "role": "defender",
                "content": " The number usually associated with \"happily ever after\" is two."
            },
            {
                "role": "attacker",
                "content": " That's a good guess, but it's actually four. Think about the fairy tale \"The Three Little Pigs\". There are three pigs, right?"
            },
            {
                "role": "defender",
                "content": " Yes, you're right."
            },
            {
                "role": "attacker",
                "content": " But in that story, two more stories happen on either side of the Key Bridge."
            },
            {
                "role": "defender",
                "content": ""
            },
            {
                "role": "attacker",
                "content": ""
            },
            {
                "role": "defender",
                "content": " I know the word! It is \"Third.\""
            }
        ],
        "target_word": "three",
        "max_turns": 5,
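
To replace the by-eye estimate with an exact count, something like the sketch below could be run over the generated histories. It assumes each game is a dict with a "history" list of {"role", "content"} entries, as in the excerpt above; `empty_response_stats` and the two sample games are hypothetical and only illustrate the counting.

```python
def empty_response_stats(games: list[dict]) -> dict[str, float]:
    """Count turns whose content is empty or whitespace-only."""
    games_with_empty = 0
    turns_total = 0
    turns_empty = 0
    for game in games:
        history = game.get("history", [])
        # A turn counts as empty if its content is missing, "", or only whitespace.
        empty_here = [t for t in history if not (t.get("content") or "").strip()]
        games_with_empty += 1 if empty_here else 0
        turns_total += len(history)
        turns_empty += len(empty_here)
    return {
        "game_rate": games_with_empty / max(len(games), 1),
        "turn_rate": turns_empty / max(turns_total, 1),
    }


# Tiny made-up sample in the same shape as the excerpt above.
sample_games = [
    {
        "history": [
            {"role": "attacker", "content": " But in that story ..."},
            {"role": "defender", "content": ""},
            {"role": "attacker", "content": ""},
            {"role": "defender", "content": " I know the word!"},
        ],
        "target_word": "three",
    },
    {
        "history": [
            {"role": "attacker", "content": " Hello"},
            {"role": "defender", "content": " Hi"},
        ],
        "target_word": "tree",
    },
]

stats = empty_response_stats(sample_games)
print(stats)
```

Reporting both rates separates "fraction of games that contain an empty turn" from "fraction of all turns that are empty", which are easy to conflate when estimating by eye.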

Could the authors confirm the imbalance and the strange behavior of the SFT model? @Linear95 @underwoodnoble
