Pre-trained HF Aggregation: Update run_pipeline.sh #8

@jlunder00

Task 6: Pipeline Script Updates

Depends on: #7 (config files exist)

Modify: /home/jlunder/research/run_pipeline.sh

New Model Types to Add

Model Type                              Condition   Auto-Skip Stages
pretrained_text_embedding               A emb       contrastive, primary
pretrained_text_matching                A match     contrastive, primary
pretrained_noprop_embedding             B emb       contrastive, primary
pretrained_noprop_matching              B match     contrastive, primary
pretrained_tree_embedding               D emb       none
pretrained_tree_matching                D match     none
pretrained_tree_frozen_xfmr_embedding   E emb       none
pretrained_tree_frozen_xfmr_matching    E match     none
pretrained_tree_frozen_gnn_embedding    F emb       contrastive (needs --contrastive-checkpoint)
pretrained_tree_frozen_gnn_matching     F match     contrastive (needs --contrastive-checkpoint)

Changes Needed

1. get_model_family() (~line 134)

Add cases:

pretrained_text_embedding|pretrained_text_matching)
    echo "pretrained_text"
    ;;
pretrained_noprop_embedding|pretrained_noprop_matching)
    echo "pretrained_noprop"
    ;;
pretrained_tree_embedding|pretrained_tree_matching)
    echo "pretrained_tree"
    ;;
pretrained_tree_frozen_xfmr_embedding|pretrained_tree_frozen_xfmr_matching)
    echo "pretrained_tree_frozen_xfmr"
    ;;
pretrained_tree_frozen_gnn_embedding|pretrained_tree_frozen_gnn_matching)
    echo "pretrained_tree_frozen_gnn"
    ;;
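
As a cross-check on the cases above, the family name in every row is just the model type with its _embedding or _matching suffix removed. A minimal sketch of that relationship, assuming bash; derive_family is a hypothetical helper for illustration and is not a function in run_pipeline.sh:

```shell
# Hypothetical cross-check, not part of run_pipeline.sh: derive the family
# by stripping the subtype suffix with parameter expansion.
derive_family() {
    local model_type="$1"
    case "$model_type" in
        pretrained_*_embedding) echo "${model_type%_embedding}" ;;
        pretrained_*_matching)  echo "${model_type%_matching}" ;;
        *)                      echo "unknown"; return 1 ;;
    esac
}

# Print the derived family for two of the new model types.
for model_type in pretrained_text_embedding pretrained_tree_frozen_gnn_matching; do
    echo "$model_type -> $(derive_family "$model_type")"
done
```

The explicit cases listed above match the style of the existing function; the sketch is only a quick way to confirm that none of the ten names maps to the wrong family.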

2. get_model_type() (~line 198)

The existing patterns *_matching|*_matching_* and *_embedding|*_embedding_* should already handle these, since every new name ends in _matching or _embedding. Verify this against all ten new model types.
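
One way to do the verification is to run the two patterns against all ten names. The sketch below assumes bash; get_model_type_sketch is a stand-in built only from the patterns quoted above, and the returned strings "matching" and "embedding" are assumptions about what the real get_model_type() echoes:

```shell
# Stand-in for get_model_type(), built from the two patterns quoted above.
# The echoed values are assumed, not copied from run_pipeline.sh.
get_model_type_sketch() {
    case "$1" in
        *_matching|*_matching_*)   echo "matching" ;;
        *_embedding|*_embedding_*) echo "embedding" ;;
        *)                         echo "unknown" ;;
    esac
}

# Check every new family/subtype combination; return non-zero on a mismatch.
check_new_model_types() {
    local family subtype got failures=0
    for family in pretrained_text pretrained_noprop pretrained_tree \
                  pretrained_tree_frozen_xfmr pretrained_tree_frozen_gnn; do
        for subtype in embedding matching; do
            got="$(get_model_type_sketch "${family}_${subtype}")"
            if [ "$got" != "$subtype" ]; then
                echo "MISMATCH: ${family}_${subtype} -> $got"
                failures=$((failures + 1))
            fi
        done
    done
    [ "$failures" -eq 0 ]
}

if check_new_model_types; then
    echo "all new model types classified as expected"
fi
```

To test the real function instead, source the script's function definitions and swap get_model_type_sketch for get_model_type.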

3. Config selection case (~line 450)

pretrained_text_embedding|pretrained_text_matching|\
pretrained_noprop_embedding|pretrained_noprop_matching)
    CONTRASTIVE_CONFIG=""
    ;;
pretrained_tree_embedding|pretrained_tree_matching)
    CONTRASTIVE_CONFIG="pretrained_tree_${MODEL_SUBTYPE}_contrastive_config.yaml"
    ;;
pretrained_tree_frozen_xfmr_embedding|pretrained_tree_frozen_xfmr_matching)
    CONTRASTIVE_CONFIG="pretrained_tree_frozen_xfmr_${MODEL_SUBTYPE}_contrastive_config.yaml"
    ;;
pretrained_tree_frozen_gnn_embedding|pretrained_tree_frozen_gnn_matching)
    CONTRASTIVE_CONFIG=""
    ;;

And for primary/finetune config selection (~line 536):

pretrained_text_embedding|pretrained_text_matching|\
pretrained_noprop_embedding|pretrained_noprop_matching|\
pretrained_tree_embedding|pretrained_tree_matching|\
pretrained_tree_frozen_xfmr_embedding|pretrained_tree_frozen_xfmr_matching|\
pretrained_tree_frozen_gnn_embedding|pretrained_tree_frozen_gnn_matching)
    PRIMARY_CONFIG="${MODEL_TYPE}_${TARGET_TASK}_primary_config.yaml"
    FINETUNE_CONFIG="${MODEL_TYPE}_${TARGET_TASK}_finetune_config.yaml"
    ;;
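
Since this task depends on #7, it may help to confirm that the file names produced by the case above actually exist. A rough sketch, assuming bash; expected_configs is a hypothetical helper and CONFIG_DIR is an assumed location, neither taken from run_pipeline.sh:

```shell
# Hypothetical helper: print the primary and finetune config names that the
# case arm above would produce for a given model type and target task.
expected_configs() {
    local model_type="$1" target_task="$2"
    echo "${model_type}_${target_task}_primary_config.yaml"
    echo "${model_type}_${target_task}_finetune_config.yaml"
}

# CONFIG_DIR is an assumption; point it at wherever #7 placed the configs.
CONFIG_DIR="${CONFIG_DIR:-./configs}"

# Report which expected config files are present for two sample model types.
for model_type in pretrained_text_embedding pretrained_tree_frozen_gnn_matching; do
    expected_configs "$model_type" snli | while read -r cfg; do
        if [ -f "$CONFIG_DIR/$cfg" ]; then
            echo "found    $cfg"
        else
            echo "missing  $cfg"
        fi
    done
done
```

Conditions A and B auto-skip the primary stage, so a missing primary config for those model types may be harmless; that depends on whether the script reads PRIMARY_CONFIG before checking the skip list.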

4. Auto-skip logic (add after argument parsing, ~line 393)

case "$MODEL_TYPE" in
    pretrained_text_*|pretrained_noprop_*)
        SKIP_STAGES="${SKIP_STAGES:+$SKIP_STAGES,}contrastive,primary"
        ;;
    pretrained_tree_frozen_gnn_*)
        SKIP_STAGES="${SKIP_STAGES:+$SKIP_STAGES,}contrastive"
        ;;
esac
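
The ${SKIP_STAGES:+$SKIP_STAGES,} expansion adds a comma only when SKIP_STAGES is already non-empty, so stages the user passed are preserved and the list never starts with a stray comma. The sketch below wraps the same case in a function so the behavior can be checked in isolation, assuming bash. apply_auto_skip, stage_is_skipped, and checkpoint_ok are hypothetical names, and the checkpoint check is an assumption about how the Condition F requirement might be enforced:

```shell
# Same case logic as above, wrapped in a function for testing.
apply_auto_skip() {
    local model_type="$1" skip_stages="$2"
    case "$model_type" in
        pretrained_text_*|pretrained_noprop_*)
            skip_stages="${skip_stages:+$skip_stages,}contrastive,primary"
            ;;
        pretrained_tree_frozen_gnn_*)
            skip_stages="${skip_stages:+$skip_stages,}contrastive"
            ;;
    esac
    echo "$skip_stages"
}

# Return 0 if the stage appears in the comma-separated skip list.
stage_is_skipped() {
    local stage="$1" skip_stages="$2"
    case ",$skip_stages," in
        *",$stage,"*) return 0 ;;
        *)            return 1 ;;
    esac
}

# Assumed guard: frozen-GNN model types need a non-empty checkpoint path.
checkpoint_ok() {
    local model_type="$1" checkpoint="$2"
    case "$model_type" in
        pretrained_tree_frozen_gnn_*)
            [ -n "$checkpoint" ] || return 1
            ;;
    esac
    return 0
}

echo "A: $(apply_auto_skip pretrained_text_embedding "")"
echo "F: $(apply_auto_skip pretrained_tree_frozen_gnn_matching "finetune")"
echo "D: $(apply_auto_skip pretrained_tree_embedding "")"
```

Note that pretrained_tree_embedding and pretrained_tree_frozen_xfmr_* match neither arm, which is consistent with the "none" entries in the table.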

5. Documentation block (~line 10)

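Add the ten pretrained model types listed above to the usage comments at the top of the script, including the auto-skipped stages and the --contrastive-checkpoint requirement for the frozen GNN types.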

Usage Examples

# Condition A (text baseline, Phase 3 only)
./run_pipeline.sh pretrained_text_embedding snli

# Condition B (raw features, Phase 3 only)
./run_pipeline.sh pretrained_noprop_embedding snli

# Condition D (full pipeline)
./run_pipeline.sh pretrained_tree_embedding snli

# Condition E (frozen transformer, full pipeline)
./run_pipeline.sh pretrained_tree_frozen_xfmr_embedding snli

# Condition F (frozen GNN, needs existing checkpoint)
./run_pipeline.sh pretrained_tree_frozen_gnn_embedding snli contrastive \
  --contrastive-checkpoint /home/jlunder/temp_temp_storage/infonce_wikiqs_20260201_234850/checkpoints/best_model.pt

Labels: enhancement
