
[non-record] 1xH100 screening: compression + eval strategy #938

Open
numb3r33 wants to merge 1 commit into openai:main from numb3r33:1x-screening-non-record

Summary

This PR adds a non-record 1xH100 screening bundle documenting the March 26 log-backed experiment matrix we used to prioritize further compute.

What’s Included

  • B0, Q1, Q3, C1, and C2 raw screening logs
  • submission.json anchored on the checked B0 baseline result
  • a README summarizing the matrix and the main finding
  • surviving script-family snapshots for:
    • dense baseline
    • fp16-embedding family
    • 10-layer mixed-precision family

Main Finding

The main result from this screen is that pre-quant and post-quant quality diverge sharply once we push capacity or compression too hard.

  • C1 improved pre-quant BPB over the baseline, but fell behind the baseline after quantization
  • Q1 stayed near the baseline while producing a meaningfully smaller artifact
  • Q3 and C2 show that making the model smaller or easier to compress is not enough if the post-quant weight distribution becomes too fragile
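The comparison above can be sketched as a small helper that converts a mean token loss into bits per byte (BPB) and ranks runs by their post-quant score rather than their pre-quant score. This is a minimal sketch, not the scripts in this bundle: the function names, the record layout, and the run names `run_a` and `run_b` with their numbers are hypothetical placeholders and are not taken from the B0/Q1/Q3/C1/C2 logs.

```python
import math


def bits_per_byte(mean_nll_nats: float, n_tokens: int, n_bytes: int) -> float:
    """Convert a mean per-token negative log-likelihood (nats) to bits per byte."""
    total_bits = mean_nll_nats * n_tokens / math.log(2)
    return total_bits / n_bytes


def quant_gap(pre_bpb: float, post_bpb: float) -> float:
    """BPB lost to quantization; larger means a more fragile weight distribution."""
    return post_bpb - pre_bpb


def rank_by_post_quant(runs: list[dict]) -> list[dict]:
    """Order runs by the score that matters for the final artifact (lower is better)."""
    return sorted(runs, key=lambda r: r["post_bpb"])


# Hypothetical placeholder values, NOT results from this PR.
runs = [
    {"name": "run_a", "pre_bpb": 1.30, "post_bpb": 1.40},
    {"name": "run_b", "pre_bpb": 1.28, "post_bpb": 1.45},
]

for r in rank_by_post_quant(runs):
    gap = quant_gap(r["pre_bpb"], r["post_bpb"])
    print(f'{r["name"]}: post={r["post_bpb"]:.2f} gap={gap:.2f}')
```

In this made-up example `run_b` looks better before quantization but ranks second afterwards, which is the same shape of reversal described for C1.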

This points the next round of compute toward:

  • evaluation strategy, especially rerunning the sliding-window candidate whose logs were not preserved
  • compression-aware training and quantization-friendly schedules
  • only then 8xH100 validation on the top candidates
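For readers unfamiliar with the term, one common form of sliding-window evaluation can be sketched as follows. The PR does not describe the candidate's actual scheme, so the function name, the `window` and `stride` parameters, and the span layout are assumptions for illustration only. Each span says which tokens to feed the model and from which position to start scoring, so every token is scored exactly once while later tokens still see earlier context.

```python
def sliding_windows(n_tokens: int, window: int, stride: int) -> list[tuple[int, int, int]]:
    """Return (start, end, score_from) spans covering n_tokens.

    The model reads tokens[start:end]; only tokens[score_from:end] count
    toward the loss, so no token is scored twice.
    """
    if not 0 < stride <= window:
        raise ValueError("need 0 < stride <= window")
    spans = []
    scored_until = 0  # first token index not yet scored
    start = 0
    while scored_until < n_tokens:
        end = min(start + window, n_tokens)
        spans.append((start, end, scored_until))
        scored_until = end
        start += stride
    return spans


spans = sliding_windows(n_tokens=10, window=4, stride=2)
for start, end, score_from in spans:
    print(f"context [{start}, {end}) scores [{score_from}, {end})")
```

A smaller stride gives each scored token more context at the cost of more forward passes, which is why this is an evaluation-time trade-off rather than a training change.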

Notes

  • This is explicitly a non-record submission and not an 8xH100 leaderboard claim.
  • A later 1xH100 run summary suggested stronger sliding-window results, but those logs were not preserved, so they are intentionally excluded from this public bundle.
