[non-record] 1xH100 screening: compression + eval strategy #938

Open

numb3r33 wants to merge 1 commit into openai:main from numb3r33:1x-screening-non-record

numb3r33 commented Mar 27, 2026

Summary

This PR adds a non-record 1xH100 screening bundle documenting the March 26 log-backed experiment matrix we used to prioritize further compute.

What’s Included

- raw screening logs for B0, Q1, Q3, C1, and C2
- submission.json anchored on the checked B0 baseline result
- a README summarizing the matrix and the main finding
- surviving script-family snapshots for:
  - dense baseline
  - fp16-embedding family
  - 10-layer mixed-precision family

Main Finding

The main result from this screen is that pre-quant and post-quant quality diverge sharply once we push capacity or compression too hard.

- C1 improved pre-quant BPB over the baseline, but fell behind the baseline after quantization
- Q1 stayed near the baseline while producing a meaningfully smaller artifact
- Q3 and C2 show that making the model smaller or easier to compress is not enough if the post-quant weight distribution becomes too fragile
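To make the "fragile post-quant weight distribution" point concrete, here is a minimal sketch of a quantization round trip. It assumes symmetric per-row int8 quantization, which is an illustrative choice and not necessarily the scheme used in these runs; the function names and the weight values are hypothetical and do not come from the logs in this bundle.

```python
import math

def quantize_int8(row):
    """Symmetric per-row int8 (assumed scheme): scale by max |w|, round, clamp."""
    max_abs = max(abs(w) for w in row)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in row]
    return q, scale

def dequantize(q, scale):
    """Map int8 codes back to floats using the stored per-row scale."""
    return [v * scale for v in q]

def rms_error(row):
    """Root-mean-square error introduced by one quantize/dequantize round trip."""
    q, scale = quantize_int8(row)
    restored = dequantize(q, scale)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(row, restored)) / len(row))

# Made-up weights for illustration only (not taken from any run in this PR).
well_behaved = [0.01 * i for i in range(-50, 51)]   # values spread over [-0.5, 0.5]
with_outlier = well_behaved + [8.0]                 # one large weight stretches the scale

print("rms error, well behaved:", rms_error(well_behaved))
print("rms error, with outlier:", rms_error(with_outlier))
```

A single outlier widens the quantization step for the whole row, so every other weight is stored more coarsely. That is one plausible mechanism by which a model can look better before quantization and worse after it.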

This points the next round of compute toward:

- evaluation strategy, especially rerunning the missing-log sliding-window candidate
- compression-aware training and quantization-friendly schedules
- only then 8xH100 validation on the top candidates
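For readers unfamiliar with sliding-window evaluation, the sketch below shows one common way to lay out the windows: each window sees up to `context_len` tokens, but only the tokens not covered by the previous window are scored. This is an assumption about the general technique, not a description of the candidate that lost its logs; `sliding_windows`, `context_len`, and `stride` are hypothetical names.

```python
def sliding_windows(n_tokens, context_len, stride):
    """Return (begin, end, n_scored) triples covering tokens [0, n_tokens).

    Each window spans tokens [begin, end). Only the last n_scored tokens of a
    window count toward the loss, so every token is scored exactly once.
    """
    if not 0 < stride <= context_len:
        raise ValueError("stride must satisfy 0 < stride <= context_len")
    windows = []
    prev_end = 0
    for begin in range(0, n_tokens, stride):
        end = min(begin + context_len, n_tokens)
        windows.append((begin, end, end - prev_end))  # score only newly covered tokens
        prev_end = end
        if end == n_tokens:
            break
    return windows

# Toy sizes chosen for readability, not taken from the screening runs.
for begin, end, n_scored in sliding_windows(n_tokens=10, context_len=4, stride=2):
    print(f"window [{begin}, {end}) scores last {n_scored} tokens")
```

A smaller stride gives each scored token more left context at the cost of more forward passes, which is why the choice of evaluation strategy can move the reported number without any change to the model.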

Notes

This is explicitly a non-record submission and not an 8xH100 leaderboard claim.
A later 1xH100 run summary suggested stronger sliding-window results, but those logs were not preserved, so they are intentionally excluded from this public bundle.


Add 1xH100 non-record screening bundle

b17c9ba
