Skip to content

[feat] switch create_test_model to unified single-stage AOTI export#477

Open
tiankongdeguiji wants to merge 1 commit into
alibaba:masterfrom
tiankongdeguiji:unified-aot-test-model
Open

[feat] switch create_test_model to unified single-stage AOTI export#477
tiankongdeguiji wants to merge 1 commit into
alibaba:masterfrom
tiankongdeguiji:unified-aot-test-model

Conversation

@tiankongdeguiji
Copy link
Copy Markdown
Collaborator

Summary

  • Replace legacy two-stage AOT export (split_model + export_model_aot) with unified single-stage export_unified_model_aot in create_test_model, so unit tests exercise the same fused sparse+dense AOTI compilation path used in production.
  • Remove unused split_model import from test_util.py.

Test plan

  • MultiTowerDINTest.test_multi_tower_din_3 (AOT_INDUCTOR) — passed (18s, real AOTI compilation)
  • DlrmHSTUTest.test_dlrm_hstu (Hypothesis with AOT_INDUCTOR examples) — passed (190s)
  • All 4 multi_tower_din graph type variants (NORMAL, FX_TRACE, JIT_SCRIPT, AOT_INDUCTOR) — passed
  • Ruff lint — passed

🤖 Generated with Claude Code

Replace legacy two-stage AOT export (split_model + export_model_aot) with
unified export_unified_model_aot in create_test_model, so unit tests exercise
the same one-stage fused sparse+dense AOTI compilation path used in production.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@tiankongdeguiji
Copy link
Copy Markdown
Collaborator Author

Review

The change is minimal and correct: signatures align, ScriptWrapper exposes the _features / _feature_groups / set_is_inference attributes that export_unified_model_aot relies on, and removing the split_model import from test_util.py is safe (it remains in use in tzrec/utils/export_util.py:250,259). load_model_aot already auto-detects unified vs legacy via is_unified_aot_predict, so no loader change is needed.

A couple of points worth flagging:

1. Legacy two-stage AOT path becomes untested. tzrec/utils/export_util.py:254-267 keeps three production branches: TRT, unified AOT (ENABLE_AOT=2), and the legacy split_model + export_model_aot path (ENABLE_AOT=1). After this PR, TestGraphType.AOT_INDUCTOR only exercises the unified branch, so the legacy production path — and the legacy branch of load_model_aot — would no longer be covered by any unit test. If ENABLE_AOT=1 is still a supported mode, consider either parameterizing one AOT_INDUCTOR test over both modes, or filing a follow-up to retire the legacy path so the codebase doesn't carry untested production code.

2. Test plan is missing one affected test. tzrec/models/ultra_hstu_test.py (UltraHSTUTest.test_ultra_hstu) also passes TestGraphType.AOT_INDUCTOR through create_test_model and is therefore affected by this swap, but isn't listed in the PR's test plan alongside MultiTowerDIN and DlrmHSTU. Worth running and noting.

3. CI compile-time. Compiling the full sparse+dense graph through Inductor is inherently heavier than dense-only AOTI; the PR description's ~190s for DlrmHSTU is a real cost. Not blocking, but if more AOTI tests are added later, watch total CI wall time and consider gating the heaviest under a slow-test marker or sharing one compile across parameterized variants.

No security or doc concerns.

@tiankongdeguiji
Copy link
Copy Markdown
Collaborator Author

Second-pass review

Re-ran the analysis across code-quality, test-coverage, performance, docs, and security angles. The three substantive items from the prior top-level review still stand and I won't repeat them.

One small new nit worth surfacing:

  • Duplicate cuda:0 literal. test_util.py:84 passes torch.device("cuda:0") to load_model_aot, while export_unified_model_aot already hardcodes cuda:0 internally (aot_utils.py:453). Harmless, but if the unified helper ever grows a device argument, both sites should be threaded together rather than left to drift.

Wiring is otherwise clean — ScriptWrapper re-exposes _features/_feature_groups and inherits set_is_inference, so every attribute export_unified_model_aot touches resolves. No security or docs concerns.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant