Match CPython CFG normalization and jump cleanup in late codegen passes #25

@youknowone

Summary

A large share of the remaining bytecode drift comes from differences in control-flow graph (CFG) normalization, jump threading, and cleanup-tail layout. The drift affects multiple files, including Lib/test/test_buffer.py, Lib/test/test_dis.py, Lib/test/test_math.py, Lib/test/test_tarfile.py, and Lib/dataclasses.py.

Evidence

Representative mismatches include:

  • POP_JUMP_IF_TRUE vs POP_JUMP_IF_FALSE polarity flips
  • FOR_ITER target drift
  • JUMP_FORWARD vs duplicated inline tail blocks
  • CPython NOP anchors vs RustPython LOAD_CONST None
  • shared exit blocks being duplicated or threaded differently from CPython
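
One way to make mismatches like these visible for a single function is to count its jump-like opcodes on each implementation and compare the counts. The following is a minimal sketch, not the tooling that produced the numbers in this issue; `control_flow_histogram` and `sample` are hypothetical names, and the exact opcode names depend on the interpreter version.

```python
import dis
from collections import Counter


def control_flow_histogram(func):
    """Count jump-like opcodes in func's bytecode.

    Opcode names vary between interpreter versions, so instructions are
    matched by name pattern instead of a fixed list.
    """
    counts = Counter()
    for instr in dis.get_instructions(func):
        if "JUMP" in instr.opname or instr.opname == "FOR_ITER":
            counts[instr.opname] += 1
    return counts


def sample(items):
    # Hypothetical function with a loop and a conditional.
    total = 0
    for item in items:
        if item:
            total += item
    return total


if __name__ == "__main__":
    for name, count in sorted(control_flow_histogram(sample).items()):
        print(f"{name}: {count}")
```

Running the same script under both interpreters and comparing the printed counts gives a quick signal for polarity flips (POP_JUMP_IF_TRUE vs POP_JUMP_IF_FALSE) and for extra or missing JUMP_FORWARD instructions, although it does not show where in the function they occur.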

Top affected tracked files include:

  • Lib/test/test_buffer.py: 5131 diffs
  • Lib/test/test_dis.py: 3194 diffs
  • Lib/test/test_math.py: 2368 diffs
  • Lib/test/test_tarfile.py: 2149 diffs
  • Lib/dataclasses.py: 1577 diffs

Expected direction

Late CFG passes should be aligned with CPython’s codegen.c and flowgraph.c behavior, especially for:

  • conditional-chain normalization
  • jump threading
  • empty-block cleanup
  • exit duplication
  • synthetic NOP / no-location anchor handling
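
To illustrate two of the passes listed above, here is a toy model of jump threading followed by removal of the blocks that become unreachable. It is a sketch under strong simplifying assumptions and does not reproduce the logic of flowgraph.c or crates/codegen/src/ir.rs: every block ends in an explicit JUMP or RETURN_VALUE (no fallthrough), instructions are `(opname, argument)` tuples, and `Block`, `JUMP_OPS`, `final_target`, `thread_jumps`, `remove_unreachable`, and `simplify` are hypothetical names.

```python
from dataclasses import dataclass, field

# Toy opcode set; real opcode sets are larger and version dependent.
JUMP_OPS = {"JUMP", "POP_JUMP_IF_FALSE", "POP_JUMP_IF_TRUE"}


@dataclass
class Block:
    # Each instruction is an (opname, argument) tuple; for jump
    # instructions the argument is the name of the target block.
    instrs: list = field(default_factory=list)


def final_target(blocks, name):
    """Follow blocks that consist of a single unconditional JUMP."""
    seen = set()
    while name not in seen:  # the seen set guards against jump cycles
        seen.add(name)
        instrs = blocks[name].instrs
        if len(instrs) != 1 or instrs[0][0] != "JUMP":
            break
        name = instrs[0][1]
    return name


def thread_jumps(blocks):
    """Retarget every jump past chains of jump-only blocks (in place)."""
    for block in blocks.values():
        block.instrs = [
            (op, final_target(blocks, arg)) if op in JUMP_OPS else (op, arg)
            for op, arg in block.instrs
        ]


def remove_unreachable(blocks, entry):
    """Return only the blocks reachable from entry, in original order."""
    reachable, stack = set(), [entry]
    while stack:
        name = stack.pop()
        if name in reachable:
            continue
        reachable.add(name)
        stack.extend(arg for op, arg in blocks[name].instrs if op in JUMP_OPS)
    return {name: blk for name, blk in blocks.items() if name in reachable}


def simplify(blocks, entry):
    thread_jumps(blocks)
    return remove_unreachable(blocks, entry)


cfg = {
    "entry": Block([("LOAD", "x"), ("POP_JUMP_IF_FALSE", "hop1"), ("JUMP", "body")]),
    "hop1": Block([("JUMP", "hop2")]),
    "hop2": Block([("JUMP", "exit")]),
    "body": Block([("LOAD", "y"), ("JUMP", "exit")]),
    "exit": Block([("RETURN_VALUE", None)]),
}
simplified = simplify(cfg, "entry")
```

In this example the conditional jump in `entry` is retargeted from `hop1` straight to `exit`, after which `hop1` and `hop2` are no longer reachable and are dropped. Details such as when a shared exit block is duplicated instead of jumped to, or how NOP anchors carry line information, are exactly where the two compilers can differ and are not modeled here.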

Likely implementation areas

  • crates/codegen/src/ir.rs
  • crates/codegen/src/compile.rs

Done when

Representative functions such as jumpy, dataclasses._asdict_inner, dataclasses._replace, the tarfile context-manager tests, and the larger test_buffer.py helpers converge to CPython’s bytecode structure rather than merely producing the same outcome.
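
A structural check of this kind could compare opcode-name sequences instead of runtime results. The sketch below assumes the expected sequence was dumped from CPython and the actual one from RustPython; `opnames` and `structural_diff` are hypothetical helpers, the two short lists are made-up illustrations and not real dumps, and `dis.get_instructions` does not descend into nested code objects.

```python
import dis
from difflib import SequenceMatcher


def opnames(func):
    """Return the opcode names of func's top-level code object, in order."""
    return [instr.opname for instr in dis.get_instructions(func)]


def structural_diff(expected, actual):
    """List the regions where two opcode-name sequences disagree."""
    matcher = SequenceMatcher(None, expected, actual, autojunk=False)
    return [
        (tag, expected[i1:i2], actual[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Made-up sequences that mimic a polarity flip plus a NOP vs LOAD_CONST anchor.
expected = ["LOAD_FAST", "POP_JUMP_IF_FALSE", "NOP", "RETURN_VALUE"]
actual = ["LOAD_FAST", "POP_JUMP_IF_TRUE", "LOAD_CONST", "RETURN_VALUE"]

for tag, want, got in structural_diff(expected, actual):
    print(tag, want, got)
```

An empty result from `structural_diff` means the two sequences have the same shape at the opcode-name level. Jump targets and instruction arguments would need a separate comparison.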
