Skip to content

Align bytecode codegen structure with CPython 3.14#7588

Merged
youknowone merged 8 commits intoRustPython:mainfrom
youknowone:bytecode-parity-3
Apr 18, 2026
Merged

Align bytecode codegen structure with CPython 3.14#7588
youknowone merged 8 commits intoRustPython:mainfrom
youknowone:bytecode-parity-3

Conversation

@youknowone
Copy link
Copy Markdown
Member

@youknowone youknowone commented Apr 13, 2026

Summary by CodeRabbit

  • Bug Fixes

    • Safer NULL-aware value handling during frame execution to avoid runtime assumptions
    • Prevented annotation free-name leakage across scopes and fixed annotation handling for annotated assignments
    • Corrected comprehension iterator evaluation order and adjusted try/except/else handler scanning order
  • Refactor

    • Unified code-object/constant construction to a reference-preserving creation path across VM and import/tooling
  • Chores

    • Improved disassembly output normalization and locals/freevar resolution logic in tooling scripts

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 13, 2026

📝 Walkthrough

Walkthrough

Introduces a VM-scoped constant bag (PyVmBag) and new PyCode constructors used across import/marshal/compile paths; refines symbol-table annotation and comprehension scanning for PEP 649, makes StoreFastLoadFast NULL-aware, adjusts a pseudo-instruction signature, and improves disassembly tooling.

Changes

Cohort / File(s) Summary
PyVmBag & PyCode constructors
crates/vm/src/builtins/code.rs
Adds PyVmBag<'a> and PyCode::new_ref_with_bag/new_ref_from_bytecode/new_ref_from_frozen to construct VM-scoped constants.
Code-object creation sites
crates/vm/src/import.rs, crates/vm/src/stdlib/_ast.rs, crates/vm/src/stdlib/_imp.rs, crates/vm/src/stdlib/marshal.rs, crates/vm/src/vm/compile.rs
Replaces vm.ctx.new_code(...) with the new PyCode::new_ref_*() constructors and switches constant-bag usage to PyVmBag(vm) in deserialization paths.
Symbol table / PEP 649 changes
crates/codegen/src/symboltable.rs
Restricts annotation-free-variable propagation to class scopes, simplifies AnnAssign handling (simple vs non-simple targets), reorders try/except/orelse sub-table storage when finalbody is empty, and reworks comprehension iterator scanning order/handling.
Frame instruction NULL-awareness
crates/vm/src/frame.rs
StoreFastLoadFast now uses pop_value_opt() / push_value_opt() preserving None instead of expecting a present value.
Bytecode instruction signature
crates/compiler-core/src/bytecode/instruction.rs
Activates oparg: u32 parameter in PseudoInstruction::stack_effect_jump and forwards it to fallback stack_effect(oparg).
Marshal/import bag alignment
crates/vm/src/stdlib/_imp.rs, crates/vm/src/stdlib/marshal.rs
Switches marshal/import constant bag impls to use PyVmBag(vm) and constructs code objects with PyCode::new_ref_with_bag.
Minor edits
crates/vm/src/stdlib/_ctypes/simple.rs, crates/vm/src/types/slot.rs
Small conditional-syntax simplification and comment relocation; no behavioral change.
Disassembly tooling
scripts/dis_dump.py
Adds deterministic frozenset({...}) argrepr normalization, refactors locals/freevar resolution into _resolve_localsplus_name(), and updates jump-target heuristics (includes END_ASYNC_FOR).

Sequence Diagram(s)

sequenceDiagram
    participant Importer
    participant Marshal
    participant VM
    participant PyCode
    Importer->>Marshal: read frozen bytes / marshal deserialize
    Marshal->>VM: request ConstantBag (PyVmBag(vm))
    VM->>PyCode: PyCode::new_ref_from_frozen(vm, frozen.code)
    PyCode-->>VM: PyCode ref (with VM bag)
    VM-->>Importer: import_code_obj(PyCode)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

🐰 I hopped through bags and bytecode bright,

constants found their VM-light,
annotations stay where classes dwell,
frames now carry None as well,
a tiny hop toward code that's right.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main objective of the pull request: aligning RustPython's bytecode codegen structure with CPython 3.14, which is demonstrated across all modified files implementing PEP 649 annotations, comprehension semantics, and code object construction changes.
Docstring Coverage ✅ Passed Docstring coverage is 80.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 14, 2026

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

(module 'compile test_grammar test_patma test_peepholer test_pep646_syntax test_sys_settrace' not found)

Legend:

  • [+] path exists in CPython
  • [x] up-to-date, [ ] outdated

@youknowone youknowone force-pushed the bytecode-parity-3 branch 12 times, most recently from b3f1187 to 1dadb52 Compare April 17, 2026 10:54
…tion alignment

- Add BoolOp constant folding with short-circuit semantics in compile_expression
- Add constant truthiness evaluation for assert statement optimization
- Disable const collection/boolop folding in starred unpack and assignment contexts
- Move annotation block generation after body with AnnotationsPlaceholder splicing
- Reorder insert_superinstructions to run before push_cold_blocks (matching flowgraph.c)
- Lower LOAD_CLOSURE after superinstructions to avoid false LOAD_FAST_LOAD_FAST
- Add ToBool before PopJumpIf in comparisons and chained compare cleanup blocks
- Unify annotation dict building to always use incremental BuildMap + StoreSubscr
- Add TrueDivide constant folding for integer operands
- Fold constant sets to Frozenset (not Tuple) in try_fold_constant_collection
- Add PyVmBag for frozenset constant materialization in code objects
- Add remove_redundant_const_pop_top_pairs pass and peephole const+branch folding
- Emit Nop for skipped constant expressions and constant-true asserts
- Preserve comprehension local ordering by source-order bound name collection
- Simplify annotation scanning in symboltable (remove simple-name gate)
…ow fixes

- Reorder comprehension symbol-table walk so the outermost iterator
  registers its sub_tables in the enclosing scope before the comp
  scope, and rescan elt/ifs in CPython's order. Codegen peeks past the
  outermost iterator's nested scopes to find the comprehension table.
- For plain try/except, emit handler sub_tables before the else block
  so codegen's linear sub_table cursor stays aligned.
- Rename `collect_simple_annotations` to `collect_annotations` and
  evaluate non-simple annotations during __annotate__ compilation to
  preserve source-order side effects while keeping the simple-name
  index stable.
- Dedupe equivalent code constants in `arg_constant` and add a
  structural equality check on `CodeObject`.
- Disable LOAD_FAST_BORROW for the tail end block when a try has a
  bare `except:` clause, and have `new_block` inherit the flag from
  the current block.
- Remove `cfg!(debug_assertions)` guard around the
  `optimize_load_fast_borrow` start-depth check so mismatches are
  handled (return instead of assert) in release builds.
- Collapse nop-only blocks that precede a return epilogue and hoist
  the prior line number into the next real instruction so the
  line table matches.
- Unmark now-passing `test_consts_in_conditionals`,
  `test_load_fast_unknown_simple`,
  `test_load_fast_known_because_already_loaded`, and PEP 646 f3/f4
  annotation checks.
- In `compile_try_except`, drop the leading Nop and set the end
  block's source range from the last orelse/body statement so line
  events after the try fall on the right line.
- Recognise constant-false asserts as the direct-raise shape (no
  ToBool/PopJumpIfFalse) and flip the test assertion accordingly.
- Extend `remove_redundant_nops_in_blocks` to also look through a
  trailing nop before a return-epilogue pair (LoadConst/ReturnValue
  or LoadSmallInt/ReturnValue) so the epilogue keeps the correct
  line number.
- Rename `preds` to `predecessor_blocks` in the LOAD_FAST_BORROW
  disable pass and add a test-only `debug_late_cfg_trace` helper.
- Regenerate the `nested_double_async_with` snapshot: the tail
  reference to `stop_exc` now emits LOAD_FAST instead of
  LOAD_FAST_BORROW.
@youknowone youknowone force-pushed the bytecode-parity-3 branch 3 times, most recently from 5b7252c to 34b130d Compare April 17, 2026 14:41
@youknowone youknowone marked this pull request as ready for review April 17, 2026 14:41
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/dis_dump.py`:
- Around line 99-106: The frozenset branch only canonicalizes top-level
frozensets and leaves nested frozensets inside tuples/lists/dicts with
interpreter-dependent ordering; modify the logic in the frozenset handling (the
block that reads argrepr, calls ast.literal_eval into values, and returns
frozenset(...)) to recursively normalize the parsed literal structure before
re-rendering: write a small helper (or reuse _normalize_argrepr recursively)
that walks the evaluated value (handling frozenset, set, tuple, list, dict) and
returns a canonical Python structure where frozensets/sets are converted to
sorted lists or canonicalized representations, then use that normalized
structure to build the final string (the code that currently builds parts and
returns f"frozenset({{{...}}})"). Ensure the helper is applied to each element
in tuples/lists/dicts so nested frozensets are consistently ordered.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: d31b5f9c-f0b9-472f-821f-6f65d8b118b5

📥 Commits

Reviewing files that changed from the base of the PR and between 4f1cf6d and 34b130d.

⛔ Files ignored due to path filters (12)
  • Lib/test/test_compile.py is excluded by !Lib/**
  • Lib/test/test_grammar.py is excluded by !Lib/**
  • Lib/test/test_patma.py is excluded by !Lib/**
  • Lib/test/test_peepholer.py is excluded by !Lib/**
  • Lib/test/test_pep646_syntax.py is excluded by !Lib/**
  • Lib/test/test_sys_settrace.py is excluded by !Lib/**
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__bare_function_annotations_check_attribute_and_subscript_expressions.snap is excluded by !**/*.snap
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__constant_true_if_pass_keeps_line_anchor_nop.snap is excluded by !**/*.snap
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_ands.snap is excluded by !**/*.snap
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_mixed.snap is excluded by !**/*.snap
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_ors.snap is excluded by !**/*.snap
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__nested_double_async_with.snap is excluded by !**/*.snap
📒 Files selected for processing (14)
  • crates/codegen/src/compile.rs
  • crates/codegen/src/ir.rs
  • crates/codegen/src/symboltable.rs
  • crates/compiler-core/src/bytecode/instruction.rs
  • crates/vm/src/builtins/code.rs
  • crates/vm/src/frame.rs
  • crates/vm/src/import.rs
  • crates/vm/src/stdlib/_ast.rs
  • crates/vm/src/stdlib/_ctypes/simple.rs
  • crates/vm/src/stdlib/_imp.rs
  • crates/vm/src/stdlib/marshal.rs
  • crates/vm/src/types/slot.rs
  • crates/vm/src/vm/compile.rs
  • scripts/dis_dump.py

Comment thread scripts/dis_dump.py
- Fold a constant list iterable into a constant tuple in for-loop
  iterable position, matching the CPython optimizer, and strip a
  redundant LIST_TO_TUPLE immediately before GET_ITER in the IR
  peephole pass.
- Emit a Nop at the break/continue source range before unwinding
  so line events land on the break/continue statement instead of
  the following instruction.
- Drop `propagate_disable_load_fast_borrow`; the forward propagation
  was over-zealous and the per-block inheritance in `new_block` plus
  the bare-except marker are enough.
- Relax `inline_small_or_no_lineno_blocks` so small exit blocks at
  the tail of a cold block are always inlined, not just return
  epilogues.
- Add codegen tests covering the LIST_TO_TUPLE/GET_ITER peephole and
  the late-CFG trace helper for a for-loop list-literal iterable.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/dis_dump.py`:
- Around line 114-120: The current filler logic for dis._common_constants
assumes the list initially has length 5 and uses (builtins.list,
builtins.set)[len(common_constants) - 5], which yields negative or wrong indices
when the initial length differs; change the loop to explicitly append the
correct constants for the missing slots (i.e., if len(common_constants) == 5
append builtins.list, if == 6 append builtins.set) until length reaches 7, then
assign back to dis._common_constants; reference symbols: _IS_RUSTPYTHON,
dis._common_constants, common_constants, builtins.list, builtins.set, and
LOAD_COMMON_CONSTANT.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 96719fbb-fbe3-4936-aad4-951698fa4c41

📥 Commits

Reviewing files that changed from the base of the PR and between 34b130d and f983099.

⛔ Files ignored due to path filters (10)
  • Lib/test/test_compile.py is excluded by !Lib/**
  • Lib/test/test_grammar.py is excluded by !Lib/**
  • Lib/test/test_patma.py is excluded by !Lib/**
  • Lib/test/test_peepholer.py is excluded by !Lib/**
  • Lib/test/test_sys_settrace.py is excluded by !Lib/**
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__bare_function_annotations_check_attribute_and_subscript_expressions.snap is excluded by !**/*.snap
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__constant_true_if_pass_keeps_line_anchor_nop.snap is excluded by !**/*.snap
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_ands.snap is excluded by !**/*.snap
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_mixed.snap is excluded by !**/*.snap
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__nested_double_async_with.snap is excluded by !**/*.snap
📒 Files selected for processing (6)
  • crates/codegen/src/compile.rs
  • crates/codegen/src/ir.rs
  • crates/codegen/src/symboltable.rs
  • crates/vm/src/stdlib/_ctypes/simple.rs
  • crates/vm/src/types/slot.rs
  • scripts/dis_dump.py
✅ Files skipped from review due to trivial changes (2)
  • crates/vm/src/stdlib/_ctypes/simple.rs
  • crates/vm/src/types/slot.rs

Comment thread scripts/dis_dump.py
Comment on lines +114 to +120
if _IS_RUSTPYTHON and hasattr(dis, "_common_constants"):
common_constants = list(dis._common_constants)
while len(common_constants) < 7:
common_constants.append(
(builtins.list, builtins.set)[len(common_constants) - 5]
)
dis._common_constants = common_constants
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fragile assumption: _common_constants starts with exactly 5 entries.

The index len(common_constants) - 5 assumes the initial list length is exactly 5 (CPython's current size). If RustPython's dis._common_constants is ever shorter than 5, the first iteration uses a negative index (wrong constants appended silently); if it starts between 6 and 7, you may skip builtins.list and jump straight to builtins.set. Either case produces misleading disassembly for LOAD_COMMON_CONSTANT.

🛠️ Suggested hardening
-if _IS_RUSTPYTHON and hasattr(dis, "_common_constants"):
-    common_constants = list(dis._common_constants)
-    while len(common_constants) < 7:
-        common_constants.append(
-            (builtins.list, builtins.set)[len(common_constants) - 5]
-        )
-    dis._common_constants = common_constants
+if _IS_RUSTPYTHON and hasattr(dis, "_common_constants"):
+    common_constants = list(dis._common_constants)
+    # Ensure list/set occupy the RustPython-expected slots 5 and 6.
+    extras = [builtins.list, builtins.set]
+    for idx, value in enumerate(extras, start=5):
+        if len(common_constants) <= idx:
+            common_constants.append(value)
+        elif common_constants[idx] is not value:
+            common_constants[idx] = value
+    dis._common_constants = common_constants
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/dis_dump.py` around lines 114 - 120, The current filler logic for
dis._common_constants assumes the list initially has length 5 and uses
(builtins.list, builtins.set)[len(common_constants) - 5], which yields negative
or wrong indices when the initial length differs; change the loop to explicitly
append the correct constants for the missing slots (i.e., if
len(common_constants) == 5 append builtins.list, if == 6 append builtins.set)
until length reaches 7, then assign back to dis._common_constants; reference
symbols: _IS_RUSTPYTHON, dis._common_constants, common_constants, builtins.list,
builtins.set, and LOAD_COMMON_CONSTANT.

@youknowone youknowone merged commit 1f1be5e into RustPython:main Apr 18, 2026
21 checks passed
@youknowone youknowone deleted the bytecode-parity-3 branch April 18, 2026 00:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant