* Remove unused Rust impl for formatting dis output
* remove `examples/dis.rs`
* Added tests
* Update lock
* Try to set snapshot dir
* Remove verbose flag
* Regenerate snapshots after #7711
* Revert "Bump insta from 1.46.3 to 1.47.2 (#7706)"
This reverts commit e6d9ea6bfe.
* Debug info
* Show diff as well
* Show debug faster
* CI: true env
* Revert CI
* Add `CI: true` in CI env
* Reapply "Bump insta from 1.46.3 to 1.47.2 (#7706)"
This reverts commit 693ca8cbe4d7885a81162a9be31e8bb567db885a.
* simplify macro
* trim on function side
* Force insta workspace root
* fix merge
CPython convention: top-level module / interactive / expression code
does not carry CO_NEWLOCALS or CO_OPTIMIZED. The per-scope mapping at
enter_scope::CompilerScope::Module already returns empty flags, but
Compiler::new seeded the root CodeInfo with CodeFlags::NEWLOCALS,
forcing module code into the NEWLOCALS arm of frame.rs:725-731 so
locals were allocated as a fresh empty dict instead of being bound to
globals (the correct semantics for exec(code, globals)).
Restore the seed to empty() so it matches the per-scope mapping and
CPython's compiler_enter_scope for module scope.
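The target semantics can be checked from Python itself. The snippet
below is a minimal sketch, using CPython as the reference, of the two
properties described above: module-style code carries neither flag,
and exec(code, globals) binds locals to the globals dict. The names
`code` and `ns` are illustrative only.

```python
import inspect

# Compile a module-style snippet; CPython sets neither flag on such code.
code = compile("x = 1\nsame = locals() is globals()", "<demo>", "exec")
has_newlocals = bool(code.co_flags & inspect.CO_NEWLOCALS)
has_optimized = bool(code.co_flags & inspect.CO_OPTIMIZED)

# With only a globals dict supplied, locals and globals are one mapping,
# so the assignment to x lands directly in ns.
ns = {}
exec(code, ns)
print(has_newlocals, has_optimized, ns["same"], ns["x"])
```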
* Align CFG cleanup bytecode with CPython
* Bytecode parity: fblock unwind, fstring join, folding, scope
- compile.rs: unwind_fblock_stack returns whether a finally ran so
return-statement emission can adjust location handling; restructure
try/except/finally cleanup to preserve or drop boundary NOPs based on
whether the body falls through; rework f-string lowering with
count/join helpers; remove the per-collection-type heuristic for
AST-level folding and defer to flowgraph passes; add several folding
helpers and a ComprehensionLoopControl enum.
- ir.rs: re-run unary/binop folding around tuple folding, add
reorder_conditional_scope_exit_and_jump_back_blocks and several block
classification helpers, add MAX_STR_SIZE, change is_exit_without_lineno
to take the block list.
- symboltable.rs: in analyze_cells, remove names owned as cells in
function-like scopes from the parent's free set; mark lambda scope
type explicitly.
* Refine CFG scope-exit backedge ordering
* typealias reviews
* Bytecode parity - try/except block order, CFG reorder
Reorder try/except/else/finally to emit else+finally before
except handlers matching CPython layout. Add set_no_location
for cleanup blocks. Extend CFG reorder pass to handle true-path
jump-back for generators, break/continue, and assert in loops.
Add stop-iteration error handler awareness to block protection.
* Align CFG cleanup bytecode with CPython
* Unmark test_dis.test_findlabels expected failure
* Compute target predecessor flags in single pass
remove_nops and remove_redundant_nops_in_blocks repeated
has_jump_predecessor / has_plain_jump_predecessor / target lookups
per block, scanning all blocks each time. With ~200,000 if blocks
this became O(B^2 * I) and timed out test_compile.test_stack_overflow.
Fold the three flags into one O(B * I) pass via
compute_target_predecessor_flags.
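As a rough illustration of the single-pass idea (not the Rust code
itself), the sketch below walks every instruction once and records,
per target block, whether any jump and whether any unconditional jump
points at it. The block model, the opname strings, and the two-flag
return shape are assumptions made for the example.

```python
def compute_target_predecessor_flags(blocks):
    """Return per-block (has_jump_pred, has_plain_jump_pred) in one pass.

    Hypothetical model: each block is a list of (opname, target) pairs,
    where target is a block index or None.
    """
    has_jump = [False] * len(blocks)
    has_plain_jump = [False] * len(blocks)
    for block in blocks:                 # O(B)
        for opname, target in block:     # O(I) per block
            if target is None:
                continue
            has_jump[target] = True
            if opname == "JUMP":         # unconditional ("plain") jump
                has_plain_jump[target] = True
    return list(zip(has_jump, has_plain_jump))


blocks = [
    [("LOAD_FAST", None), ("POP_JUMP_IF_FALSE", 2)],
    [("JUMP", 3)],
    [("NOP", None)],
    [("RETURN_VALUE", None)],
]
flags = compute_target_predecessor_flags(blocks)
```

Later passes can then index into `flags` per block instead of
rescanning all blocks for each query.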
* Address review feedback on slice folding, fallthrough, attr-chain
- try_fold_constant_slice now delegates to try_fold_constant_expr so
slice bounds accept the same constants other folding paths do
(unary-folded values, __debug__, etc.).
- remove_nops resolves fallthrough predecessors through empty blocks
via next_nonempty_block before checking ends_with_for_cleanup.
- should_deopt_borrowed_attr_chain ReturnIter matcher now accepts
Instruction::CallKw alongside Instruction::Call, matching the
Call/CallKw treatment in the surrounding deopt trigger check.
* remove test
Constant folding:
- Add string/bytes multiply and bytes concat folding in IR
- Add constant subscript folding (str, bytes, tuple indexing)
- Delegate list/set constant folding to IR passes
- Stream big non-const list/set via BUILD+LIST_APPEND
Class/generic compilation:
- Reorder class body prologue: __type_params__ before __classdict__
- Build class function before .generic_base in generic classes
- Register .type_params/.generic_base symbols in proper scopes
- Use load_name/store_name helpers for synthetic variables
Return block handling:
- Only duplicate return-None epilogues, not arbitrary returns
- Add inline_pop_except_return_blocks pass
- Add duplicate_named_except_cleanup_returns pass
Other fixes:
- Fix eliminate_dead_stores to only collapse adjacent duplicates
- Skip STORE_FAST_LOAD_FAST superinstruction in generators after FOR_ITER
- Thread jumps through NOP-only blocks
- Transfer NOP line info to following unconditional jumps
- Extract scope_needs_conditional_annotations_cell helper
- Register __conditional_annotations__ for module future annotations
- Remove unnecessary NOP between FOR_ITER and unpack/store
by compiling loop target directly on target range
- Fix t-string compilation to match stack order: build
strings tuple first, then evaluate interpolations
- Split compile_tstring_into into collect_tstring_strings
and compile_tstring_interpolations
- Handle debug text literals and default repr conversion
for debug specifier in t-strings
- Always set bit 1 in BUILD_INTERPOLATION oparg encoding
- Add inline_with_suppress_return_blocks pass to inline return
epilogues after with-suppress cleanup sequences
- Extend duplicate_end_returns to handle conditional jumps to the
final return block, not just unconditional ones
- Process jump targets in reverse order to preserve indices
- Add extra deoptimize_store_fast_store_fast pass after superinstructions
- Add tests for listcomp cleanup tail and with-suppress tail
* Align bytecode codegen structure with CPython 3.14
* Bytecode parity - constant folding, annotation ordering, superinstruction alignment
- Add BoolOp constant folding with short-circuit semantics in compile_expression
- Add constant truthiness evaluation for assert statement optimization
- Disable const collection/boolop folding in starred unpack and assignment contexts
- Move annotation block generation after body with AnnotationsPlaceholder splicing
- Reorder insert_superinstructions to run before push_cold_blocks (matching flowgraph.c)
- Lower LOAD_CLOSURE after superinstructions to avoid false LOAD_FAST_LOAD_FAST
- Add ToBool before PopJumpIf in comparisons and chained compare cleanup blocks
- Unify annotation dict building to always use incremental BuildMap + StoreSubscr
- Add TrueDivide constant folding for integer operands
- Fold constant sets to Frozenset (not Tuple) in try_fold_constant_collection
- Add PyVmBag for frozenset constant materialization in code objects
- Add remove_redundant_const_pop_top_pairs pass and peephole const+branch folding
- Emit Nop for skipped constant expressions and constant-true asserts
- Preserve comprehension local ordering by source-order bound name collection
- Simplify annotation scanning in symboltable (remove simple-name gate)
* Fix CI regressions in marshal and fast-local ops
* impl more
* Align bytecode codegen with CPython structure
* Bytecode parity - comprehension/except scope ordering, load_fast_borrow fixes
- Reorder comprehension symbol-table walk so the outermost iterator
registers its sub_tables in the enclosing scope before the comp
scope, and rescan elt/ifs in CPython's order. Codegen peeks past the
outermost iterator's nested scopes to find the comprehension table.
- For plain try/except, emit handler sub_tables before the else block
so codegen's linear sub_table cursor stays aligned.
- Rename `collect_simple_annotations` to `collect_annotations` and
evaluate non-simple annotations during __annotate__ compilation to
preserve source-order side effects while keeping the simple-name
index stable.
- Dedupe equivalent code constants in `arg_constant` and add a
structural equality check on `CodeObject`.
- Disable LOAD_FAST_BORROW for the tail end block when a try has a
bare `except:` clause, and have `new_block` inherit the flag from
the current block.
- Remove `cfg!(debug_assertions)` guard around the
`optimize_load_fast_borrow` start-depth check so mismatches are
handled (return instead of assert) in release builds.
- Collapse nop-only blocks that precede a return epilogue and hoist
the prior line number into the next real instruction so the
line table matches.
- Unmark now-passing `test_consts_in_conditionals`,
`test_load_fast_unknown_simple`,
`test_load_fast_known_because_already_loaded`, and PEP 646 f3/f4
annotation checks.
* Bytecode parity - try/except line tracking, assert 0 shape
- In `compile_try_except`, drop the leading Nop and set the end
block's source range from the last orelse/body statement so line
events after the try fall on the right line.
- Recognise constant-false asserts as the direct-raise shape (no
ToBool/PopJumpIfFalse) and flip the test assertion accordingly.
- Extend `remove_redundant_nops_in_blocks` to also look through a
trailing nop before a return-epilogue pair (LoadConst/ReturnValue
or LoadSmallInt/ReturnValue) so the epilogue keeps the correct
line number.
- Rename `preds` to `predecessor_blocks` in the LOAD_FAST_BORROW
disable pass and add a test-only `debug_late_cfg_trace` helper.
- Regenerate the `nested_double_async_with` snapshot: the tail
reference to `stop_exc` now emits LOAD_FAST instead of
LOAD_FAST_BORROW.
* Bytecode parity - iter folding, break/continue line, cold inlining
- Fold a constant list iterable into a constant tuple in for-loop
iterable position, matching the CPython optimizer, and strip a
redundant LIST_TO_TUPLE immediately before GET_ITER in the IR
peephole pass.
- Emit a Nop at the break/continue source range before unwinding
so line events land on the break/continue statement instead of
the following instruction.
- Drop `propagate_disable_load_fast_borrow`; the forward propagation
was over-zealous and the per-block inheritance in `new_block` plus
the bare-except marker are enough.
- Relax `inline_small_or_no_lineno_blocks` so small exit blocks at
the tail of a cold block are always inlined, not just return
epilogues.
- Add codegen tests covering the LIST_TO_TUPLE/GET_ITER peephole and
the late-CFG trace helper for a for-loop list-literal iterable.
* Add InstructionMetadata::stack_effect_jump for branch stack effects
CPython's compile.c provides stack_effect(opcode, oparg, jump) where the
jump parameter selects between fallthrough and branch effects. The existing
stack_effect() only returns the fallthrough effect.
Add stack_effect_jump() that returns the branch effect. Most instructions
have identical fallthrough/branch effects; ForIter and Send are the
exceptions (ForIter: fallthrough=+1, branch=-1; Send: fallthrough=0,
branch=-1).
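The split between the two queries can be sketched as a small lookup
table. This is an illustrative Python model, not the Rust trait: only
the ForIter and Send values come from the description above, and the
PopTop entry and the table layout are assumptions.

```python
# Hypothetical table of (fallthrough, branch) stack effects. Only the
# ForIter and Send rows are taken from the commit text.
EFFECTS = {
    "ForIter": (1, -1),
    "Send": (0, -1),
    "PopTop": (-1, -1),  # assumed example of an identical pair
}


def stack_effect(op):
    """Effect when execution falls through to the next instruction."""
    return EFFECTS[op][0]


def stack_effect_jump(op):
    """Effect when the instruction takes its branch."""
    return EFFECTS[op][1]


for_iter = (stack_effect("ForIter"), stack_effect_jump("ForIter"))
send = (stack_effect("Send"), stack_effect_jump("Send"))
```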
* apply review
- Use POP_TOP instead of POP_ITER for for-loop break/return cleanup
- Expand duplicate_end_returns to clone final return for jump predecessors
- Restrict late jump threading pass to unconditional jumps only
- Skip exception blocks in inline/reorder passes
- Simplify threaded_jump_instr NoInterrupt handling
* Fix exception handling: except* chaining, finally cleanup, RERAISE
- Align except* bytecode chaining
- Fix exception state model and finally handler cleanup
- Fix RERAISE to only pop exception, preserve values below
* Port IR optimization passes from flowgraph.c
- BUILD_TUPLE n + UNPACK_SEQUENCE n elimination
- Dead store elimination within basic blocks
- apply_static_swaps for SWAP reduction
* Add bytecode comparison and disassembly dump scripts
- compare_bytecode.py: compare CPython vs RustPython bytecode output
- dis_dump.py: extract disassembly in normalized JSON format
* Base resume context
* Fixes for api change
* Align codegen
* Align `frame.rs` to the api changes
* fix jit
* Use new oparg
* Fix doc
* Let `ir` decide exception depth
* Bytecode parity phase 3
Compiler changes:
- Emit TO_BOOL in and/or short-circuit evaluation (COPY+TO_BOOL+JUMP)
- Add module-level __conditional_annotations__ cell (PEP 649)
- Only set conditional annotations for AnnAssign, not function params
- Skip __classdict__ cell when future annotations are active
- Convert list literals to tuples in for-loop iterables
- Fix cell variable ordering: parameters first, then alphabetical
- Fix RESUME DEPTH1 flag for yield-from/await
- Don't propagate __classdict__/__conditional_annotations__ freevar
through regular functions — only annotation/type-param scopes
- Inline string compilation path
* Skip test_thread_safety in _test_multiprocessing
SIGSEGV in _finalizer_registry dict access under aggressive GC
and thread switching. Root cause is dict thread-safety in VM.
* Skip list→tuple optimization for async for; propagate future_annotations to nested scopes
* Match CPython LOAD_SPECIAL stack semantics for with/async-with
LOAD_SPECIAL now pushes (callable, self_or_null) matching CPython's
CALL convention, instead of a single bound method:
- Function descriptors: push (func, self)
- Plain attributes: push (bound, NULL)
Updated all with-statement paths:
- Entry: add SWAP 3 after SWAP 2, remove PUSH_NULL before CALL 0
- Normal exit: remove PUSH_NULL before CALL 3
- Exception handler (WITH_EXCEPT_START): read exit_func at TOS-4
and self_or_null at TOS-3
- Suppress block: 3 POP_TOPs after POP_EXCEPT (was 2)
- FBlock exit (preserve_tos): SWAP 3 + SWAP 2 rotation
- UnwindAction::With: remove PUSH_NULL
Stack effects updated: LoadSpecial (2,1), WithExceptStart (7,6)
* Normalize LOAD_FAST_CHECK and JUMP_BACKWARD_NO_INTERRUPT
Add LOAD_FAST_CHECK → LOAD_FAST and JUMP_BACKWARD_NO_INTERRUPT →
JUMP_BACKWARD to opname normalization in dis_dump.py. These are
optimization variants with identical semantics.
* Add EXTENDED_ARG to SKIP_OPS, normalize LOAD_FAST_CHECK and JUMP_BACKWARD_NO_INTERRUPT
* Remove duplicate return-None when block already has return
Skip duplicate_end_returns for blocks that already end with
LOAD_CONST + RETURN_VALUE. Run DCE + unreachable elimination
after duplication to remove the now-unreachable original return
block.
* Improve __static_attributes__ collection accuracy
- Support tuple/list unpacking targets: (self.x, self.y) = val
- Skip @staticmethod and @classmethod decorated methods
- Use scan_target_for_attrs helper for recursive target scanning
* Use method mode for function-local import attribute calls
Function-local imports (scope is Local+IMPORTED) should use method
mode LOAD_ATTR like regular names, not plain mode. Only module/class
scope imports use plain LOAD_ATTR + PUSH_NULL.
* Optimize constant iterable before GET_ITER to LOAD_CONST tuple
Convert BUILD_LIST/SET 0 + LOAD_CONST + LIST_EXTEND/SET_UPDATE + GET_ITER
to just LOAD_CONST (tuple) + GET_ITER, matching CPython's optimization
for constant list/set literals in for-loop iterables.
Also fix is_name_imported to use method mode for function-local imports,
and improve __static_attributes__ accuracy (skip @classmethod/@staticmethod,
handle tuple/list unpacking targets).
* Fix cell variable ordering: parameters first, then alphabetical
CPython orders cell variables with parameter cells first (in
parameter definition order), then non-parameter cells sorted
alphabetically. Previously all cells were sorted alphabetically.
Also add for-loop iterable optimization: constant BUILD_LIST/SET
before GET_ITER is folded to just LOAD_CONST tuple.
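The ordering rule can be written down in a few lines. This is a sketch
of the rule as stated above, with a hypothetical helper name
`order_cells`; it is not the compiler's actual data structure.

```python
def order_cells(params, cell_names):
    """Parameter cells first (definition order), then the rest sorted."""
    cells = set(cell_names)
    param_cells = [p for p in params if p in cells]
    others = sorted(cells - set(param_cells))
    return param_cells + others


# Parameters z, a, m; cells m, b, z, c. Only z and m are parameter cells.
ordered = order_cells(["z", "a", "m"], {"m", "b", "z", "c"})
print(ordered)  # ['z', 'm', 'b', 'c']
```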
* Emit COPY_FREE_VARS before MAKE_CELL matching CPython order
CPython emits COPY_FREE_VARS first, then MAKE_CELL instructions.
Previously RustPython emitted them in reverse order.
* Fix RESUME AfterYield encoding to match CPython 3.14 (value 5)
CPython 3.14 uses RESUME arg=5 for after-yield, not 1.
Also reorder COPY_FREE_VARS before MAKE_CELL and fix cell
variable ordering (parameters first, then alphabetical).
* Address code review feedback from #7481
- Set is_generator flag for generator expressions in scan_comprehension
- Fix posonlyargs priority in collect_static_attributes first param
- Add match statement support to scan_store_attrs
- Fix stale decorator stack comment
- Reorder NOP removal after fold_unary_negative for better collection folding
* Fold constant list/set/tuple literals in compiler
When all elements of a list/set/tuple literal are constants and
there are 3+ elements, fold them into a single constant:
- list: BUILD_LIST 0 + LOAD_CONST (tuple) + LIST_EXTEND 1
- set: BUILD_SET 0 + LOAD_CONST (tuple) + SET_UPDATE 1
- tuple: LOAD_CONST (tuple)
This matches CPython's compiler optimization and fixes the most
common bytecode difference (92/200 sampled files).
Also add bytecode comparison scripts (dis_dump.py, compare_bytecode.py)
for systematic parity tracking.
* Use BUILD_MAP 0 + MAP_ADD for large dicts (>= 16 pairs)
Match CPython's compiler behavior: dicts with 16+ key-value pairs
use BUILD_MAP 0 followed by MAP_ADD for each pair, instead of
pushing all keys/values on the stack and calling BUILD_MAP N.
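A simplified model of the two emission strategies is sketched below.
The `LOAD` pseudo-op, the `emit_dict` helper, and the MAP_ADD oparg of
1 are assumptions for illustration; only the 16-pair threshold and the
BUILD_MAP 0 + MAP_ADD shape come from the description above.

```python
def emit_dict(pairs, threshold=16):
    """Return a list of (opname, oparg) for a dict display (toy model)."""
    ops = []
    if len(pairs) >= threshold:
        # Large dict: build empty, then add one pair at a time so the
        # stack never holds more than the dict plus one key/value.
        ops.append(("BUILD_MAP", 0))
        for key, value in pairs:
            ops.append(("LOAD", key))
            ops.append(("LOAD", value))
            ops.append(("MAP_ADD", 1))
    else:
        # Small dict: push every key/value, then build in one step.
        for key, value in pairs:
            ops.append(("LOAD", key))
            ops.append(("LOAD", value))
        ops.append(("BUILD_MAP", len(pairs)))
    return ops


small = emit_dict([(i, i) for i in range(3)])
big = emit_dict([(i, i) for i in range(16)])
```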
* Fix clippy warnings and cargo fmt
* fix surrogate
* Bytecode parity
Compiler changes:
- Remove PUSH_NULL from decorator calls, use CALL 0
- Collect __static_attributes__ from self.xxx = patterns
- Sort __static_attributes__ alphabetically
- Move __classdict__ init before __doc__ in class prologue
- Fold unary negative constants
- Fold constant list/set literals (3+ elements)
- Use BUILD_MAP 0 + MAP_ADD for 16+ dict pairs
- Always run peephole optimizer for superinstructions
- Emit RETURN_GENERATOR for generator functions
- Add is_generator flag to SymbolTable
* Fix formatting and collapsible_if clippy warnings in compile.rs
* Fix clippy, fold_unary_negative chaining, and generator line tracing
- Replace irrefutable if-let with let for ExceptHandler
- Remove folded UNARY_NEGATIVE instead of replacing with NOP,
enabling chained negation folding
- Initialize prev_line to def line for generators/coroutines
to suppress spurious LINE events from preamble instructions
- Remove expectedFailure markers for now-passing tests
* Fix JIT StoreFastStoreFast, format, and remove expectedFailure markers
- Add StoreFastStoreFast handling in JIT instructions
- Fix cargo fmt in frame.rs
- Remove 11 expectedFailure markers for async jump tests in
test_sys_settrace that now pass
* Fix peephole optimizer: use NOP replacement instead of remove()
Using remove() shifts instruction indices and corrupts subsequent
references, causing "pop stackref but null found" panics at runtime.
Replace folded/combined instructions with NOP instead, which are
cleaned up by the existing remove_nops pass.
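The index-shifting hazard is easy to reproduce on a plain list. The
sketch below is a toy model with opname strings standing in for
instructions; it is not the optimizer's real representation.

```python
instrs = ["LOAD_CONST", "TO_BOOL", "POP_JUMP_IF_FALSE", "RETURN_VALUE"]
ret_index = instrs.index("RETURN_VALUE")  # index recorded before folding

# Variant 1: remove() shifts everything after the removed slot, so the
# recorded index no longer points at RETURN_VALUE.
removed = list(instrs)
removed.remove("TO_BOOL")
stale_ok = ret_index < len(removed) and removed[ret_index] == "RETURN_VALUE"

# Variant 2: NOP replacement keeps every recorded index valid.
nopped = list(instrs)
nopped[nopped.index("TO_BOOL")] = "NOP"
stable_ok = nopped[ret_index] == "RETURN_VALUE"

# A later remove_nops-style pass strips the placeholders in one sweep.
cleaned = [op for op in nopped if op != "NOP"]
```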
* Revert peephole_optimize to use remove() for chaining support
NOP replacement broke chaining of peephole optimizations (e.g.
LOAD_CONST+TO_BOOL then LOAD_CONST+UNARY_NOT for 'not True').
The remove() approach is used by upstream and works correctly here;
fold_unary_negative keeps NOP replacement since it doesn't need chaining.
* Fix StoreFastStoreFast to handle NULL from LoadFastAndClear
StoreFast uses pop_value_opt() to allow NULL values from
LoadFastAndClear in inlined comprehension cleanup paths.
StoreFastStoreFast must do the same, otherwise the peephole
optimizer's fusion of two StoreFast instructions panics when
restoring unbound locals after an inlined comprehension.
* Emit TO_BOOL before conditional jumps, fix class/module prologue
- Emit TO_BOOL before POP_JUMP_IF_TRUE/FALSE in the general case
of compile_jump_if (Compare expressions excluded since they
already produce a bool)
- Module-level __doc__: use STORE_NAME instead of STORE_GLOBAL
- Class body __module__: use LOAD_NAME instead of LOAD_GLOBAL
- Class body: store __firstlineno__ before __doc__
* Emit MAKE_CELL and COPY_FREE_VARS before RESUME
Emit MAKE_CELL for each cell variable and COPY_FREE_VARS N for
free variables at the start of each code object, before RESUME.
These instructions are no-ops in the VM but align the bytecode
with CPython 3.14's output.
* Emit __static_attributes__ at end of class bodies
Store a tuple of attribute names (currently always empty) as
__static_attributes__ in the class namespace, matching CPython
3.14's class body epilogue. Attribute name collection from
self.xxx accesses is a follow-up task.
* Remove expectedFailure from DictProxyTests iter tests
test_iter_keys, test_iter_values, test_iter_items now pass
because class bodies emit __static_attributes__ and
__firstlineno__, matching the expected dict key set.
* Use 1-based stack indexing for LIST_EXTEND, SET_UPDATE, etc.
Switch LIST_APPEND, LIST_EXTEND, SET_ADD, SET_UPDATE, MAP_ADD
from 0-based to 1-based stack depth argument, matching CPython's
PEEK(oparg) convention. Adjust the VM to subtract 1 before
calling nth_value.
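The relationship between the two conventions is a one-line offset. The
sketch below models the stack as a Python list with the top at the
end; `peek` and the sample stack contents are illustrative, and only
the "subtract 1 before calling nth_value" step comes from the text.

```python
def nth_value(stack, depth):
    """0-based: depth 0 is the top of the stack."""
    return stack[-1 - depth]


def peek(stack, oparg):
    """1-based, PEEK(oparg) style: oparg 1 is the top of the stack."""
    return nth_value(stack, oparg - 1)


stack = ["list_obj", "iter_obj", "item"]
top = peek(stack, 1)
bottom = peek(stack, 3)
```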
* Use plain LOAD_ATTR + PUSH_NULL for calls on imported names
When the call target is an attribute of an imported name (e.g.,
logging.getLogger()), use plain LOAD_ATTR (method_flag=0) with
a separate PUSH_NULL instead of method-mode LOAD_ATTR. This
matches CPython 3.14's behavior which avoids the method call
optimization for module attribute access.
* Duplicate return-None epilogue for fall-through blocks
When the last block in a code object is exactly LOAD_CONST None +
RETURN_VALUE (the implicit return), duplicate these instructions
into blocks that would otherwise fall through to it. This matches
CPython 3.14's behavior of giving each code path its own explicit
return instruction.
* Run cargo fmt on ir.rs
* Remove expectedFailure from test_intrinsic_1 in test_dis
* Emit TO_BOOL before conditional jumps for all expressions including Compare
* Add __classdict__ cell for classes with function definitions
Set needs_classdict=true for class scopes that contain function
definitions (def/async def), matching CPython 3.14's behavior for
PEP 649 deferred annotation support. Also restore the Compare
expression check in compile_jump_if to skip TO_BOOL for comparison
operations.
* Emit __classdictcell__ store in class body epilogue
Store the __classdict__ cell reference as __classdictcell__ in
the class namespace when the class has __classdict__ as a cell
variable. Uses LOAD_DEREF (RustPython separates cell vars from
fast locals unlike CPython's unified array).
* Always run DCE to remove dead code after terminal instructions
Run basic dead code elimination (truncating instructions after
RETURN_VALUE/RAISE/JUMP within blocks) at all optimization
levels, not just optimize > 0. CPython always removes this dead
code during assembly.
* Restrict LOAD_ATTR plain mode to module/class scope imports
Only use plain LOAD_ATTR + PUSH_NULL for imports at module or
class scope. Function-local imports use method call mode LOAD_ATTR,
matching CPython 3.14's behavior.
* Eliminate unreachable blocks after jump normalization
Split DCE into two phases: (1) within-block truncation after
terminal instructions (always runs), (2) whole-block elimination
for blocks only reachable via fall-through from terminal blocks
(runs after normalize_jumps when dead jump instructions exist).
* Fold BUILD_TUPLE 0 into LOAD_CONST empty tuple
Convert BUILD_TUPLE with size 0 to LOAD_CONST () during constant
folding, matching CPython's optimization for empty tuple literals.
* Handle __classcell__ and __classdictcell__ in type.__new__
- Remove __classcell__ from class dict after setting the cell value
- Add __classdictcell__ handling: set cell to class namespace dict,
then remove from class dict
- Register __classdictcell__ identifier
- Use LoadClosure instead of LoadDeref for __classdictcell__ emission
- Reorder MakeFunctionFlag bits to match CPython
- Run ruff format on scripts
* Revert __classdict__ cell and __classdictcell__ changes
The __classdict__ cell addition (for classes with function defs)
and __classdictcell__ store caused cell initialization failures
in importlib. These require deeper VM changes to properly support
the cell variable lifecycle. Reverted for stability.
* Fix unreachable block elimination with fixpoint reachability
Use fixpoint iteration to properly determine block reachability:
only mark jump targets of already-reachable blocks, preventing
orphaned blocks from falsely marking their targets as reachable.
Also add a final DCE pass after assembly NOP removal to catch
dead code created by normalize_jumps.
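A worklist formulation of the fixpoint is sketched below. The
successor map is a hypothetical stand-in for the block graph; the
point it illustrates is that an orphaned block never gets to mark its
own targets as reachable.

```python
def reachable_blocks(successors, entry=0):
    """Worklist fixpoint: only reachable blocks may mark their targets."""
    seen = {entry}
    work = [entry]
    while work:
        block = work.pop()
        for target in successors.get(block, ()):
            if target not in seen:
                seen.add(target)
                work.append(target)
    return seen


# Block 2 is orphaned; its jump to block 3 must not keep block 3 alive.
successors = {0: [1], 1: [], 2: [3], 3: []}
live = reachable_blocks(successors)
```

A single scan that marks every jump target would instead keep block 3,
which is the false positive described above.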
* Check enclosing scopes for IMPORTED flag in LOAD_ATTR mode
When deciding whether to use plain LOAD_ATTR for attribute calls,
check if the name is imported in any enclosing scope (not just
the current scope). This handles the common pattern where a module
is imported at module level but used inside functions.
* Add __classdict__ cell for classes with function definitions
Set needs_classdict=true when a class scope contains function
definitions (def/async def), matching CPython 3.14 which always
creates a __classdict__ cell for PEP 649 support in such classes.
* Store __classdictcell__ in class body epilogue
Store the __classdict__ cell reference as __classdictcell__ in
the class namespace using LoadClosure (which loads the cell
object itself, not the value inside). This matches CPython 3.14's
class body epilogue.
* Fix clippy collapsible_if warnings and cargo fmt
* Revert __classdict__ and __classdictcell__ changes (cause import failures)
* Revert type.__new__ __classcell__ removal and __classdictcell__ handling
Revert the class cell cleanup changes from e6975f973 that cause
import failures when frozen module bytecode is stale. The original
behavior (not removing __classcell__ from class dict) is restored.
* Re-add __classdict__ cell and __classdictcell__ store
Restore the __classdict__ cell for classes with function
definitions and __classdictcell__ store in class body epilogue.
Previous failure was caused by stale .pyc cache files containing
bytecode from an intermediate MakeFunctionFlag reorder attempt,
not by these changes themselves.
* Reorder MakeFunctionFlag to match CPython's SET_FUNCTION_ATTRIBUTE
Reorder discriminants: Defaults=0, KwOnlyDefaults=1, Annotations=2,
Closure=3, Annotate=4, TypeParams=5. This aligns the oparg values
with CPython 3.14's convention.
Note: after this change, stale .pyc cache files must be deleted
(find . -name '*.pyc' -delete) to avoid bytecode mismatch errors.
* Use CPython-compatible power-of-two encoding for SET_FUNCTION_ATTRIBUTE
Override From/TryFrom for MakeFunctionFlag to use power-of-two
values (1,2,4,8,16,32) matching CPython's SET_FUNCTION_ATTRIBUTE
oparg encoding, instead of sequential discriminants (0,1,2,3,4,5).
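The mapping between the two encodings can be sketched in Python as
follows. The enum member names and the `to_oparg`/`from_oparg` helpers
are illustrative; the discriminant order and the power-of-two values
are taken from the commit text.

```python
from enum import IntEnum


class MakeFunctionFlag(IntEnum):
    # Sequential discriminants, in the order listed in the text above.
    DEFAULTS = 0
    KW_ONLY_DEFAULTS = 1
    ANNOTATIONS = 2
    CLOSURE = 3
    ANNOTATE = 4
    TYPE_PARAMS = 5


def to_oparg(flag):
    """Encode a flag as its power-of-two oparg value."""
    return 1 << flag


def from_oparg(oparg):
    """Decode an oparg; reject anything that is not one known bit."""
    if oparg <= 0 or oparg & (oparg - 1):
        raise ValueError(f"not a single flag bit: {oparg}")
    return MakeFunctionFlag(oparg.bit_length() - 1)


encoded = [to_oparg(f) for f in MakeFunctionFlag]
```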
* Remove expectedFailure from test_elim_jump_after_return1 and test_no_jump_over_return_out_of_finally_block
* Remove __classcell__ and __classdictcell__ from class dict in type.__new__
* Remove expectedFailure from test___classcell___expected_behaviour, cargo fmt
* Handle MakeCell and CopyFreeVars as no-ops in JIT
These prologue instructions are handled at frame creation time
by the VM. The JIT operates on already-initialized frames, so
these can be safely skipped during compilation.
* Remove expectedFailure from test_load_fast_known_simple
* Restore expectedFailure for test_load_fast_known_simple
The test expects LOAD_FAST_BORROW_LOAD_FAST_BORROW superinstruction
which RustPython does not emit yet.
When an inlined comprehension's first iterator expression contains
nested scopes (such as a lambda), those scopes' sub_tables appear at the
current position in the parent's sub_table list. The previous code
spliced the comprehension's own child sub_tables (e.g. inner inlined
comprehensions) into that same position before compiling the iterator,
which shifted the iterator's sub_tables to wrong indices.
Move the splice after the first iterator is compiled so its sub_tables
are consumed at their original positions.
Fixes nested list comprehensions like:
```python
[[x for _, x in g] for _, g in itertools.groupby(..., lambda x: ...)]
```
Disclosure: I used AI to develop the patch though I was heavily
involved.
* Enable PEP 709 inlined comprehensions for function-like scopes
Activate the existing compile_inlined_comprehension() implementation
by fixing 6 bugs that prevented it from working:
- LoadFastAndClear: push NULL (not None) when slot is empty so
StoreFast can restore empty state after comprehension
- StoreFast: accept NULL from stack for the restore path
- sub_tables.remove(0) replaced with next_sub_table cursor to
match the pattern used elsewhere in the compiler
- in_inlined_comp flag moved from non-inlined to inlined path
- is_inlined_comprehension_context() now checks comp_inlined flag
and restricts inlining to function-like scopes
- comp_inlined set only when parent scope uses fastlocals
Symbol table analysis handles conflict detection:
- Nested scopes in comprehension → skip inlining
- Bound name conflicts with parent symbol → skip inlining
- Cross-comprehension reference conflicts → skip inlining
- Splice comprehension sub_tables into parent for nested scope tracking
* Add localspluskinds, unify DEREF to localsplus index
- Add CO_FAST_LOCAL/CELL/FREE/HIDDEN constants and
localspluskinds field to CodeObject for per-slot metadata
- Change DEREF instruction opargs from cell-relative indices
(NameIdx) to localsplus absolute indices (oparg::VarNum)
- Add fixup_deref_opargs pass in ir.rs to convert cell-relative
indices to localsplus indices after finalization
- Replace get_cell_name with get_localsplus_name in
InstrDisplayContext trait
- Update VM cell_ref/get_cell_contents/set_cell_contents to use
localsplus indices directly (no nlocals offset)
- Update function.rs cell2arg, super.rs __class__ lookup with
explicit nlocals offsets
* Fix clippy warnings, formatting, restore _opcode_metadata.py
Fix cast_possible_truncation, nonminimal_bool, collapsible_if,
manual_contains clippy lints. Restore _opcode_metadata.py to
upstream/main version (3.14 aligned).
Pre-copy closure cells in Frame::new for coroutine locals().
Handle raw values in merged cell slots during inlined comps.
Exclude async comprehensions from inlining path.
* Exclude async/await comprehensions from PEP 709 inlining in symboltable
Async comprehensions and comprehensions with await in the element
expression need their own coroutine scope and cannot be inlined.
The symboltable builder was not checking these conditions, causing
incorrect symbol scope resolution when an async comprehension was
nested inside an inlined comprehension (e.g. [[x async for x in g]
for j in items]).
- Add CO_NESTED flag (0x10) for nested function scopes
- Emit LOAD_SMALL_INT for integers 0..=255 instead of LOAD_CONST
- Eliminate dead constant expression statements (no side effects)
- Ensure None in co_consts for functions with no other constants
- Add code.__replace__() for copy.replace() support
- Mark test_co_lnotab and test_invalid_bytecode as expectedFailure
When -O flag removes assert statements, any nested scopes
(generators, comprehensions, lambdas) inside the assert
expression still have symbol tables in the sub_tables list.
Without consuming them, the next_sub_table index gets
misaligned, causing later scopes to use wrong symbol tables.
Walk the skipped assert expression with an AST visitor to
find and consume nested scope symbol tables, keeping the
index aligned with AST traversal order.
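The same idea can be sketched with Python's own `ast` module. This is
an approximation under stated assumptions: the set of scope-creating
node types is a guess at what matters here, the visitor deliberately
does not descend into a scope it has counted (inner scopes would
belong to the child table), and it ignores the detail that a
comprehension's outermost iterator is evaluated in the enclosing
scope.

```python
import ast


class ScopeCounter(ast.NodeVisitor):
    """Count scopes directly nested in the visited expression."""

    def __init__(self):
        self.count = 0

    def _scope(self, node):
        # Count the scope but do not descend into it.
        self.count += 1

    visit_Lambda = visit_GeneratorExp = visit_ListComp = _scope
    visit_SetComp = visit_DictComp = _scope


def count_skipped_scopes(assert_stmt):
    """Scopes whose symbol tables a skipped assert must still consume."""
    counter = ScopeCounter()
    for part in (assert_stmt.test, assert_stmt.msg):
        if part is not None:
            counter.visit(part)
    return counter.count


source = "assert all(x > 0 for x in xs), (lambda: [y for y in ys])()"
stmt = ast.parse(source).body[0]
skipped = count_skipped_scopes(stmt)
```

Here the generator expression and the lambda are counted, while the
list comprehension inside the lambda is not.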
* gc: add CollectResult, stats fields, get_referrers, and fix count reset
- Add CollectResult struct with collected/uncollectable/candidates/duration
- Add candidates and duration fields to GcStats and gc.get_stats()
- Pass CollectResult to gc.callbacks info dict
- Reset generation counts for all collected generations (0..=N)
- Return 0 for third value in gc.get_threshold() (3.13+)
- Implement gc.get_referrers() by scanning all tracked objects
- Add DEBUG_COLLECTABLE output for collectable objects
- Update test_gc.py to expect candidates/duration in stats
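For reference, the user-visible behavior of two of these APIs can be
exercised as below. This is a small usage sketch against the standard
`gc` module; `Node`, `target`, and `holder` are illustrative names,
and no particular threshold values are assumed.

```python
import gc


class Node:
    pass


target = Node()
holder = [target]  # a GC-tracked container that refers to target

# get_referrers scans tracked objects for ones that refer to target.
referrers = gc.get_referrers(target)
found = any(r is holder for r in referrers)

# get_threshold returns one threshold per generation.
threshold = gc.get_threshold()
```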
* Update test_gc from v3.14.3
* Update test_gc.py from CPython v3.15.0a5
Taken from v3.15 (not v3.14.3) because get_stats() candidates/duration
fields were added in 3.13+ and the corresponding test assertions only
exist in 3.15.
* Fix gc_state build on wasm32: skip Instant timing
* Add candidates/duration to gc callback info, mark v3.15 test failures
* Fix gc.get_referrers to exclude executing frames, fix Future cancelled exc leak
- get_referrers: skip frame objects on the execution stack, since
they are not GC-tracked in CPython (_PyInterpreterFrame)
- _asyncio Future/Task make_cancelled_error_impl: clear the stored
cancelled exception after returning it, matching the Python
_make_cancelled_error behavior
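The "return it, then clear it" pattern for the stored cancelled exception, sketched in Python. `FutureLike` is a hypothetical stand-in written for this example, not the _asyncio implementation; the fallback branches that build a fresh CancelledError are an assumption about the surrounding logic.

```python
import asyncio


class FutureLike:
    """Hypothetical model of the cancelled-exception bookkeeping."""

    def __init__(self):
        self._cancelled_exc = None
        self._cancel_message = None

    def make_cancelled_error(self):
        if self._cancelled_exc is not None:
            exc = self._cancelled_exc
            # Drop the stored reference so the future does not keep the
            # exception (and its traceback) alive after handing it out.
            self._cancelled_exc = None
            return exc
        if self._cancel_message is None:
            return asyncio.CancelledError()
        return asyncio.CancelledError(self._cancel_message)
```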
* Fix gc.get_threshold to return actual gen2 threshold value
* Fix inconsistent GC count reset in early-return paths
Use the same reset_end formula in unreachable-empty early returns
as in the main collection path and collecting-empty path.
* Accept keyword arguments in socket.__init__
Use a FromArgs struct instead of a positional-only tuple so that
family, type, proto, and fileno can be passed as keyword arguments.
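The call shape this enables, as a short sketch. It uses the public `socket.socket` class; whether the keywords reach the changed initializer directly depends on how the Python-level wrapper forwards them, which is an assumption here.

```python
import socket

# All parameters passed by keyword rather than by position.
sock = socket.socket(family=socket.AF_INET, type=socket.SOCK_STREAM, proto=0)
try:
    info = (sock.family, sock.type, sock.proto)
finally:
    sock.close()
```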
* Disable comp_inlined in symbol table to match compiler
The compiler does not yet implement PEP 709 inlined comprehensions
(is_inlined_comprehension_context always returns false), but the
symbol table was marking comprehensions as inlined. This mismatch
could cause comprehension-local symbols to be merged into the parent
scope while the compiler still looks them up in a separate scope.
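The user-visible rule that both passes have to agree on can be sketched as follows: the comprehension's loop variable must not overwrite a same-named variable in the enclosing scope, whether or not the comprehension is inlined. `demo` is an illustrative name.

```python
def demo():
    x = "outer"
    # The loop variable `x` belongs to the comprehension only.
    squares = [x * x for x in range(4)]
    return x, squares
```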
---------
Co-authored-by: CPython Developers <>
* Fix symbol table sub_table desync for non-simple annotation targets
Non-simple annotations (subscript/attribute/parenthesized targets like
`a[0]: expr`) were scanned in the annotation scope during symbol table
analysis, creating sub_tables for any comprehensions. But codegen only
compiles simple name annotations into __annotate__, so those sub_tables
were never consumed. This caused subsequent simple annotations'
comprehension sub_tables to get the wrong index, resulting in
"the symbol 'X' must be present in the symbol table" errors.
Fix: skip entering annotation scope for non-simple annotations since
they are never compiled into __annotate__.
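A sketch of the pattern behind the failure: a non-simple annotated target whose annotation contains a comprehension, followed by a simple annotation that also contains one. `Holder`, `data`, and `count` are names invented for this example.

```python
class Holder:
    data = [0]
    # Non-simple target: the assignment runs, but no annotation is recorded.
    data[0]: [t for t in (int,)] = 5
    # Simple target: this annotation is the only one that is stored.
    count: [t for t in (int,)][0] = 1


annotations = Holder.__annotations__
```

Only `count` should appear in `annotations`, while `Holder.data` still reflects the assignment.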
* Validate forbidden expressions in non-simple annotation targets
Fix cspell "desynchronize" warning and validate yield/await/named/async
comprehension expressions in non-simple annotations without creating
annotation scopes.
* Restore in_annotation flag before propagating error
* Implement LOAD_ATTR inline caching with adaptive specialization
Add type version counter (tp_version_tag) to PyType with subclass
invalidation cascade. Add cache read/write methods (u16/u32/u64)
to CodeUnits. Implement adaptive specialization in load_attr that
replaces the opcode with specialized variants on first execution:
- LoadAttrMethodNoDict: cached method lookup for slotted types
- LoadAttrMethodWithValues: cached method with dict shadow check
- LoadAttrInstanceValue: direct dict lookup skipping descriptors
Specialized opcodes guard on type_version_tag and deoptimize back
to generic LOAD_ATTR with backoff counter on cache miss.
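The version-tag guard can be modeled in a few lines of Python. This is a simplified sketch under stated assumptions: `TypeModel` and `LoadAttrSite` are hypothetical, the cache lives on an object instead of in CACHE code units, and the backoff counter and subclass invalidation cascade are left out.

```python
import itertools

# 0 is reserved to mean "nothing cached yet".
_versions = itertools.count(1)


class TypeModel:
    def __init__(self, **attrs):
        self.attrs = dict(attrs)
        self.version_tag = next(_versions)

    def set_attr(self, name, value):
        self.attrs[name] = value
        # Any mutation gets a fresh tag, which invalidates cached lookups.
        self.version_tag = next(_versions)


class LoadAttrSite:
    def __init__(self, name):
        self.name = name
        self.cached_version = 0
        self.cached_value = None
        self.hits = 0
        self.misses = 0

    def load(self, type_model):
        # Guard: the cached value is valid only for the tag it was seen with.
        if self.cached_version == type_model.version_tag:
            self.hits += 1
            return self.cached_value
        # Miss: fall back to the generic lookup and refill the cache.
        self.misses += 1
        value = type_model.attrs[self.name]
        self.cached_version = type_model.version_tag
        self.cached_value = value
        return value
```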
* Add BINARY_OP and CALL adaptive specialization
BINARY_OP: Specialize int add/subtract/multiply and float
add/subtract/multiply with type guards and deoptimization.
CALL: Add func_version to PyFunction, specialize simple
function calls (CallPyExactArgs, CallBoundMethodExactArgs)
with invoke_exact_args fast path that skips FuncArgs
allocation and fill_locals_from_args.
* Lazy quickening for adaptive specialization counters
Move counter initialization from compile-time to RESUME execution,
matching CPython's _PyCode_Quicken pattern. Store counter in CACHE
entry's arg byte to preserve op=Instruction::Cache for dis/JIT.
Add PyCode.quickened flag for one-time initialization.
* Add Instruction::deoptimize() and CodeUnits::original_bytes()
- deoptimize() maps specialized opcodes back to their base adaptive variant
- original_bytes() produces deoptimized bytecode with zeroed CACHE entries
- co_code now returns deoptimized bytes, _co_code_adaptive returns current bytes
- Marshal serialization uses original_bytes() instead of raw transmute
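The co_code contract above can be observed from Python with a sketch like this one. `_co_code_adaptive` is a private attribute and may be missing on some interpreters, so it is read with `getattr`; `add` is an illustrative function.

```python
def add(a, b):
    return a + b


before = add.__code__.co_code

# Warm the function up so an adaptive interpreter may specialize it.
for _ in range(1000):
    add(1, 2)

after = add.__code__.co_code
adaptive = getattr(add.__code__, "_co_code_adaptive", None)
```

`before` and `after` should be identical even if the adaptive bytes have been rewritten in place.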
* Fix monitoring and specialization interaction
- cache_entries() returns correct count for instrumented opcodes
- deoptimize() maps instrumented opcodes back to base
- quicken() skips adaptive counter for instrumented opcodes
- instrument_code Phase 3 deoptimizes specialized opcodes and
clears CACHE entries to prevent stale pointer dereferences
* Address review: bounds checks, UB fix, version overflow, error handling
- Add bounds checks to read_cache_u16/u32/u64
- Fix quicken() aliasing UB by using &mut directly
- Add JumpBackwardJit/JumpBackwardNoJit to deoptimize()
- Guard can_specialize_call with NEWLOCALS flag check
- Use compare_exchange_weak for version tag to prevent wraparound
- Propagate dict lookup errors in LoadAttrMethodWithValues
- Apply adaptive backoff on version tag assignment failure
- Remove duplicate imports in frame.rs
* Emit CACHE code units in bytecode to match CPython 3.14
- Add cache_entries() method to Instruction enum
- Emit CACHE code units after opcodes in finalize_code
- Handle NO_LOCATION (line=-1) in linetable for CACHE entries
- Account for CACHE entries in exception table generation
- Skip CACHE entries in VM execution loop (with jump detection)
- Handle CACHE in InstrumentedLine/InstrumentedInstruction/InstrumentedForIter/InstrumentedNotTaken
- Skip CACHE in monitoring instrumentation phases
- Update co_branches() for cache-adjusted offsets
- Restore _cache_format in Lib/opcode.py
- Remove expectedFailure from test_c_call, test_start_offset
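One way to see the emitted CACHE units from Python is to compare the raw code length with the instructions that dis reports. This is a sketch: it assumes 2-byte code units and that `dis.get_instructions()` hides CACHE entries by default; on interpreters without inline caches the difference is simply zero.

```python
import dis


def get_attr(obj):
    return obj.real


code = get_attr.__code__

# Every code unit is one opcode byte plus one argument byte.
code_units = len(code.co_code) // 2

# Instructions dis shows by default, which leaves CACHE entries out.
visible = sum(1 for _ in dis.get_instructions(get_attr))

hidden_cache_units = code_units - visible
```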
* Use relative jump offsets and fix bytecode layout
- Convert jump arguments from absolute to relative offsets
in frame.rs, monitoring.rs, and stack_analysis
- Add jump_relative_forward/backward helpers to ExecutingFrame
- Resolve pseudo jump instructions before offset fixpoint loop
- Emit NOP for break, continue, and pass to match line-tracing behavior
- Fix async for: emit EndAsyncFor with correct target, add NotTaken
- Fix comprehension if-cleanup to use separate block
- Fix super() source range for multi-line calls
- Fix NOP removal to preserve line-marker NOPs
- Fix InstrumentedLine cache skipping after re-dispatch
- Match InstrumentedResume/YieldValue in yield_from_target
- Remove CALL_FUNCTION_EX cache entry from opcode.py
- Remove resolved expectedFailure markers
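The absolute-to-relative conversion can be sketched as below. The function names echo the helpers mentioned above but the signatures are hypothetical, and measuring the offset from the code unit that follows the jump and its CACHE entries is an assumption made for this example.

```python
def to_relative(jump_index, cache_entries, target_index):
    # Offsets are measured from the unit after the jump and its caches.
    base = jump_index + 1 + cache_entries
    if target_index >= base:
        return ("forward", target_index - base)
    return ("backward", base - target_index)


def jump_relative_forward(next_index, delta):
    return next_index + delta


def jump_relative_backward(next_index, delta):
    return next_index - delta


def resolve(jump_index, cache_entries, direction, delta):
    # Inverse of to_relative: recover the absolute target index.
    base = jump_index + 1 + cache_entries
    if direction == "forward":
        return jump_relative_forward(base, delta)
    return jump_relative_backward(base, delta)
```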
* Align CPython 3.14 LOAD_GLOBAL null-bit and RERAISE semantics
* Remove redundant CPython-referencing comments
Clean up comments that unnecessarily mention CPython per project
convention. Replace with concise descriptions of the behavior itself.
- Implement set_f_lineno with stack analysis and deferred unwinding
- Add Frame::set_lasti() for trace callback line jumps
- Implement co_branches() on code objects
- Clear _cache_format in opcode.py (no inline caches)
- Fix getattro slot inheritance: preserve native slot from inherit_slots
- Fix BRANCH_RIGHT src_offset in InstrumentedPopJumpIf*
- Move lasti increment before line event for correct f_lineno
- Prevent RESUME instructions from generating line events
- Defer stack pops via pending_stack_pops/pending_unwind_from_stack
to avoid deadlock with state mutex
- Fix ForIter exhaust target in mark_stacks to skip END_FOR
- Prevent exception handler paths from overwriting normal-flow stacks
- Replace #[cfg(feature = "threading")] duplication with PyAtomic<T>
from rustpython_common::atomic (Radium-based unified API)
- Remove expectedFailure from 31 now-passing jump tests
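The kind of line jump that set_f_lineno supports, sketched with sys.settrace. `jump_demo` and `run_with_jump` are names invented for this example, and line offsets are counted from the `def` line of the traced function.

```python
import sys


def jump_demo(output):
    output.append(1)  # offset 1
    output.append(2)  # offset 2
    output.append(3)  # offset 3


def run_with_jump(func, from_offset, to_offset):
    first = func.__code__.co_firstlineno

    def tracer(frame, event, arg):
        # Only trace the function under test.
        if frame.f_code is not func.__code__:
            return None
        if event == "line" and frame.f_lineno == first + from_offset:
            # Assigning f_lineno inside a line event requests the jump.
            frame.f_lineno = first + to_offset
        return tracer

    output = []
    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(output)
    finally:
        sys.settrace(previous)
    return output
```

Jumping from offset 2 to offset 3 skips the second append.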