Compare commits

83 Commits

Author SHA1 Message Date
copilot-swe-agent[bot]
4efa5da5f6 Simplify unicode regex call sites
Agent-Logs-Url: https://github.com/RustPython/RustPython/sessions/b894057a-9bed-4f35-8400-a5731c63602d

Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>
2026-04-06 02:07:56 +00:00
copilot-swe-agent[bot]
2934897035 Polish unicode follow-up review fixes
Agent-Logs-Url: https://github.com/RustPython/RustPython/sessions/1d30ae08-d8f0-431c-9299-8aea5c21f7d4

Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>
2026-04-05 08:54:41 +00:00
copilot-swe-agent[bot]
0a340de9c3 Finish unicode crate follow-up refactors
Agent-Logs-Url: https://github.com/RustPython/RustPython/sessions/1d30ae08-d8f0-431c-9299-8aea5c21f7d4

Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>
2026-04-05 08:53:23 +00:00
copilot-swe-agent[bot]
5cf1bd6667 Polish unicode regex helpers
Agent-Logs-Url: https://github.com/RustPython/RustPython/sessions/e4115dbc-b1a5-4a77-90a7-38cf9ac1cdf2

Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>
2026-04-05 07:45:20 +00:00
copilot-swe-agent[bot]
e968d83808 Tidy shared unicode review feedback
Agent-Logs-Url: https://github.com/RustPython/RustPython/sessions/e4115dbc-b1a5-4a77-90a7-38cf9ac1cdf2

Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>
2026-04-05 07:44:23 +00:00
copilot-swe-agent[bot]
67485b5b77 Extract shared unicode crate
Agent-Logs-Url: https://github.com/RustPython/RustPython/sessions/e4115dbc-b1a5-4a77-90a7-38cf9ac1cdf2

Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>
2026-04-05 07:39:45 +00:00
copilot-swe-agent[bot]
d3af1c54ec Initial plan 2026-04-05 07:11:29 +00:00
Copilot
eed618d858 Fix str.zfill() width calculation for non-ASCII strings (#7534) 2026-04-04 14:45:30 +09:00
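A short illustration of the behavior this fix targets: `str.zfill()` measures its width in characters (code points), not in encoded bytes. The commit title does not say how the old calculation went wrong, so the byte-length comparison below is only an assumed illustration of why non-ASCII input is the interesting case.

```python
# Width in str.zfill() is counted in code points, not UTF-8 bytes.
text = "caf\u00e9"               # 4 code points, 5 bytes in UTF-8
padded = text.zfill(6)           # pad with zeros up to 6 characters

print(padded)                    # 00café
print(len(text), len(text.encode("utf-8")))  # 4 5

# A leading sign stays in front of the zero padding.
signed = ("-" + text).zfill(7)
print(signed)                    # -00café
```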
Bo Maryniuk
87fc4540c4 Fix VM's infinite recursion crash with musl libc (#7558)
* Fix VM's infinite recursion crash with musl libc

* Lintfix/cleanup warnings
2026-04-04 09:46:46 +09:00
dependabot[bot]
a09afab912 Bump lodash from 4.17.23 to 4.18.1 in /wasm/demo (#7556)
Bumps [lodash](https://github.com/lodash/lodash) from 4.17.23 to 4.18.1.
- [Release notes](https://github.com/lodash/lodash/releases)
- [Commits](https://github.com/lodash/lodash/compare/4.17.23...4.18.1)

---
updated-dependencies:
- dependency-name: lodash
  dependency-version: 4.18.1
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-03 01:44:50 +09:00
Shahar Naveh
3d9688402a Replace unmaintained unic crates (#7555) 2026-04-03 01:43:11 +09:00
Shahar Naveh
b61dfdc534 Update test_optimizer.py from 3.14.3 (#7554) 2026-04-03 01:33:50 +09:00
Shahar Naveh
6d7d74cc0b Update test_opcache.py from 3.14.3 (#7553) 2026-04-03 01:33:34 +09:00
Shahar Naveh
3f49f42702 Update test_module from 3.14.3 (#7552)
* Update `test_module` from 3.14.3

* Unmark passing test
2026-04-03 01:33:19 +09:00
Shahar Naveh
5afa3493a1 Add test_perfmaps.py from 3.14.3 (#7551) 2026-04-03 00:50:25 +09:00
Shahar Naveh
1adda8a73d Update test_file & test_largefile from 3.14.3 (#7550) 2026-04-03 00:49:55 +09:00
Shahar Naveh
344b7a5abd [zizmor] ignore superfluous-actions (#7548) 2026-04-03 00:44:58 +09:00
Jeong, YunWon
d9c4c95369 fix crates (#7549) 2026-03-31 21:05:08 +09:00
Jeong, YunWon
403c2be01d Improve codegen bytecode parity (#7541)
- Add CFG block splitting, jump threading, backward jump normalization
- Add genexpr StopIteration wrapper
- Add ConstantData::Slice and constant slice folding
- Add duplicate_exits_without_lineno and Block: Clone
- Add builtin(genexpr) optimization for tuple/list/set/all/any
- Add compile_try_except_no_finally for try-except without finally
- Add module_name_declared_global_in_nested_scope
- Add constant tuple folding in try_fold_constant_expr
- Add fstring literal-only optimization and empty literal elision
- Fix duplicate_exits_without_lineno: splice new blocks into linked list
2026-03-31 15:45:18 +09:00
Shahar Naveh
5cc9eab2dd Resolve excessive-permissions warning in ci.yaml (#7547)
* Resolve `excessive-permissions` warning in `ci.yaml`

* Update .github/workflows/ci.yaml

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2026-03-31 14:53:08 +09:00
dependabot[bot]
b275a90cf9 Bump j178/prek-action from 2.0.0 to 2.0.1 (#7544)
Bumps [j178/prek-action](https://github.com/j178/prek-action) from 2.0.0 to 2.0.1.
- [Release notes](https://github.com/j178/prek-action/releases)
- [Commits](79f765515b...53276d8b0d)

---
updated-dependencies:
- dependency-name: j178/prek-action
  dependency-version: 2.0.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-30 18:38:01 +02:00
dependabot[bot]
43851c21b9 Bump minimatch and serve in /wasm/demo (#7539)
Bumps [minimatch](https://github.com/isaacs/minimatch) to 3.1.5 and updates ancestor dependency [serve](https://github.com/vercel/serve). These dependencies need to be updated together.


Updates `minimatch` from 3.1.2 to 3.1.5
- [Changelog](https://github.com/isaacs/minimatch/blob/main/changelog.md)
- [Commits](https://github.com/isaacs/minimatch/compare/v3.1.2...v3.1.5)

Updates `serve` from 14.2.5 to 14.2.6
- [Release notes](https://github.com/vercel/serve/releases)
- [Changelog](https://github.com/vercel/serve/blob/main/CHANGELOG.md)
- [Commits](https://github.com/vercel/serve/compare/v14.2.5...v14.2.6)

---
updated-dependencies:
- dependency-name: minimatch
  dependency-version: 3.1.5
  dependency-type: indirect
- dependency-name: serve
  dependency-version: 14.2.6
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-31 01:31:08 +09:00
Shahar Naveh
611b122ed7 Resolve template injection (#7546) 2026-03-31 01:30:41 +09:00
dependabot[bot]
1a4964b741 Bump dtolnay/rust-toolchain (#7545)
Bumps [dtolnay/rust-toolchain](https://github.com/dtolnay/rust-toolchain) from efa25f7f19611383d5b0ccf2d1c8914531636bf9 to 3c5f7ea28cd621ae0bf5283f0e981fb97b8a7af9.
- [Release notes](https://github.com/dtolnay/rust-toolchain/releases)
- [Commits](efa25f7f19...3c5f7ea28c)

---
updated-dependencies:
- dependency-name: dtolnay/rust-toolchain
  dependency-version: 3c5f7ea28cd621ae0bf5283f0e981fb97b8a7af9
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-31 01:30:08 +09:00
dependabot[bot]
106f1c9f37 Bump env_logger from 0.11.9 to 0.11.10 (#7543)
Bumps [env_logger](https://github.com/rust-cli/env_logger) from 0.11.9 to 0.11.10.
- [Release notes](https://github.com/rust-cli/env_logger/releases)
- [Changelog](https://github.com/rust-cli/env_logger/blob/main/CHANGELOG.md)
- [Commits](https://github.com/rust-cli/env_logger/compare/v0.11.9...v0.11.10)

---
updated-dependencies:
- dependency-name: env_logger
  dependency-version: 0.11.10
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-31 01:29:47 +09:00
dependabot[bot]
c45f69977b Bump schannel from 0.1.28 to 0.1.29 (#7542)
Bumps [schannel](https://github.com/steffengy/schannel-rs) from 0.1.28 to 0.1.29.
- [Release notes](https://github.com/steffengy/schannel-rs/releases)
- [Commits](https://github.com/steffengy/schannel-rs/compare/v0.1.28...v0.1.29)

---
updated-dependencies:
- dependency-name: schannel
  dependency-version: 0.1.29
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-31 01:29:35 +09:00
Shahar Naveh
2703f94c3e Split cargo check matrix to individual targets. Avoid cache poisoning (#7540)
* Split check matrix. Prevent cache poisoning

* Use `rustup`

* Change name

* Align cargo args
2026-03-30 23:59:48 +09:00
Shahar Naveh
9900c761ca Fix lint warnings in release.yml (#7538) 2026-03-30 18:53:38 +09:00
Shahar Naveh
959b088d25 Remove oparg builders (#7537) 2026-03-30 18:53:13 +09:00
Jeong, YunWon
1c39fdb7f9 Bytecode parity (#7536)
* Add CFG block splitting, jump threading, backward jump normalization, genexpr StopIteration wrapper

- split_blocks_at_jumps: split blocks at branch points so each has one exit
- jump_threading: thread jumps through single-jump blocks (flowgraph.c jump_thread)
- Backward conditional jump normalization: invert and create NOT_TAKEN+JUMP block
- Follow empty blocks in jump-to-return optimization (next_nonempty_block)
- Add PEP 479 StopIteration handler to compile_comprehension for generators

* Add ConstantData::Slice and constant slice folding

- Add Slice variant to ConstantData and BorrowedConstant
- Fold constant slices (x[:3], x[1:4]) into LOAD_CONST(slice(...))
- Marshal serialization/deserialization for Slice type
- Box::leak in borrow_obj_constant for PySlice roundtrip

* Add duplicate_exits_without_lineno (disabled) and Block: Clone

Prepare infrastructure for exit block duplication optimization.
Currently disabled pending stackdepth integration.

* Improve codegen bytecode parity
2026-03-30 18:50:58 +09:00
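The jump threading step named in the commit above can be sketched roughly as follows. This is a toy model, not RustPython's `jump_threading` or CPython's `jump_thread`: the dictionary-of-blocks layout, the `thread_jumps` name, and the opcode set are all assumptions made for illustration, and the real passes apply more conditions.

```python
def thread_jumps(blocks):
    """Retarget jumps that land on a block containing only an unconditional jump.

    `blocks` maps a label to a list of (opname, arg) tuples; for jump
    instructions the arg is the target label. The data layout and the
    opcode names are hypothetical, chosen for illustration only.
    """
    jump_ops = {"JUMP", "POP_JUMP_IF_FALSE", "POP_JUMP_IF_TRUE"}

    def final_target(label):
        seen = set()
        # Follow chains of jump-only blocks, stopping if we hit a cycle.
        while label not in seen:
            seen.add(label)
            body = blocks[label]
            if len(body) == 1 and body[0][0] == "JUMP":
                label = body[0][1]
            else:
                break
        return label

    return {
        label: [
            (op, final_target(arg)) if op in jump_ops else (op, arg)
            for op, arg in body
        ]
        for label, body in blocks.items()
    }


cfg = {
    "entry": [("LOAD_NAME", "x"), ("POP_JUMP_IF_FALSE", "hop1")],
    "hop1": [("JUMP", "hop2")],
    "hop2": [("JUMP", "exit")],
    "exit": [("RETURN_VALUE", None)],
}
threaded = thread_jumps(cfg)
print(threaded["entry"][-1])  # ('POP_JUMP_IF_FALSE', 'exit')
```

After threading, `hop1` and `hop2` are no longer targeted from `entry`, so a later unreachable-block pass could drop them.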
Jeong, YunWon
3706c5376e Bytecode parity (#7535)
* Add CFG block splitting, jump threading, backward jump normalization, genexpr StopIteration wrapper

- split_blocks_at_jumps: split blocks at branch points so each has one exit
- jump_threading: thread jumps through single-jump blocks (flowgraph.c jump_thread)
- Backward conditional jump normalization: invert and create NOT_TAKEN+JUMP block
- Follow empty blocks in jump-to-return optimization (next_nonempty_block)
- Add PEP 479 StopIteration handler to compile_comprehension for generators

* Add ConstantData::Slice and constant slice folding

- Add Slice variant to ConstantData and BorrowedConstant
- Fold constant slices (x[:3], x[1:4]) into LOAD_CONST(slice(...))
- Marshal serialization/deserialization for Slice type
- Box::leak in borrow_obj_constant for PySlice roundtrip

* Add ConstantData::Frozenset variant (type only, folding deferred)

Add Frozenset to ConstantData, BorrowedConstant, and marshal support.
Actual frozenset folding (BUILD_SET + CONTAINS_OP → LOAD_CONST frozenset)
requires VirtualMachine for element hashing and is deferred.

* Add duplicate_exits_without_lineno (disabled) and Block: Clone

Prepare infrastructure for exit block duplication optimization.
Currently disabled pending stackdepth integration.
2026-03-30 12:52:04 +09:00
Huy Vũ (Josh)
e6bcd64066 Validate SyntaxError details tuple shape (#7533) 2026-03-29 22:37:08 +09:00
Jeong, YunWon
2ebd7026e4 Compiler parity: docstring dedent, StopIteration wrapper, constant folding (#7530) 2026-03-29 22:36:20 +09:00
Jeong, YunWon
6826557884 ruff from fork (#7532) 2026-03-29 19:26:24 +09:00
Jeong, YunWon
1f6b4c6bf1 ruff 0.15.8 (#7531) 2026-03-29 15:21:09 +09:00
Copilot
902985def7 Fix inspect.getsource returning truncated source for multi-line function definitions (#7519)
* Initial plan

* fix: restore def-line source range before entering function scope so co_firstlineno is correct

Agent-Logs-Url: https://github.com/RustPython/RustPython/sessions/94701403-2011-4525-88f1-6e06891da6a4

Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>

* fix: remove pre-existing expectedFailure decorators from test_gettext plural form tests

Agent-Logs-Url: https://github.com/RustPython/RustPython/sessions/ce27bf53-569f-45a0-ad5a-08e8f322c717

Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>

* remove extra_tests/snippets/inspect_getsource.py (covered by test_inspect)

Agent-Logs-Url: https://github.com/RustPython/RustPython/sessions/2b64da1b-8aab-4fec-8b28-3a21d46ac2f9

Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>
2026-03-29 00:52:34 +09:00
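A minimal check of the property this fix restores, assuming (as in CPython) that `inspect.getsource` locates a function through its code object's `co_firstlineno`. The source string and the `<sketch>` filename are made up for the example.

```python
# A function whose signature spans several lines.
source = (
    "def add(\n"
    "    a,\n"
    "    b,\n"
    "):\n"
    "    return a + b\n"
)

namespace = {}
exec(compile(source, "<sketch>", "exec"), namespace)
code = namespace["add"].__code__

# The code object should point at the `def` line (line 1), not at the
# first line of the body (line 5).
print(code.co_firstlineno)  # 1
```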
dependabot[bot]
90c5464901 Bump serialize-javascript and terser-webpack-plugin in /wasm/demo (#7523)
Removes [serialize-javascript](https://github.com/yahoo/serialize-javascript). It's no longer used after updating ancestor dependency [terser-webpack-plugin](https://github.com/webpack/terser-webpack-plugin). These dependencies need to be updated together.


Removes `serialize-javascript`

Updates `terser-webpack-plugin` from 5.3.16 to 5.4.0
- [Release notes](https://github.com/webpack/terser-webpack-plugin/releases)
- [Changelog](https://github.com/webpack/terser-webpack-plugin/blob/main/CHANGELOG.md)
- [Commits](https://github.com/webpack/terser-webpack-plugin/compare/v5.3.16...v5.4.0)

---
updated-dependencies:
- dependency-name: serialize-javascript
  dependency-version: 
  dependency-type: indirect
- dependency-name: terser-webpack-plugin
  dependency-version: 5.4.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-29 00:52:00 +09:00
Huy Vũ (Josh)
da440dbbbe type.__new__: preserve caller namespace when reading __qualname__ (#7524)
* type.__new__: preserve caller namespace when reading __qualname__
2026-03-29 00:51:49 +09:00
dependabot[bot]
1a9b10ece5 Bump winresource from 0.1.30 to 0.1.31 (#7522)
Bumps [winresource](https://github.com/BenjaminRi/winresource) from 0.1.30 to 0.1.31.
- [Commits](https://github.com/BenjaminRi/winresource/commits)

---
updated-dependencies:
- dependency-name: winresource
  dependency-version: 0.1.31
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-28 14:47:55 +09:00
dependabot[bot]
dd632363c8 Bump the wasmtime group with 3 updates (#7521)
Bumps the wasmtime group with 3 updates: [cranelift](https://github.com/bytecodealliance/wasmtime), [cranelift-jit](https://github.com/bytecodealliance/wasmtime) and [cranelift-module](https://github.com/bytecodealliance/wasmtime).


Updates `cranelift` from 0.129.1 to 0.130.0
- [Release notes](https://github.com/bytecodealliance/wasmtime/releases)
- [Changelog](https://github.com/bytecodealliance/wasmtime/blob/main/RELEASES.md)
- [Commits](https://github.com/bytecodealliance/wasmtime/commits)

Updates `cranelift-jit` from 0.129.1 to 0.130.0
- [Release notes](https://github.com/bytecodealliance/wasmtime/releases)
- [Changelog](https://github.com/bytecodealliance/wasmtime/blob/main/RELEASES.md)
- [Commits](https://github.com/bytecodealliance/wasmtime/commits)

Updates `cranelift-module` from 0.129.1 to 0.130.0
- [Release notes](https://github.com/bytecodealliance/wasmtime/releases)
- [Changelog](https://github.com/bytecodealliance/wasmtime/blob/main/RELEASES.md)
- [Commits](https://github.com/bytecodealliance/wasmtime/commits)

---
updated-dependencies:
- dependency-name: cranelift
  dependency-version: 0.130.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: wasmtime
- dependency-name: cranelift-jit
  dependency-version: 0.130.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: wasmtime
- dependency-name: cranelift-module
  dependency-version: 0.130.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: wasmtime
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-28 14:47:42 +09:00
Jeong, YunWon
f7556b00c1 Bytecode parity (#7514)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-28 09:19:12 +09:00
Jeong, YunWon
3dae07cd60 winapi functions (#7516) 2026-03-28 00:02:49 +09:00
Jeong, YunWon
fddd7cb690 fix cron-ci (#7483)
* fix cron-ci

* fix custom_text_test_runner
2026-03-27 23:55:30 +09:00
Shahar Naveh
410721740d Oparg resume depth (#7515)
* Base resume context

* Fixes for api change

* Align codegen

* Align `frame.rs` to the api changes

* fix jit

* Use new oparg

* Fix doc

* Let `ir` decide exception depth
2026-03-27 21:47:52 +09:00
dependabot[bot]
e3ac1bf8dc Bump node-forge from 1.3.2 to 1.4.0 in /wasm/demo (#7513)
Bumps [node-forge](https://github.com/digitalbazaar/forge) from 1.3.2 to 1.4.0.
- [Changelog](https://github.com/digitalbazaar/forge/blob/main/CHANGELOG.md)
- [Commits](https://github.com/digitalbazaar/forge/compare/v1.3.2...v1.4.0)

---
updated-dependencies:
- dependency-name: node-forge
  dependency-version: 1.4.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-27 12:44:55 +09:00
Jeong, YunWon
3a8fb76014 Bytecode parity (#7507)
* Bytecode parity phase 3

Compiler changes:
- Emit TO_BOOL in and/or short-circuit evaluation (COPY+TO_BOOL+JUMP)
- Add module-level __conditional_annotations__ cell (PEP 649)
- Only set conditional annotations for AnnAssign, not function params
- Skip __classdict__ cell when future annotations are active
- Convert list literals to tuples in for-loop iterables
- Fix cell variable ordering: parameters first, then alphabetical
- Fix RESUME DEPTH1 flag for yield-from/await
- Don't propagate __classdict__/__conditional_annotations__ freevar
  through regular functions — only annotation/type-param scopes
- Inline string compilation path

* Skip test_thread_safety in _test_multiprocessing

SIGSEGV in _finalizer_registry dict access under aggressive GC
and thread switching. Root cause is dict thread-safety in VM.

* Skip list→tuple optimization for async for; propagate future_annotations to nested scopes
2026-03-27 12:42:29 +09:00
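The cell variable ordering rule from the commit above ("parameters first, then alphabetical") can be sketched as a small pure function. The function and argument names are hypothetical, and taking parameters in definition order follows the wording of the related #7504 commit message rather than RustPython's source.

```python
def order_cell_vars(params, cell_vars):
    """Order cell variables: parameter cells first, then the rest sorted.

    `params` is the parameter list in definition order and `cell_vars` is
    the collection of names captured by inner scopes. Both names are
    hypothetical; this is not RustPython's actual function.
    """
    captured = set(cell_vars)
    # Parameter cells keep the order in which the parameters were defined.
    param_cells = [name for name in params if name in captured]
    # Everything else is sorted alphabetically.
    other_cells = sorted(captured - set(params))
    return param_cells + other_cells


# `zeta` and `alpha` are captured parameters; `beta` and `mid` are
# captured locals; `unused` is a parameter that no closure captures.
ordered = order_cell_vars(
    ["zeta", "alpha", "unused"], {"mid", "alpha", "beta", "zeta"}
)
print(ordered)  # ['zeta', 'alpha', 'beta', 'mid']
```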
Jeong, YunWon
a91127c91a Reorder PyNumberBinaryOp to match NB_* constants (#7512)
Align variant ordering with BinaryOperator enum and
CPython's NB_* constants from opcode.h. Divmod is placed
last as it has no corresponding NB_* constant.
2026-03-27 12:41:02 +09:00
Jeong, YunWon
af0c2526a7 Fix GC TOCTOU race in collect_inner referent traversal (#7511)
Pre-compute referent pointers once per object in step 3 and reuse
them in step 4 (BFS reachability). Previously, gc_get_referent_ptrs()
was called independently in both steps. If a dict's write lock state
changed between the two calls (e.g., held by another thread during
one traversal but not the other), the two traversals could return
different results. This caused live objects to be incorrectly
classified as unreachable and cleared by GC.
2026-03-27 12:39:45 +09:00
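The idea behind the GC fix above, computing each object's referents once and reusing that snapshot in both passes, can be sketched as follows. This is a simplified model under stated assumptions: the `find_unreachable` name, the `refcounts` mapping, and the string object IDs are hypothetical, and RustPython's `collect_inner` has more steps than the two shown.

```python
from collections import deque


def find_unreachable(refcounts, get_referents):
    """Return tracked objects not reachable from outside the tracked set.

    `refcounts` maps each tracked object to its total reference count and
    `get_referents(obj)` lists what it points at. Names are hypothetical.
    """
    # Take one snapshot per object and reuse it in both passes below.
    snapshot = {obj: tuple(get_referents(obj)) for obj in refcounts}

    # Pass 1: subtract references that come from inside the tracked set.
    external = dict(refcounts)
    for refs in snapshot.values():
        for target in refs:
            if target in external:
                external[target] -= 1

    # Pass 2: BFS from objects that still have outside references.
    roots = [obj for obj, count in external.items() if count > 0]
    reachable = set(roots)
    queue = deque(roots)
    while queue:
        for target in snapshot[queue.popleft()]:
            if target in external and target not in reachable:
                reachable.add(target)
                queue.append(target)
    return set(refcounts) - reachable


calls = {}


def flaky_referents(obj):
    # Simulates a container whose contents are visible only on the first
    # look, e.g. because another thread holds its lock later on.
    calls[obj] = calls.get(obj, 0) + 1
    graph = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"]}
    return graph[obj] if calls[obj] == 1 else []


# "a" has one outside reference and keeps "b" alive; "c" and "d" form
# an isolated cycle.
garbage = find_unreachable({"a": 1, "b": 1, "c": 1, "d": 1}, flaky_referents)
print(sorted(garbage))  # ['c', 'd']
```

If the second pass called `flaky_referents` again instead of reading the snapshot, `"a"` would appear to reference nothing and the live object `"b"` would be classified as unreachable, which is the kind of misclassification the commit message describes.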
dependabot[bot]
f42ffd61a1 Bump strum from 0.27.2 to 0.28.0 (#7510)
Bumps [strum](https://github.com/Peternator7/strum) from 0.27.2 to 0.28.0.
- [Release notes](https://github.com/Peternator7/strum/releases)
- [Changelog](https://github.com/Peternator7/strum/blob/master/CHANGELOG.md)
- [Commits](https://github.com/Peternator7/strum/compare/v0.27.2...v0.28.0)

---
updated-dependencies:
- dependency-name: strum
  dependency-version: 0.28.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-27 11:27:19 +09:00
dependabot[bot]
3f92c3ad1c Bump aws-lc-rs from 1.16.0 to 1.16.2 (#7509)
Bumps [aws-lc-rs](https://github.com/aws/aws-lc-rs) from 1.16.0 to 1.16.2.
- [Release notes](https://github.com/aws/aws-lc-rs/releases)
- [Commits](https://github.com/aws/aws-lc-rs/compare/v1.16.0...v1.16.2)

---
updated-dependencies:
- dependency-name: aws-lc-rs
  dependency-version: 1.16.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-27 10:55:12 +09:00
Shahar Naveh
9282a870db Unify lint CI job (#7505)
* use `prek` for unified linting

* Fix actionlint error

* Generate metadata when specific files change

* `check_redundant_patches.py` to accept glob path

* Test

* revert defective changes

* use `rustfmt` over `cargo fmt` for individual files

* debug reviewdog

* rustfmt

* Move comment to correct location

* defective fmt test

* Fail with reviewdog

* fix reviewdog perms

* Try to use present token

* without checks perms

* put normal perms

* fmt
2026-03-26 21:43:34 +09:00
Shahar Naveh
7a6dbd6624 Align concurrency CI groups names (#7508) 2026-03-26 14:33:14 +09:00
dependabot[bot]
6c3dd2885d Bump picomatch from 2.3.1 to 2.3.2 in /wasm/demo (#7506)
Bumps [picomatch](https://github.com/micromatch/picomatch) from 2.3.1 to 2.3.2.
- [Release notes](https://github.com/micromatch/picomatch/releases)
- [Changelog](https://github.com/micromatch/picomatch/blob/master/CHANGELOG.md)
- [Commits](https://github.com/micromatch/picomatch/compare/2.3.1...2.3.2)

---
updated-dependencies:
- dependency-name: picomatch
  dependency-version: 2.3.2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-26 12:51:49 +09:00
Jeong, YunWon
c9cfb3d606 Bytecode parity (#7504)
* Match CPython LOAD_SPECIAL stack semantics for with/async-with

LOAD_SPECIAL now pushes (callable, self_or_null) matching CPython's
CALL convention, instead of a single bound method:
- Function descriptors: push (func, self)
- Plain attributes: push (bound, NULL)

Updated all with-statement paths:
- Entry: add SWAP 3 after SWAP 2, remove PUSH_NULL before CALL 0
- Normal exit: remove PUSH_NULL before CALL 3
- Exception handler (WITH_EXCEPT_START): read exit_func at TOS-4
  and self_or_null at TOS-3
- Suppress block: 3 POP_TOPs after POP_EXCEPT (was 2)
- FBlock exit (preserve_tos): SWAP 3 + SWAP 2 rotation
- UnwindAction::With: remove PUSH_NULL

Stack effects updated: LoadSpecial (2,1), WithExceptStart (7,6)

* Normalize LOAD_FAST_CHECK and JUMP_BACKWARD_NO_INTERRUPT

Add LOAD_FAST_CHECK → LOAD_FAST and JUMP_BACKWARD_NO_INTERRUPT →
JUMP_BACKWARD to opname normalization in dis_dump.py. These are
optimization variants with identical semantics.

* Add EXTENDED_ARG to SKIP_OPS, normalize LOAD_FAST_CHECK and JUMP_BACKWARD_NO_INTERRUPT

* Remove duplicate return-None when block already has return

Skip duplicate_end_returns for blocks that already end with
LOAD_CONST + RETURN_VALUE. Run DCE + unreachable elimination
after duplication to remove the now-unreachable original return
block.

* Improve __static_attributes__ collection accuracy

- Support tuple/list unpacking targets: (self.x, self.y) = val
- Skip @staticmethod and @classmethod decorated methods
- Use scan_target_for_attrs helper for recursive target scanning

* Use method mode for function-local import attribute calls

Function-local imports (scope is Local+IMPORTED) should use method
mode LOAD_ATTR like regular names, not plain mode. Only module/class
scope imports use plain LOAD_ATTR + PUSH_NULL.

* Optimize constant iterable before GET_ITER to LOAD_CONST tuple

Convert BUILD_LIST/SET 0 + LOAD_CONST + LIST_EXTEND/SET_UPDATE + GET_ITER
to just LOAD_CONST (tuple) + GET_ITER, matching CPython's optimization
for constant list/set literals in for-loop iterables.

Also fix is_name_imported to use method mode for function-local imports,
and improve __static_attributes__ accuracy (skip @classmethod/@staticmethod,
handle tuple/list unpacking targets).

* Fix cell variable ordering: parameters first, then alphabetical

CPython orders cell variables with parameter cells first (in
parameter definition order), then non-parameter cells sorted
alphabetically. Previously all cells were sorted alphabetically.

Also add for-loop iterable optimization: constant BUILD_LIST/SET
before GET_ITER is folded to just LOAD_CONST tuple.

* Emit COPY_FREE_VARS before MAKE_CELL matching CPython order

CPython emits COPY_FREE_VARS first, then MAKE_CELL instructions.
Previously RustPython emitted them in reverse order.

* Fix RESUME AfterYield encoding to match CPython 3.14 (value 5)

CPython 3.14 uses RESUME arg=5 for after-yield, not 1.
Also reorder COPY_FREE_VARS before MAKE_CELL and fix cell
variable ordering (parameters first, then alphabetical).

* Address code review feedback from #7481

- Set is_generator flag for generator expressions in scan_comprehension
- Fix posonlyargs priority in collect_static_attributes first param
- Add match statement support to scan_store_attrs
- Fix stale decorator stack comment
- Reorder NOP removal after fold_unary_negative for better collection folding

* Fold constant list/set/tuple literals in compiler

When all elements of a list/set/tuple literal are constants and
there are 3+ elements, fold them into a single constant:
- list: BUILD_LIST 0 + LOAD_CONST (tuple) + LIST_EXTEND 1
- set:  BUILD_SET 0  + LOAD_CONST (tuple) + SET_UPDATE 1
- tuple: LOAD_CONST (tuple)

This matches CPython's compiler optimization and fixes the most
common bytecode difference (92/200 sampled files).

Also add bytecode comparison scripts (dis_dump.py, compare_bytecode.py)
for systematic parity tracking.

* Use BUILD_MAP 0 + MAP_ADD for large dicts (>= 16 pairs)

Match CPython's compiler behavior: dicts with 16+ key-value pairs
use BUILD_MAP 0 followed by MAP_ADD for each pair, instead of
pushing all keys/values on the stack and calling BUILD_MAP N.

* Fix clippy warnings and cargo fmt

* fix surrogate
2026-03-25 22:25:21 +09:00
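The constant list/set/tuple folding shapes listed in the commit above can be written out as a toy emitter. Only the three folded instruction sequences and the 3+ element threshold come from the commit message; the `emit_folded_literal` name and the tuple-based instruction encoding are assumptions, and the sketch deliberately says nothing about shorter literals.

```python
def emit_folded_literal(kind, elements):
    """Return toy instructions for a folded list/set/tuple literal.

    `elements` holds already-evaluated constant values. The function name
    and the (opname, arg) encoding are hypothetical. Returns None when the
    3+ element rule from the commit message does not apply.
    """
    if kind not in ("list", "set", "tuple"):
        raise ValueError(f"unsupported literal kind: {kind}")
    if len(elements) < 3:
        # The commit message does not describe this case.
        return None

    folded = ("LOAD_CONST", tuple(elements))
    if kind == "list":
        return [("BUILD_LIST", 0), folded, ("LIST_EXTEND", 1)]
    if kind == "set":
        return [("BUILD_SET", 0), folded, ("SET_UPDATE", 1)]
    return [folded]


list_code = emit_folded_literal("list", [1, 2, 3])
print(list_code)
# [('BUILD_LIST', 0), ('LOAD_CONST', (1, 2, 3)), ('LIST_EXTEND', 1)]
```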
lif
e1ecb87f32 fix: Flush stdout on shutdown matching CPython behavior (#7503)
* fix: flush stdout on interpreter shutdown matching CPython behavior

When stdout flush fails during shutdown, report the error via
run_unraisable and exit with code 120 (matching CPython's
Py_FinalizeEx). Skip flushing already-closed or None streams.
Stderr flush errors remain silently ignored per CPython behavior.

Fixes #5521

Signed-off-by: majiayu000 <1835304752@qq.com>

* refactor: replace magic number 120 with named constant EXITCODE_FLUSH_FAILURE

Address review feedback on PR #7503: improve readability by extracting
the CPython-compat exit code into a named constant.

Signed-off-by: majiayu000 <1835304752@qq.com>

---------

Signed-off-by: majiayu000 <1835304752@qq.com>
2026-03-25 19:59:16 +09:00
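The shutdown behavior described in the commit above can be sketched in a few lines. `EXITCODE_FLUSH_FAILURE` is the constant name from the commit message; the `flush_std_streams` function, its parameters, and the `report_unraisable` callback are hypothetical stand-ins for the interpreter's internals, and the sketch assumes a flush failure overrides whatever exit code was pending.

```python
EXITCODE_FLUSH_FAILURE = 120  # named constant taken from the commit message


def flush_std_streams(stdout, stderr, exit_code, report_unraisable):
    """Flush the standard streams at shutdown and return the exit code.

    The function name, its parameters and the `report_unraisable` callback
    are hypothetical; they only model the behavior described above.
    """
    def usable(stream):
        # Skip streams that are None or already closed.
        return stream is not None and not getattr(stream, "closed", False)

    if usable(stdout):
        try:
            stdout.flush()
        except Exception as exc:
            report_unraisable(exc)
            exit_code = EXITCODE_FLUSH_FAILURE

    if usable(stderr):
        try:
            stderr.flush()
        except Exception:
            pass  # stderr flush errors are silently ignored

    return exit_code


class BrokenStream:
    closed = False

    def flush(self):
        raise OSError("simulated flush failure")


reported = []
status = flush_std_streams(BrokenStream(), None, 0, reported.append)
print(status)  # 120
```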
Jeong, YunWon
ea5a6cd9c0 Bytecode parity (#7481)
* Bytecode parity

Compiler changes:
- Remove PUSH_NULL from decorator calls, use CALL 0
- Collect __static_attributes__ from self.xxx = patterns
- Sort __static_attributes__ alphabetically
- Move __classdict__ init before __doc__ in class prologue
- Fold unary negative constants
- Fold constant list/set literals (3+ elements)
- Use BUILD_MAP 0 + MAP_ADD for 16+ dict pairs
- Always run peephole optimizer for superinstructions
- Emit RETURN_GENERATOR for generator functions
- Add is_generator flag to SymbolTable

* Fix formatting and collapsible_if clippy warnings in compile.rs

* Fix clippy, fold_unary_negative chaining, and generator line tracing

- Replace irrefutable if-let with let for ExceptHandler
- Remove folded UNARY_NEGATIVE instead of replacing with NOP,
  enabling chained negation folding
- Initialize prev_line to def line for generators/coroutines
  to suppress spurious LINE events from preamble instructions
- Remove expectedFailure markers for now-passing tests

* Fix JIT StoreFastStoreFast, format, and remove expectedFailure markers

- Add StoreFastStoreFast handling in JIT instructions
- Fix cargo fmt in frame.rs
- Remove 11 expectedFailure markers for async jump tests in
  test_sys_settrace that now pass

* Fix peephole optimizer: use NOP replacement instead of remove()

Using remove() shifts instruction indices and corrupts subsequent
references, causing "pop stackref but null found" panics at runtime.
Replace folded/combined instructions with NOP instead, which are
cleaned up by the existing remove_nops pass.

* Revert peephole_optimize to use remove() for chaining support

NOP replacement broke chaining of peephole optimizations (e.g.
LOAD_CONST+TO_BOOL then LOAD_CONST+UNARY_NOT for 'not True').
The remove() approach is used by upstream and works correctly here;
fold_unary_negative keeps NOP replacement since it doesn't need chaining.

* Fix StoreFastStoreFast to handle NULL from LoadFastAndClear

StoreFast uses pop_value_opt() to allow NULL values from
LoadFastAndClear in inlined comprehension cleanup paths.
StoreFastStoreFast must do the same, otherwise the peephole
optimizer's fusion of two StoreFast instructions panics when
restoring unbound locals after an inlined comprehension.
2026-03-25 16:10:19 +09:00
Jeong, YunWon
6b5c5a9e92 Handle EINTR retry in os.write() (PEP 475) (#7482)
* Handle EINTR retry in os.write() (PEP 475)

Add EINTR retry loop to os.write(), matching the existing
pattern in os.read() and os.readinto(). Remove the
expectedFailure marker from test_write in _test_eintr.py.

* Add atomic snapshot for dict/dict_keys in extract_elements

Add fast paths for dict and dict_keys types in
extract_elements_with, matching _list_extend() in CPython
Objects/listobject.c. Each branch takes an atomic snapshot
under a single read lock, preventing race conditions from
concurrent dict mutation without the GIL.

Remove expectedFailure from test_thread_safety.
2026-03-25 14:02:38 +09:00
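The EINTR retry loop that the commit above adds to `os.write()` follows the general PEP 475 pattern, which can be sketched like this. The `write_with_retry` helper and the injected `write` callable are hypothetical, introduced so the loop can be exercised without real signals; this is not RustPython's implementation.

```python
def write_with_retry(write, fd, data):
    """Call `write(fd, data)` again whenever it is interrupted by a signal.

    `write` is passed in so the sketch can run without real signals; in
    practice it would be the raw system-call wrapper. Python maps EINTR
    to InterruptedError.
    """
    while True:
        try:
            return write(fd, data)
        except InterruptedError:
            # PEP 475: retry after EINTR. A real implementation first runs
            # pending signal handlers and stops if one of them raises.
            continue


attempts = []


def interrupted_twice(fd, data):
    # Fails with EINTR on the first two calls, then succeeds.
    attempts.append(fd)
    if len(attempts) <= 2:
        raise InterruptedError("simulated EINTR")
    return len(data)


written = write_with_retry(interrupted_twice, 1, b"hello")
print(written, len(attempts))  # 5 3
```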
Shahar Naveh
211649d148 Pin setup-node action to a commit hash (#7495)
* Pin `setup-node` action to a commit hash

* Don't use cache for release

* Revert changes of `release.yml`
2026-03-25 12:05:19 +09:00
Shahar Naveh
4ebc3112d9 Cleanup release.yml a bit (#7499)
* Cleanup matrix usage. enables jit on macos

* Pin some actions to commit hash

* Disable node cache

* Inline `CARGO_ARGS`

* Add `stdio` and `host_env` features

* Only upload to pages if not running on fork
2026-03-25 12:04:56 +09:00
Shahar Naveh
6db7910ca4 Pin rust-toolchain action to a commit hash (#7500) 2026-03-25 12:04:34 +09:00
Shahar Naveh
8d3bc4cb54 Add test_concurrent_futures to FLAKY_MP_TESTS (#7502) 2026-03-25 12:04:14 +09:00
Shahar Naveh
20c6505bb9 Resolve shellcheck warning on ci.yaml (#7501) 2026-03-25 12:03:25 +09:00
Shahar Naveh
372280ede4 Pin setup-python action to a commit hash (#7494) 2026-03-25 11:58:39 +09:00
dependabot[bot]
82432be962 Bump lz4_flex from 0.12.1 to 0.13.0 (#7497)
Bumps [lz4_flex](https://github.com/pseitz/lz4_flex) from 0.12.1 to 0.13.0.
- [Release notes](https://github.com/pseitz/lz4_flex/releases)
- [Changelog](https://github.com/PSeitz/lz4_flex/blob/main/CHANGELOG.md)
- [Commits](https://github.com/pseitz/lz4_flex/compare/0.12.1...0.13.0)

---
updated-dependencies:
- dependency-name: lz4_flex
  dependency-version: 0.13.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-24 17:51:11 +09:00
psyche
40c84f51c8 Expose rustpython_pylib in rustpython (#7498)
Co-authored-by: EtherealPsyche <EtherealPsyche@users.noreply.github.com>
2026-03-24 17:51:01 +09:00
Shahar Naveh
5408627594 Pin setup-python action to a commit hash (#7492) 2026-03-24 12:53:21 +09:00
dependabot[bot]
fb6520e5cc Bump cargo-bins/cargo-binstall from 1.17.7 to 1.17.8 (#7488)
Bumps [cargo-bins/cargo-binstall](https://github.com/cargo-bins/cargo-binstall) from 1.17.7 to 1.17.8.
- [Release notes](https://github.com/cargo-bins/cargo-binstall/releases)
- [Changelog](https://github.com/cargo-bins/cargo-binstall/blob/main/release-plz.toml)
- [Commits](1800853f25...113a77a4ce)

---
updated-dependencies:
- dependency-name: cargo-bins/cargo-binstall
  dependency-version: 1.17.8
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-24 12:52:39 +09:00
dependabot[bot]
e9b45a1419 Bump actions/cache from 5.0.3 to 5.0.4 (#7487)
Bumps [actions/cache](https://github.com/actions/cache) from 5.0.3 to 5.0.4.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](cdf6c1fa76...668228422a)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-version: 5.0.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-24 12:52:25 +09:00
dependabot[bot]
2acf76bbaf Bump github/gh-aw from 0.58.3 to 0.62.5 (#7486)
Bumps [github/gh-aw](https://github.com/github/gh-aw) from 0.58.3 to 0.62.5.
- [Release notes](https://github.com/github/gh-aw/releases)
- [Changelog](https://github.com/github/gh-aw/blob/main/CHANGELOG.md)
- [Commits](08a903b1fb...48d8fdfddc)

---
updated-dependencies:
- dependency-name: github/gh-aw
  dependency-version: 0.62.5
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-24 12:52:10 +09:00
dependabot[bot]
dc95db7ae3 Bump libsqlite3-sys from 0.36.0 to 0.37.0 (#7485)
Bumps [libsqlite3-sys](https://github.com/rusqlite/rusqlite) from 0.36.0 to 0.37.0.
- [Release notes](https://github.com/rusqlite/rusqlite/releases)
- [Changelog](https://github.com/rusqlite/rusqlite/blob/master/Changelog.md)
- [Commits](https://github.com/rusqlite/rusqlite/compare/v0.36.0...v0.37.0)

---
updated-dependencies:
- dependency-name: libsqlite3-sys
  dependency-version: 0.37.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-24 12:52:02 +09:00
Shahar Naveh
20ae3ccda2 Set dependabot cooldown (#7490)
* Set dependabot cooldown

* Increase default to 7 days
2026-03-24 12:45:30 +09:00
Shahar Naveh
f1d0fc31c5 Adjust permissions of update-doc-db job (#7496) 2026-03-24 12:45:06 +09:00
Shahar Naveh
56c3a37266 Pin setup-python action to a commit hash (#7491) 2026-03-24 12:44:42 +09:00
Jeong, YunWon
8c016157f4 marshal (#7467)
* CPython-compatible marshal format

Unify marshal to a single CPython-compatible format. No separate
"cpython_marshal" reader — one format for frozen modules, .pyc
files, and the Python-level marshal module.

- ComparisonOperator: `(cmp_index << 5) | mask` matching COMPARE_OP
- MakeFunctionFlag: bit-position matching SET_FUNCTION_ATTRIBUTE
- Exception table varint: big-endian (matching Python/assemble.c)
- Linetable varint: little-endian (unchanged)
- Integer: TYPE_INT (i32) / TYPE_LONG (base-2^15 digits)
- Code objects: CPython field order (argcount, posonlyargcount, ...,
  co_localsplusnames, co_localspluskinds, ..., co_exceptiontable)
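The `TYPE_LONG (base-2^15 digits)` item above can be illustrated with a small sketch. It assumes the CPython wire layout (type byte `l`, a signed 32-bit little-endian digit count whose sign carries the sign of the integer, then one unsigned 16-bit little-endian value per 15-bit digit); the helper names are hypothetical and this is not the code from the commit.

```python
import marshal
import struct

MARSHAL_SHIFT = 15                      # bits per marshal digit
MARSHAL_MASK = (1 << MARSHAL_SHIFT) - 1


def long_to_digits(value: int) -> list:
    """Split abs(value) into base-2^15 digits, least significant first."""
    digits = []
    magnitude = abs(value)
    while magnitude:
        digits.append(magnitude & MARSHAL_MASK)
        magnitude >>= MARSHAL_SHIFT
    return digits


def encode_type_long(value: int) -> bytes:
    """Hypothetical encoder for the assumed TYPE_LONG layout."""
    digits = long_to_digits(value)
    count = -len(digits) if value < 0 else len(digits)
    body = b"".join(struct.pack("<H", digit) for digit in digits)
    return b"l" + struct.pack("<i", count) + body


def round_trips(value: int) -> bool:
    """Check the sketch against the host interpreter's marshal reader."""
    return marshal.loads(encode_type_long(value)) == value
```

Values that fit in an `i32` would normally be written as `TYPE_INT` instead; the sketch only covers the long form.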

- FLAG_REF / TYPE_REF for object deduplication (version >= 3)
- allow_code keyword argument on dumps/loads/dump/load
- Subclass rejection (int/float/complex/tuple/list/dict/set/frozenset)
- Slice serialization (version >= 5)
- Buffer protocol fallback for memoryview/array
- Recursion depth limit (2000) for both reads and writes
- Streaming load (reads one object, seeks file position)
- TYPE_INT64, TYPE_FLOAT (text), TYPE_COMPLEX (text) for compat

serialize_code writes co_localsplusnames/co_localspluskinds from
split varnames/cellvars/freevars. deserialize_code splits them back.
Cell variable DEREF indices are translated between flat (wire) and
cell-relative (internal) representations in both directions.
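A rough sketch of the join/split and index translation described above. The `CO_FAST_*` values are the ones used by CPython's headers as far as I recall; the function names are hypothetical, and the ordering rules (locals first, then cell-only names, then free variables, with a local that is also a cell sharing one slot) are an assumption rather than a copy of `serialize_code`.

```python
CO_FAST_HIDDEN = 0x10
CO_FAST_LOCAL = 0x20
CO_FAST_CELL = 0x40
CO_FAST_FREE = 0x80


def join_localsplus(varnames, cellvars, freevars):
    """Merge the split name tuples into (localsplusnames, localspluskinds)."""
    names = list(varnames)
    kinds = [CO_FAST_LOCAL] * len(names)
    for name in cellvars:
        if name in names:
            # A local that is captured by a closure shares its slot.
            kinds[names.index(name)] |= CO_FAST_CELL
        else:
            names.append(name)
            kinds.append(CO_FAST_CELL)
    for name in freevars:
        names.append(name)
        kinds.append(CO_FAST_FREE)
    return tuple(names), bytes(kinds)


def split_localsplus(names, kinds):
    """Recover (varnames, cellvars, freevars) from the merged form."""
    varnames = tuple(n for n, k in zip(names, kinds) if k & CO_FAST_LOCAL)
    cellvars = tuple(n for n, k in zip(names, kinds) if k & CO_FAST_CELL)
    freevars = tuple(n for n, k in zip(names, kinds) if k & CO_FAST_FREE)
    return varnames, cellvars, freevars


def deref_to_flat(index, names, cellvars, freevars):
    """Translate a cell-relative DEREF index into a flat localsplus index."""
    return names.index((tuple(cellvars) + tuple(freevars))[index])
```

With `varnames=("a", "b")`, `cellvars=("b", "c")` and `freevars=("d",)`, the merged names are `("a", "b", "c", "d")` and cell-relative index 0 (`b`) maps to flat index 1.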

Replace bitwise trick with match for new ComparisonOperator values.
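For reference, the `(cmp_index << 5) | mask` layout mentioned earlier in this message can be sketched as below. The operator order and the mask bits (unordered=1, less=2, greater=4, equal=8) are assumptions based on CPython's `COMPARE_OP` conventions; the names are hypothetical, and the bool-conversion bit (16) is ignored here.

```python
CMP_OPS = ("<", "<=", "==", "!=", ">", ">=")

UNORDERED, LESS, GREATER, EQUAL = 1, 2, 4, 8

# Which three-way outcomes make each comparison true (assumed values).
CMP_MASKS = {
    "<": LESS,
    "<=": LESS | EQUAL,
    "==": EQUAL,
    "!=": UNORDERED | LESS | GREATER,
    ">": GREATER,
    ">=": GREATER | EQUAL,
}


def encode_compare_oparg(op: str) -> int:
    """Pack the operator index into the high bits and the mask into the low bits."""
    return (CMP_OPS.index(op) << 5) | CMP_MASKS[op]


def decode_compare_op(oparg: int) -> str:
    """Recover the operator from the high bits, as a match on the index would."""
    return CMP_OPS[oparg >> 5]
```

Decoding by index rather than by bit arithmetic on the whole oparg is the kind of change the line above describes.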

21 -> 3 expected failures. Remaining: test_bad_reader (IO layer),
test_deterministic_sets (PYTHONHASHSEED), testIntern (string interning).

* Address code review: preserve CO_FAST_HIDDEN, fix varint overflow

- Use original localspluskinds from marshal data instead of
  rebuilding, preserving CO_FAST_HIDDEN and other flags
- Fix write_varint_be to handle values >= 2^30 (add 6th chunk)
- Remove unused build_localspluskinds_from_split
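The two varint byte orders and the sixth-chunk case can be sketched as follows. This assumes 6 payload bits per byte with `0x40` as the continuation bit, as in CPython's linetable and exception-table encodings; the function names are hypothetical and the entry-start marker bit (`0x80`) used in exception tables is left out.

```python
CONTINUATION_BIT = 0x40  # set on every byte except the last
PAYLOAD_MASK = 0x3F      # 6 payload bits per byte


def encode_varint_le(value: int) -> bytes:
    """Least significant 6-bit chunk first (linetable style)."""
    if value < 0:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while value > PAYLOAD_MASK:
        out.append(CONTINUATION_BIT | (value & PAYLOAD_MASK))
        value >>= 6
    out.append(value)
    return bytes(out)


def encode_varint_be(value: int) -> bytes:
    """Most significant 6-bit chunk first (exception-table style)."""
    if value < 0:
        raise ValueError("value must be non-negative")
    chunks = [value & PAYLOAD_MASK]
    value >>= 6
    while value:
        chunks.append(value & PAYLOAD_MASK)
        value >>= 6
    chunks.reverse()
    # Every chunk except the last carries the continuation bit.
    return bytes([c | CONTINUATION_BIT for c in chunks[:-1]] + [chunks[-1]])


def decode_varint_be(data: bytes) -> int:
    """Read one big-endian varint from the start of `data`."""
    value = 0
    for byte in data:
        value = (value << 6) | (byte & PAYLOAD_MASK)
        if not byte & CONTINUATION_BIT:
            break
    return value
```

Five 6-bit chunks cover 30 bits, so a value of 2^30 or more needs a sixth chunk, which is the case the fix above addresses.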

* Add depth guard to deserialize_value_typed

Prevents usize underflow when dict key deserialization path calls
deserialize_value_typed with depth=0 on composite types.
2026-03-23 13:10:51 +09:00
Jeong, YunWon
907ce4d895 Bytecode parity (#7475)
* Emit TO_BOOL before conditional jumps, fix class/module prologue

- Emit TO_BOOL before POP_JUMP_IF_TRUE/FALSE in the general case
  of compile_jump_if (Compare expressions excluded since they
  already produce a bool)
- Module-level __doc__: use STORE_NAME instead of STORE_GLOBAL
- Class body __module__: use LOAD_NAME instead of LOAD_GLOBAL
- Class body: store __firstlineno__ before __doc__

* Emit MAKE_CELL and COPY_FREE_VARS before RESUME

Emit MAKE_CELL for each cell variable and COPY_FREE_VARS N for
free variables at the start of each code object, before RESUME.
These instructions are no-ops in the VM but align the bytecode
with CPython 3.14's output.

* Emit __static_attributes__ at end of class bodies

Store a tuple of attribute names (currently always empty) as
__static_attributes__ in the class namespace, matching CPython
3.14's class body epilogue. Attribute name collection from
self.xxx accesses is a follow-up task.

* Remove expectedFailure from DictProxyTests iter tests

test_iter_keys, test_iter_values, test_iter_items now pass
because class bodies emit __static_attributes__ and
__firstlineno__, matching the expected dict key set.

* Use 1-based stack indexing for LIST_EXTEND, SET_UPDATE, etc.

Switch LIST_APPEND, LIST_EXTEND, SET_ADD, SET_UPDATE, MAP_ADD
from 0-based to 1-based stack depth argument, matching CPython's
PEEK(oparg) convention. Adjust the VM to subtract 1 before
calling nth_value.
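A toy model of the indexing change, assuming a plain Python list as the value stack. `nth_value` here only mirrors the name used above; the rest is hypothetical.

```python
def nth_value(stack, depth):
    """0-based lookup: depth 0 is the top of the stack."""
    return stack[-1 - depth]


def peek(stack, oparg):
    """1-based lookup in the style of PEEK(oparg): oparg 1 is the top."""
    if oparg < 1:
        raise ValueError("1-based stack index must be >= 1")
    return nth_value(stack, oparg - 1)


def list_append(stack, oparg):
    """LIST_APPEND-like step: pop the item, append it to the list at PEEK(oparg)."""
    item = stack.pop()
    peek(stack, oparg).append(item)
```

For a stack of `[target_list, iterator, item]`, `list_append(stack, 2)` pops `item` and appends it to `target_list`, which sits two slots from the top once the item is gone.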

* Use plain LOAD_ATTR + PUSH_NULL for calls on imported names

When the call target is an attribute of an imported name (e.g.,
logging.getLogger()), use plain LOAD_ATTR (method_flag=0) with
a separate PUSH_NULL instead of method-mode LOAD_ATTR. This
matches CPython 3.14's behavior, which avoids the method-call
optimization for module attribute access.

* Duplicate return-None epilogue for fall-through blocks

When the last block in a code object is exactly LOAD_CONST None +
RETURN_VALUE (the implicit return), duplicate these instructions
into blocks that would otherwise fall through to it. This matches
CPython 3.14's behavior of giving each code path its own explicit
return instruction.

* Run cargo fmt on ir.rs

* Remove expectedFailure from test_intrinsic_1 in test_dis

* Emit TO_BOOL before conditional jumps for all expressions including Compare

* Add __classdict__ cell for classes with function definitions

Set needs_classdict=true for class scopes that contain function
definitions (def/async def), matching CPython 3.14's behavior for
PEP 649 deferred annotation support. Also restore the Compare
expression check in compile_jump_if to skip TO_BOOL for comparison
operations.

* Emit __classdictcell__ store in class body epilogue

Store the __classdict__ cell reference as __classdictcell__ in
the class namespace when the class has __classdict__ as a cell
variable. Uses LOAD_DEREF (RustPython separates cell vars from
fast locals unlike CPython's unified array).

* Always run DCE to remove dead code after terminal instructions

Run basic dead code elimination (truncating instructions after
RETURN_VALUE/RAISE/JUMP within blocks) at all optimization
levels, not just optimize > 0. CPython always removes this dead
code during assembly.
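A minimal sketch of within-block truncation, assuming a block is a list of `(opname, arg)` tuples. The set of terminal opcode names is illustrative only and the function name is hypothetical.

```python
# Illustrative set; the real pass decides this from instruction metadata.
TERMINAL_OPS = {"RETURN_VALUE", "RAISE_VARARGS", "JUMP"}


def truncate_after_terminal(block):
    """Drop every instruction that follows the first terminal instruction."""
    for index, (opname, _arg) in enumerate(block):
        if opname in TERMINAL_OPS:
            return block[: index + 1]
    return block
```

Nothing after an unconditional return, raise, or jump can run inside the same block, so the cut is safe at any optimization level.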

* Restrict LOAD_ATTR plain mode to module/class scope imports

Only use plain LOAD_ATTR + PUSH_NULL for imports at module or
class scope. Function-local imports use method call mode LOAD_ATTR,
matching CPython 3.14's behavior.

* Eliminate unreachable blocks after jump normalization

Split DCE into two phases: (1) within-block truncation after
terminal instructions (always runs), (2) whole-block elimination
for blocks only reachable via fall-through from terminal blocks
(runs after normalize_jumps when dead jump instructions exist).

* Fold BUILD_TUPLE 0 into LOAD_CONST empty tuple

Convert BUILD_TUPLE with size 0 to LOAD_CONST () during constant
folding, matching CPython's optimization for empty tuple literals.
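As an illustration only, the fold could look like this on a list of `(opname, arg)` tuples with a separate constants list; both representations and the function name are assumptions, not the compiler's actual IR.

```python
def fold_empty_tuples(instructions, consts):
    """Rewrite BUILD_TUPLE 0 as LOAD_CONST of the shared empty tuple."""
    folded = []
    for opname, arg in instructions:
        if opname == "BUILD_TUPLE" and arg == 0:
            if () not in consts:
                consts.append(())
            folded.append(("LOAD_CONST", consts.index(())))
        else:
            folded.append((opname, arg))
    return folded
```

Non-empty `BUILD_TUPLE` instructions are left untouched by this sketch.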

* Handle __classcell__ and __classdictcell__ in type.__new__

- Remove __classcell__ from class dict after setting the cell value
- Add __classdictcell__ handling: set cell to class namespace dict,
  then remove from class dict
- Register __classdictcell__ identifier
- Use LoadClosure instead of LoadDeref for __classdictcell__ emission
- Reorder MakeFunctionFlag bits to match CPython
- Run ruff format on scripts

* Revert __classdict__ cell and __classdictcell__ changes

The __classdict__ cell addition (for classes with function defs)
and __classdictcell__ store caused cell initialization failures
in importlib. These require deeper VM changes to properly support
the cell variable lifecycle. Reverted for stability.

* Fix unreachable block elimination with fixpoint reachability

Use fixpoint iteration to properly determine block reachability:
only mark jump targets of already-reachable blocks, preventing
orphaned blocks from falsely marking their targets as reachable.
Also add a final DCE pass after assembly NOP removal to catch
dead code created by normalize_jumps.
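The reachability rule can be sketched with a worklist. The `Block` shape below (explicit jump targets plus a fall-through flag) and the function name are hypothetical stand-ins for the compiler's IR.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    jump_targets: list = field(default_factory=list)  # indices of target blocks
    falls_through: bool = True                         # False after a terminal instruction


def reachable_blocks(blocks, entry=0):
    """Return the indices of blocks reachable from `entry`."""
    reachable = {entry}
    worklist = [entry]
    while worklist:
        index = worklist.pop()
        block = blocks[index]
        successors = list(block.jump_targets)
        if block.falls_through and index + 1 < len(blocks):
            successors.append(index + 1)
        # Only successors of already-reachable blocks are ever marked.
        for successor in successors:
            if successor not in reachable:
                reachable.add(successor)
                worklist.append(successor)
    return reachable
```

If block 0 ends in a return and block 1 is an orphan that jumps to block 2, only block 0 is reported reachable; a single pass that marked every jump target would wrongly keep block 2.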

* Check enclosing scopes for IMPORTED flag in LOAD_ATTR mode

When deciding whether to use plain LOAD_ATTR for attribute calls,
check if the name is imported in any enclosing scope (not just
the current scope). This handles the common pattern where a module
is imported at module level but used inside functions.

* Add __classdict__ cell for classes with function definitions

Set needs_classdict=true when a class scope contains function
definitions (def/async def), matching CPython 3.14 which always
creates a __classdict__ cell for PEP 649 support in such classes.

* Store __classdictcell__ in class body epilogue

Store the __classdict__ cell reference as __classdictcell__ in
the class namespace using LoadClosure (which loads the cell
object itself, not the value inside). This matches CPython 3.14's
class body epilogue.

* Fix clippy collapsible_if warnings and cargo fmt

* Revert __classdict__ and __classdictcell__ changes (cause import failures)

* Revert type.__new__ __classcell__ removal and __classdictcell__ handling

Revert the class cell cleanup changes from e6975f973 that cause
import failures when frozen module bytecode is stale. The original
behavior (not removing __classcell__ from class dict) is restored.

* Re-add __classdict__ cell and __classdictcell__ store

Restore the __classdict__ cell for classes with function
definitions and __classdictcell__ store in class body epilogue.
Previous failure was caused by stale .pyc cache files containing
bytecode from an intermediate MakeFunctionFlag reorder attempt,
not by these changes themselves.

* Reorder MakeFunctionFlag to match CPython's SET_FUNCTION_ATTRIBUTE

Reorder discriminants: Defaults=0, KwOnlyDefaults=1, Annotations=2,
Closure=3, Annotate=4, TypeParams=5. This aligns the oparg values
with CPython 3.14's convention.

Note: after this change, stale .pyc cache files must be deleted
(find . -name '*.pyc' -delete) to avoid bytecode mismatch errors.

* Use CPython-compatible power-of-two encoding for SET_FUNCTION_ATTRIBUTE

Override From/TryFrom for MakeFunctionFlag to use power-of-two
values (1,2,4,8,16,32) matching CPython's SET_FUNCTION_ATTRIBUTE
oparg encoding, instead of sequential discriminants (0,1,2,3,4,5).
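The relationship between the sequential discriminants and the power-of-two opargs can be shown in a few lines. The enum below reuses the flag order listed above; the conversion helpers are hypothetical names, not the `From`/`TryFrom` implementations themselves.

```python
import enum


class MakeFunctionFlag(enum.IntEnum):
    DEFAULTS = 0
    KW_ONLY_DEFAULTS = 1
    ANNOTATIONS = 2
    CLOSURE = 3
    ANNOTATE = 4
    TYPE_PARAMS = 5


def to_oparg(flag: MakeFunctionFlag) -> int:
    """Sequential discriminant -> power-of-two oparg (1, 2, 4, 8, 16, 32)."""
    return 1 << flag


def from_oparg(oparg: int) -> MakeFunctionFlag:
    """Power-of-two oparg -> flag; rejects anything that is not a single known bit."""
    if oparg <= 0 or oparg & (oparg - 1):
        raise ValueError(f"not a single-bit oparg: {oparg}")
    return MakeFunctionFlag(oparg.bit_length() - 1)
```

Keeping the discriminants sequential while converting at the boundary leaves the enum dense on the inside and CPython-compatible on the wire.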

* Remove expectedFailure from test_elim_jump_after_return1 and test_no_jump_over_return_out_of_finally_block

* Remove __classcell__ and __classdictcell__ from class dict in type.__new__

* Remove expectedFailure from test___classcell___expected_behaviour, cargo fmt

* Handle MakeCell and CopyFreeVars as no-ops in JIT

These prologue instructions are handled at frame creation time
by the VM. The JIT operates on already-initialized frames, so
these can be safely skipped during compilation.

* Remove expectedFailure from test_load_fast_known_simple

* Restore expectedFailure for test_load_fast_known_simple

The test expects the LOAD_FAST_BORROW_LOAD_FAST_BORROW superinstruction,
which RustPython does not emit yet.
2026-03-23 11:31:30 +09:00
Christian Legnitto
2180f535d8 Fix sub_table ordering for nested inlined comprehensions (PEP 709) (#7480)
When an inlined comprehension's first iterator expression contains
nested scopes (such as a lambda), those scopes' sub_tables appear at the
current position in the parent's sub_table list. The previous code
spliced the comprehension's own child sub_tables (e.g. inner inlined
comprehensions) into that same position before compiling the iterator,
which shifted the iterator's sub_tables to wrong indices.

Move the splice after the first iterator is compiled so its sub_tables
are consumed at their original positions.

Fixes nested list comprehensions like:
```python
    [[x for _, x in g] for _, g in itertools.groupby(..., lambda x: ...)]
```

Disclosure: I used AI to develop the patch though I was heavily
involved.
2026-03-22 17:23:03 +09:00
Lee Dogeon
3c62b5679f Implement bytearray.__str__ && bytes.__str__ (#7477)
* Implement bytearray.__str__

* Implement bytes.__str__

* Turn __str__ method into slot
2026-03-22 12:50:31 +09:00
dependabot[bot]
dfcb07cd93 Bump rustls-webpki from 0.103.9 to 0.103.10 (#7479)
Bumps [rustls-webpki](https://github.com/rustls/webpki) from 0.103.9 to 0.103.10.
- [Release notes](https://github.com/rustls/webpki/releases)
- [Commits](https://github.com/rustls/webpki/compare/v/0.103.9...v/0.103.10)

---
updated-dependencies:
- dependency-name: rustls-webpki
  dependency-version: 0.103.10
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-21 22:49:25 +09:00
Jeong, YunWon
2d676e7f4d A few Windows fixes (#7458)
* Disallow instantiation of sys.getwindowsversion type

Add slot_new to PyWindowsVersion that raises TypeError,
matching sys.flags behavior.

* Remove incorrect WSAHOS errno constant

WSAHOS was hardcoded as an alias for WSAHOST_NOT_FOUND, but
CPython guards it with #ifdef WSAHOS which doesn't exist in
modern Windows SDK headers.

* Fix mmap resize to raise OSError instead of SystemError

* Fix CreateProcess with empty environment on Windows

Empty env dict produced a single null terminator, but
CreateProcessW requires a double null for a valid empty
environment block.
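A sketch of the block layout described above, assuming the usual `NAME=value` entries each terminated by a null character and one extra null closing the block. The function name is hypothetical and the result is a Python string rather than the UTF-16 buffer actually passed to the API.

```python
def build_environment_block(env: dict) -> str:
    """Build a CreateProcessW-style environment block as a str."""
    if not env:
        # An empty block still needs two terminators, not one.
        return "\0\0"
    entries = "".join(f"{name}={value}\0" for name, value in env.items())
    return entries + "\0"
```

For a non-empty mapping the final entry's terminator plus the closing null already give the double null; the empty case has to be handled explicitly.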

* Revert mmap resize error to SystemError and fix errno.rs formatting

mmap resize raises SystemError (not OSError) when mremap is unavailable,
matching CPython behavior. test_mmap catches SystemError to skip unsupported
resize operations.

* Fix named mmap resize to raise OSError and unmark test_sleep expectedFailure

Named mmap resize on Windows should raise OSError (not SystemError).
Remove expectedFailure mark from TimeEINTRTest.test_sleep as it now passes.

* Use expectedFailureIf for TimeEINTRTest.test_sleep on Linux

test_sleep passes on macOS but fails on Linux due to timing.

* Remove expectedFailure for TimeEINTRTest.test_sleep

test_sleep now passes on all platforms.
2026-03-21 22:49:09 +09:00
Jeong, YunWon
3e9f825e1d Enable PEP 709 inlined comprehensions (#7412)
* Enable PEP 709 inlined comprehensions for function-like scopes

Activate the existing compile_inlined_comprehension() implementation
by fixing 6 bugs that prevented it from working:

- LoadFastAndClear: push NULL (not None) when slot is empty so
  StoreFast can restore empty state after comprehension
- StoreFast: accept NULL from stack for the restore path
- sub_tables.remove(0) replaced with next_sub_table cursor to
  match the pattern used elsewhere in the compiler
- in_inlined_comp flag moved from non-inlined to inlined path
- is_inlined_comprehension_context() now checks comp_inlined flag
  and restricts inlining to function-like scopes
- comp_inlined set only when parent scope uses fastlocals

Symbol table analysis handles conflict detection:
- Nested scopes in comprehension → skip inlining
- Bound name conflicts with parent symbol → skip inlining
- Cross-comprehension reference conflicts → skip inlining
- Splice comprehension sub_tables into parent for nested scope tracking
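The observable behavior that the save/restore logic has to preserve can be checked from plain Python. The two functions below are illustrative tests written for this note, not part of the change; they should behave the same whether or not the comprehension is inlined.

```python
def shadowing():
    """The comprehension's `x` must not overwrite the function's `x`."""
    x = "outer"
    squares = [x * x for x in range(4)]
    return x, squares


def unbound_stays_unbound():
    """A local that was empty before the comprehension must be empty after it."""
    if False:
        y = None  # never runs, but makes `y` a local of this function
    doubled = [y * 2 for y in range(3)]
    try:
        y
    except UnboundLocalError:
        return doubled, "y is still unbound"
    return doubled, "y leaked"
```

The second function exercises the empty-slot case: the saved value is "nothing", and restoring it must leave the local unbound again.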

* Add localspluskinds, unify DEREF to localsplus index

- Add CO_FAST_LOCAL/CELL/FREE/HIDDEN constants and
  localspluskinds field to CodeObject for per-slot metadata
- Change DEREF instruction opargs from cell-relative indices
  (NameIdx) to localsplus absolute indices (oparg::VarNum)
- Add fixup_deref_opargs pass in ir.rs to convert cell-relative
  indices to localsplus indices after finalization
- Replace get_cell_name with get_localsplus_name in
  InstrDisplayContext trait
- Update VM cell_ref/get_cell_contents/set_cell_contents to use
  localsplus indices directly (no nlocals offset)
- Update function.rs cell2arg, super.rs __class__ lookup with
  explicit nlocals offsets

* Fix clippy warnings, formatting, restore _opcode_metadata.py

Fix cast_possible_truncation, nonminimal_bool, collapsible_if,
manual_contains clippy lints. Restore _opcode_metadata.py to
upstream/main version (3.14 aligned).

Pre-copy closure cells in Frame::new for coroutine locals().
Handle raw values in merged cell slots during inlined comps.
Exclude async comprehensions from inlining path.

* Exclude async/await comprehensions from PEP 709 inlining in symboltable

Async comprehensions and comprehensions with await in the element
expression need their own coroutine scope and cannot be inlined.
The symboltable builder was not checking these conditions, causing
incorrect symbol scope resolution when an async comprehension was
nested inside an inlined comprehension (e.g. [[x async for x in g]
for j in items]).
2026-03-21 22:48:35 +09:00
dependabot[bot]
4abe4c5bf0 Bump aws-lc-fips-sys from 0.13.12 to 0.13.13 (#7478)
Bumps [aws-lc-fips-sys](https://github.com/aws/aws-lc-rs) from 0.13.12 to 0.13.13.
- [Release notes](https://github.com/aws/aws-lc-rs/releases)
- [Commits](https://github.com/aws/aws-lc-rs/compare/aws-lc-fips-sys/v0.13.12...aws-lc-fips-sys/v0.13.13)

---
updated-dependencies:
- dependency-name: aws-lc-fips-sys
  dependency-version: 0.13.13
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-21 09:46:15 +09:00
Lee Dogeon
a1203ae207 Improve CPython compatibility of PyBoundMethod (#7476)
* Add GetDescriptor for PyBoundMethod (return self)

CPython's method_descr_get always returns the bound method unchanged.
This preserves the original binding when __get__ is called on an
already-bound method (e.g. a.meth.__get__(b, B) still returns the method bound to a).
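The CPython behavior being matched can be observed directly; the class names below are placeholders chosen for this illustration.

```python
class A:
    def meth(self):
        return self


class B:
    pass


a, b = A(), B()
bound = a.meth
# Calling __get__ on an already-bound method does not rebind it.
rebound = bound.__get__(b, B)
```

Here `rebound.__self__` is still `a`, so calling `rebound()` returns `a` rather than `b`.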

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Add constructor validation for PyBoundMethod

Reject non-callable functions and None instances, matching CPython's
method_new which checks PyCallable_Check(func) and instance != Py_None.
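A short check of the two rejections, using `types.MethodType` as the Python-level constructor. The helper name is hypothetical, and only the exception type is relied on, not the message text.

```python
import types


def try_bind(func, instance):
    """Return the bound method, or the TypeError raised while constructing it."""
    try:
        return types.MethodType(func, instance)
    except TypeError as exc:
        return exc


not_callable = try_bind(42, object())  # first argument is not callable
none_instance = try_bind(len, None)    # instance is None
valid = try_bind(len, [1, 2])          # bound method equivalent to len([1, 2])
```

Both invalid combinations come back as `TypeError` instances, while `valid()` evaluates `len([1, 2])`.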

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Fix PyBoundMethod __reduce__ to propagate errors

Previously swallowed errors from get_attr with .ok(), silently
returning None. Now propagates errors matching CPython's method_reduce.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 00:33:40 +09:00
dependabot[bot]
5b6a479a1d Bump lexopt from 0.3.1 to 0.3.2 (#7470)
Bumps [lexopt](https://github.com/blyxxyz/lexopt) from 0.3.1 to 0.3.2.
- [Release notes](https://github.com/blyxxyz/lexopt/releases)
- [Changelog](https://github.com/blyxxyz/lexopt/blob/master/CHANGELOG.md)
- [Commits](https://github.com/blyxxyz/lexopt/compare/v0.3.1...v0.3.2)

---
updated-dependencies:
- dependency-name: lexopt
  dependency-version: 0.3.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-20 22:49:41 +09:00
125 changed files with 9511 additions and 2901 deletions

View File

@@ -109,6 +109,7 @@ lineiterator
linetable
loadfast
localsplus
localspluskinds
Lshift
lsprof
MAXBLOCKS

View File

@@ -70,6 +70,7 @@
"lossily",
"mcache",
"oparg",
"opargs",
"pyc",
"significand",
"summands",

View File

@@ -5,6 +5,11 @@ updates:
directory: /
schedule:
interval: weekly
cooldown:
default-days: 7
semver-major-days: 30
semver-minor-days: 7
semver-patch-days: 3
groups:
criterion:
patterns:
@@ -143,7 +148,20 @@ updates:
directory: /
schedule:
interval: weekly
cooldown:
default-days: 7
- package-ecosystem: npm
directory: /
schedule:
interval: weekly
cooldown:
default-days: 7
semver-major-days: 30
semver-minor-days: 7
semver-patch-days: 3
- package-ecosystem: pre-commit
directory: /
schedule:
interval: weekly
cooldown:
default-days: 7

View File

@@ -8,11 +8,14 @@ on:
name: CI
permissions:
contents: read
# Cancel previous runs of the same workflow on the same ref (branch/tag)
# with the same event (push/pull_request), even if they are in progress.
# This setting will help reduce the number of duplicated workflows.
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}-${{ github.event_name }}
group: ${{ github.workflow }}-${{ github.ref_name }}-${{ github.event.pull_request.number || github.sha }}
cancel-in-progress: true
env:
@@ -27,6 +30,8 @@ env:
PYTHON_VERSION: "3.14.3"
X86_64_PC_WINDOWS_MSVC_OPENSSL_LIB_DIR: C:\Program Files\OpenSSL\lib\VC\x64\MD
X86_64_PC_WINDOWS_MSVC_OPENSSL_INCLUDE_DIR: C:\Program Files\OpenSSL\include
CARGO_INCREMENTAL: 0
CARGO_TERM_COLOR: always
jobs:
rust_tests:
@@ -65,6 +70,9 @@ jobs:
- name: check compilation without threading
run: cargo check ${{ env.CARGO_ARGS }}
- run: cargo doc --locked
if: runner.os == 'Linux'
- name: check compilation without host_env (sandbox mode)
run: |
cargo check -p rustpython-vm --no-default-features --features compiler
@@ -104,41 +112,39 @@ jobs:
cargo_check:
if: ${{ !contains(github.event.pull_request.labels.*.name, 'skip:ci') }}
name: Ensure compilation on various targets
name: cargo check
runs-on: ${{ matrix.os }}
strategy:
matrix:
include:
- os: ubuntu-latest
targets:
- aarch64-linux-android
- i686-unknown-linux-gnu
- i686-unknown-linux-musl
- wasm32-wasip2
- x86_64-unknown-freebsd
target: aarch64-linux-android
- os: ubuntu-latest
target: i686-unknown-linux-gnu
dependencies:
gcc-multilib: true
- os: ubuntu-latest
target: i686-unknown-linux-musl
dependencies:
musl-tools: true
- os: ubuntu-latest
targets:
- aarch64-unknown-linux-gnu
target: wasm32-wasip2
- os: ubuntu-latest
target: x86_64-unknown-freebsd
- os: ubuntu-latest
target: aarch64-unknown-linux-gnu
dependencies:
gcc-aarch64-linux-gnu: true # conflict with `gcc-multilib`
gcc-aarch64-linux-gnu: true
- os: macos-latest
targets:
- aarch64-apple-ios
- x86_64-apple-darwin
target: aarch64-apple-ios
- os: macos-latest
target: x86_64-apple-darwin
fail-fast: false
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false
- uses: Swatinem/rust-cache@c19371144df3bb44fab255c43d04cbc2ab54d1c4 # v2.9.1
with:
prefix-key: v0-rust-${{ join(matrix.targets, '-') }}
save-if: ${{ github.ref == 'refs/heads/main' }}
- name: Install dependencies
uses: ./.github/actions/install-linux-deps
# zizmor has an issue with dynamic `with`
@@ -148,12 +154,27 @@ jobs:
musl-tools: ${{ matrix.dependencies.musl-tools || false }}
gcc-aarch64-linux-gnu: ${{ matrix.dependencies.gcc-aarch64-linux-gnu || false }}
- name: Restore cache
uses: actions/cache/restore@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
if: ${{ github.ref != 'refs/heads/main' }} # Never restore on main
with:
path: |
~/.cargo/bin/
~/.cargo/registry/index/
~/.cargo/registry/cache/
~/.cargo/git/db/
target/
# key won't match, will rely on restore-keys
key: cargo-check-${{ runner.os }}-${{ matrix.target }}
restore-keys: |
cargo-check-${{ runner.os }}-${{ matrix.target }}-
- uses: dtolnay/rust-toolchain@stable
with:
targets: ${{ join(matrix.targets, ',') }}
target: ${{ matrix.target }}
- name: Setup Android NDK
if: ${{ contains(matrix.targets, 'aarch64-linux-android') }}
if: ${{ matrix.target == 'aarch64-linux-android' }}
id: setup-ndk
uses: nttld/setup-ndk@v1
with:
@@ -169,18 +190,24 @@ jobs:
# args: --ignore-rust-version
- name: Check compilation
run: |
for target in ${{ join(matrix.targets, ' ') }}
do
echo "::group::${target}"
cargo check --target $target ${{ env.CARGO_ARGS_NO_SSL }}
echo "::endgroup::"
done
run: cargo check --target "${{ matrix.target }}" ${{ env.CARGO_ARGS_NO_SSL }}
env:
CC_aarch64_linux_android: ${{ steps.setup-ndk.outputs.ndk-path }}/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android24-clang
AR_aarch64_linux_android: ${{ steps.setup-ndk.outputs.ndk-path }}/toolchains/llvm/prebuilt/linux-x86_64/bin/llvm-ar
CARGO_TARGET_AARCH64_LINUX_ANDROID_LINKER: ${{ steps.setup-ndk.outputs.ndk-path }}/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android24-clang
- name: Save cache
if: ${{ github.ref == 'refs/heads/main' }} # only save on main
uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
path: |
~/.cargo/bin/
~/.cargo/registry/index/
~/.cargo/registry/cache/
~/.cargo/git/db/
target/
key: cargo-check-${{ runner.os }}-${{ matrix.target }}-${{ hashFiles('**/Cargo.toml') }}-${{ hashFiles('Cargo.lock') }}-${{ github.sha }}
snippets_cpython:
if: ${{ !contains(github.event.pull_request.labels.*.name, 'skip:ci') }}
env:
@@ -188,6 +215,7 @@ jobs:
# Tests that can be flaky when running with multiple processes `-j 2`. We will use `-j 1` for these.
FLAKY_MP_TESTS: >-
test_class
test_concurrent_futures
test_eintr
test_multiprocessing_fork
test_multiprocessing_forkserver
@@ -230,7 +258,7 @@ jobs:
with:
save-if: ${{ github.ref == 'refs/heads/main' }}
- uses: actions/setup-python@v6.2.0
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: ${{ env.PYTHON_VERSION }}
@@ -251,7 +279,7 @@ jobs:
shell: bash
run: |
cores=$(python -c 'print(__import__("os").process_cpu_count())')
echo "cores=${cores}" >> $GITHUB_OUTPUT
echo "cores=${cores}" >> "$GITHUB_OUTPUT"
- name: Run CPython tests
run: |
@@ -270,28 +298,32 @@ jobs:
- name: run cpython tests to check if env polluters have stopped polluting
shell: bash
run: |
for thing in ${{ join(matrix.env_polluting_tests, ' ') }}; do
IFS=' ' read -r -a target_array <<< "$TARGETS"
for thing in "${target_array[@]}"; do
for i in $(seq 1 10); do
set +e
target/release/rustpython -m test -j 1 --slowest --fail-env-changed --timeout 600 -v ${thing}
target/release/rustpython -m test -j 1 --slowest --fail-env-changed --timeout 600 -v "${thing}"
exit_code=$?
set -e
if [ ${exit_code} -eq 3 ]; then
if [ "${exit_code}" -eq 3 ]; then
echo "Test ${thing} polluted the environment on attempt ${i}."
break
fi
done
if [ ${exit_code} -ne 3 ]; then
if [ "${exit_code}" -ne 3 ]; then
echo "Test ${thing} is no longer polluting the environment after ${i} attempts!"
echo "Please remove ${thing} from matrix.env_polluting_tests in '.github/workflows/ci.yaml'."
echo "Please also remove the skip decorators that include the word 'POLLUTERS' in ${thing}."
if [ ${exit_code} -ne 0 ]; then
if [ "${exit_code}" -ne 0 ]; then
echo "Test ${thing} failed with exit code ${exit_code}."
echo "Please investigate which test item in ${thing} is failing and either mark it as an expected failure or a skip."
fi
exit 1
fi
done
env:
TARGETS: ${{ join(matrix.env_polluting_tests, ' ') }}
timeout-minutes: 15
- if: runner.os != 'Windows'
@@ -317,63 +349,67 @@ jobs:
run: python -I scripts/whats_left.py ${{ env.CARGO_ARGS }} --features jit
lint:
name: Lint Rust & Python code
name: Lint
runs-on: ubuntu-latest
permissions:
contents: read
checks: write
pull-requests: write
security-events: write # for zizmor
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false
- uses: actions/setup-python@v6.2.0
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: ${{ env.PYTHON_VERSION }}
- name: Check for redundant test patches
run: python scripts/check_redundant_patches.py
- uses: dtolnay/rust-toolchain@stable
with:
components: clippy
components: rustfmt
- name: run clippy on wasm
run: cargo clippy --manifest-path=crates/wasm/Cargo.toml -- -Dwarnings
- uses: cargo-bins/cargo-binstall@113a77a4ce971c41332f2129c3d995df993cf746 # v1.17.8
- name: Ensure docs generate no warnings
run: cargo doc --locked
- name: Ensure Lib/_opcode_metadata is updated
- name: cargo shear
run: |
python scripts/generate_opcode_metadata.py
if [ -n "$(git status --porcelain)" ]; then
exit 1
fi
cargo binstall --no-confirm cargo-shear
cargo shear
- name: Install ruff
uses: astral-sh/ruff-action@4919ec5cf1f49eff0871dbcea0da843445b837e6 # v3.6.1
- name: actionlint
uses: reviewdog/action-actionlint@0d952c597ef8459f634d7145b0b044a9699e5e43 # v1.71.0
- name: zizmor
uses: zizmorcore/zizmor-action@71321a20a9ded102f6e9ce5718a2fcec2c4f70d8 # v0.5.2
- name: restore prek cache
if: ${{ github.ref != 'refs/heads/main' }} # never restore on main
uses: actions/cache/restore@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
version: "0.15.5"
args: "--version"
key: prek-${{ hashFiles('.pre-commit-config.yaml') }}
path: ~/.cache/prek
- run: ruff check --diff
- run: ruff format --check
- name: install prettier
run: |
yarn global add prettier
yarn global bin >> "$GITHUB_PATH"
- name: check wasm code with prettier
# prettier doesn't handle ignore files very well: https://github.com/prettier/prettier/issues/8506
run: cd wasm && git ls-files -z | xargs -0 prettier --check -u
# Keep cspell check as the last step. This is optional test.
- name: install extra dictionaries
run: npm install @cspell/dict-en_us @cspell/dict-cpp @cspell/dict-python @cspell/dict-rust @cspell/dict-win32 @cspell/dict-shell
- name: spell checker
uses: streetsidesoftware/cspell-action@v8
- name: prek
id: prek
uses: j178/prek-action@53276d8b0d10f8b6672aa85b4588c6921d0370cc # v2.0.1
with:
files: "**/*.rs"
incremental_files_only: true
cache: false
show-verbose-logs: false
continue-on-error: true
- name: save prek cache
if: ${{ github.ref == 'refs/heads/main' }} # only save on main
uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
key: prek-${{ hashFiles('.pre-commit-config.yaml') }}
path: ~/.cache/prek
- name: reviewdog
uses: reviewdog/action-suggester@aa38384ceb608d00f84b4690cacc83a5aba307ff # 1.24.0
with:
level: warning
fail_level: error
cleanup: false
miri:
if: ${{ !contains(github.event.pull_request.labels.*.name, 'skip:ci') }}
@@ -414,11 +450,16 @@ jobs:
persist-credentials: false
- uses: dtolnay/rust-toolchain@stable
with:
components: clippy
- uses: Swatinem/rust-cache@c19371144df3bb44fab255c43d04cbc2ab54d1c4 # v2.9.1
with:
save-if: ${{ github.ref == 'refs/heads/main' }}
- name: cargo clippy
run: cargo clippy --manifest-path=crates/wasm/Cargo.toml -- -Dwarnings
- name: install wasm-pack
run: curl https://rustwasm.github.io/wasm-pack/installer/init.sh -sSf | sh
- name: install geckodriver
@@ -426,12 +467,14 @@ jobs:
wget https://github.com/mozilla/geckodriver/releases/download/v0.36.0/geckodriver-v0.36.0-linux64.tar.gz
mkdir geckodriver
tar -xzf geckodriver-v0.36.0-linux64.tar.gz -C geckodriver
- uses: actions/setup-python@v6.2.0
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: ${{ env.PYTHON_VERSION }}
- run: python -m pip install -r requirements.txt
working-directory: ./wasm/tests
- uses: actions/setup-node@v6
- uses: actions/setup-node@53b83947a5a98c8d113130e565377fae1a50d02f # v6.3.0
with:
cache: "npm"
cache-dependency-path: "wasm/demo/package-lock.json"
@@ -502,32 +545,6 @@ jobs:
- name: build rustpython
run: cargo build --release --target wasm32-wasip1 --features freeze-stdlib,stdlib --verbose
- name: run snippets
run: wasmer run --dir $(pwd) target/wasm32-wasip1/release/rustpython.wasm -- "$(pwd)/extra_tests/snippets/stdlib_random.py"
run: wasmer run --dir "$(pwd)" target/wasm32-wasip1/release/rustpython.wasm -- "$(pwd)/extra_tests/snippets/stdlib_random.py"
- name: run cpython unittest
run: wasmer run --dir $(pwd) target/wasm32-wasip1/release/rustpython.wasm -- "$(pwd)/Lib/test/test_int.py"
cargo-shear:
name: cargo shear
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false
- uses: cargo-bins/cargo-binstall@1800853f2578f8c34492ec76154caef8e163fbca # v1.17.7
- run: cargo binstall --no-confirm cargo-shear
- run: cargo shear
security-lint:
runs-on: ubuntu-latest
permissions:
security-events: write
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false
- name: Run zizmor
uses: zizmorcore/zizmor-action@71321a20a9ded102f6e9ce5718a2fcec2c4f70d8 # v0.5.2
run: wasmer run --dir "$(pwd)" target/wasm32-wasip1/release/rustpython.wasm -- "$(pwd)/Lib/test/test_int.py"


@@ -18,4 +18,6 @@ jobs:
steps:
# Using REST API and not `gh issue edit`. https://github.com/cli/cli/issues/6235#issuecomment-1243487651
- run: |
curl -H "Authorization: token ${{ github.token }}" -d '{"assignees": ["${{ github.event.comment.user.login }}"]}' https://api.github.com/repos/${{ github.repository }}/issues/${{ github.event.issue.number }}/assignees
curl -H "Authorization: token ${{ github.token }}" -d '{"assignees": ["${{ env.USER }}"]}' https://api.github.com/repos/${{ github.repository }}/issues/${{ github.event.issue.number }}/assignees
env:
USER: ${{ github.event.comment.user.login }}
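
Note on the hunk above: it moves the comment author's login out of the inline command and into an `env:` entry. A common motivation for this kind of change is that a value pasted into script text is parsed as shell syntax, while a value the shell reads from an environment variable at run time stays plain data. The sketch below illustrates that difference under stated assumptions: `run_sh`, `LOGIN`, and the sample value are hypothetical, nothing here calls the GitHub API, and it uses the shell form `"$LOGIN"` rather than the `${{ env.USER }}` expression that appears in the diff.

```python
import os
import subprocess
from typing import Optional


def run_sh(script: str, extra_env: Optional[dict] = None) -> str:
    """Run a script with `sh -c` and return its stdout (hypothetical helper)."""
    env = dict(os.environ)
    env.update(extra_env or {})
    result = subprocess.run(
        ["sh", "-c", script], env=env, capture_output=True, text=True, check=True
    )
    return result.stdout


# Illustrative value only; chosen to show what happens when text is pasted into a script.
value = '"; echo INJECTED; echo "'

# Inline substitution: the value becomes part of the script and is parsed as shell syntax.
inlined = run_sh(f'echo "{value}"')

# Environment variable: the shell expands "$LOGIN" at run time and treats it as one string.
via_env = run_sh('echo "$LOGIN"', {"LOGIN": value})

print(repr(inlined))
print(repr(via_env))
```

In the inlined case the extra `echo INJECTED` command runs; in the environment-variable case the whole value is echoed back unchanged.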


@@ -12,7 +12,7 @@ on:
name: Periodic checks/tasks
env:
CARGO_ARGS: --no-default-features --features stdlib,importlib,encodings,ssl-rustls,jit
CARGO_ARGS: --no-default-features --features stdlib,importlib,stdio,encodings,ssl-rustls,jit,host_env
PYTHON_VERSION: "3.14.3"
jobs:
@@ -35,7 +35,7 @@ jobs:
python-version: ${{ env.PYTHON_VERSION }}
- run: sudo apt-get update && sudo apt-get -y install lcov
- name: Run cargo-llvm-cov with Rust tests.
run: cargo llvm-cov --no-report --workspace --exclude rustpython_wasm --exclude rustpython-compiler-source --exclude rustpython-venvlauncher --verbose --no-default-features --features stdlib,importlib,encodings,ssl-rustls,jit
run: cargo llvm-cov --no-report --workspace --exclude rustpython_wasm --exclude rustpython-compiler-source --exclude rustpython-venvlauncher --verbose --no-default-features --features stdlib,importlib,stdio,encodings,ssl-rustls,jit,host_env
- name: Run cargo-llvm-cov with Python snippets.
run: python scripts/cargo-llvm-cov.py
continue-on-error: true
@@ -48,7 +48,7 @@ jobs:
if: ${{ github.event_name != 'pull_request' }}
uses: codecov/codecov-action@v5
with:
file: ./codecov.lcov
files: ./codecov.lcov
testdata:
name: Collect regression test data
@@ -170,12 +170,12 @@ jobs:
- name: restructure generated files
run: |
cd ./target/criterion/reports
find -type d -name cpython | xargs rm -rf
find -type d -name rustpython | xargs rm -rf
find -mindepth 2 -maxdepth 2 -name violin.svg | xargs rm -rf
find -type f -not -name violin.svg | xargs rm -rf
for file in $(find -type f -name violin.svg); do mv $file $(echo $file | sed -E "s_\./([^/]+)/([^/]+)/violin\.svg_./\1/\2.svg_"); done
find -mindepth 2 -maxdepth 2 -type d | xargs rm -rf
find . -type d -name cpython -print0 | xargs -0 rm -rf
find . -type d -name rustpython -print0 | xargs -0 rm -rf
find . -mindepth 2 -maxdepth 2 -name violin.svg -print0 | xargs -0 rm -rf
find . -type f -not -name violin.svg -print0 | xargs -0 rm -rf
find . -type f -name violin.svg -exec sh -c 'for file; do mv "$file" "$(echo "$file" | sed -E "s_\./([^/]+)/([^/]+)/violin\.svg_./\1/\2.svg_")"; done' _ {} +
find . -mindepth 2 -maxdepth 2 -type d -print0 | xargs -0 rm -rf
cd ..
mv reports/* .
rmdir reports
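
The `-print0` / `xargs -0` rewrite above changes how file names are delimited between `find` and `xargs`. As a rough illustration in Python (not the workflow's own code; the paths are made up), splitting a name stream on whitespace breaks any path that contains a space, while splitting on NUL keeps each path whole. Plain `xargs` also interprets quotes and backslashes, which this sketch does not model.

```python
# Hypothetical paths; the first one contains a space.
paths = ["./bench a/cpython/violin.svg", "./bench_b/cpython/violin.svg"]

# Newline-delimited stream split on any whitespace, roughly what `find | xargs` sees.
whitespace_items = "\n".join(paths).split()

# NUL-delimited stream, as produced by `find -print0` and consumed by `xargs -0`.
nul_stream = "".join(p + "\0" for p in paths)
nul_items = [item for item in nul_stream.split("\0") if item]

print(whitespace_items)
print(nul_items)
```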


@@ -7,7 +7,7 @@ on:
- "Lib/**"
concurrency:
group: lib-deps-${{ github.event.pull_request.number }}
group: ${{ github.workflow }}-${{ github.ref_name }}-${{ github.event.pull_request.number }}
cancel-in-progress: true
env:
@@ -74,7 +74,7 @@ jobs:
- name: Setup Python
if: steps.changed-files.outputs.modules != ''
uses: actions/setup-python@v6.2.0
uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: "${{ env.PYTHON_VERSION }}"
@@ -83,22 +83,15 @@ jobs:
id: deps-check
run: |
# Run deps for all modules at once
python scripts/update_lib deps ${{ steps.changed-files.outputs.modules }} --depth 2 > /tmp/deps_output.txt 2>&1 || true
# Read output for GitHub Actions
echo "deps_output<<EOF" >> $GITHUB_OUTPUT
cat /tmp/deps_output.txt >> $GITHUB_OUTPUT
echo "EOF" >> $GITHUB_OUTPUT
# Check if there's any meaningful output
if [ -s /tmp/deps_output.txt ]; then
echo "has_output=true" >> $GITHUB_OUTPUT
else
echo "has_output=false" >> $GITHUB_OUTPUT
fi
echo "deps_output<<EOF" >> "$GITHUB_OUTPUT"
output=$(python scripts/update_lib deps "${MODULES}" --depth 2 2>&1 || true)
echo "$output" >> "$GITHUB_OUTPUT"
echo "EOF" >> "$GITHUB_OUTPUT"
env:
MODULES: ${{ steps.changed-files.outputs.modules }}
- name: Post comment
if: steps.deps-check.outputs.has_output == 'true'
if: steps.deps-check.outputs.deps_output != ''
uses: marocchino/sticky-pull-request-comment@v3
with:
header: lib-deps-check
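
The deps-check step above writes a multi-line value to `$GITHUB_OUTPUT` using the `name<<DELIMITER` form. A minimal Python sketch of that file format follows. `write_multiline_output` and the temporary file are assumptions for illustration, and the sketch picks a random delimiter so that a line inside the value cannot end the block early, which differs from the fixed `EOF` used in the workflow.

```python
import os
import tempfile
import uuid


def write_multiline_output(path: str, name: str, value: str) -> str:
    """Append `name<<DELIM`, the value, then `DELIM` to the file at `path`.

    Returns the delimiter that was used (hypothetical helper).
    """
    delimiter = f"ghadelim_{uuid.uuid4().hex}"
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(f"{name}<<{delimiter}\n{value}\n{delimiter}\n")
    return delimiter


# Stand-in for the file that $GITHUB_OUTPUT points to on a runner.
fd, output_path = tempfile.mkstemp()
os.close(fd)

delim = write_multiline_output(output_path, "deps_output", "line one\nEOF\nline three")
with open(output_path, encoding="utf-8") as fh:
    written = fh.read()
os.remove(output_path)

print(written)
```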


@@ -1,74 +0,0 @@
name: Format Check
# This workflow triggers when a PR is opened/updated
# Posts inline suggestion comments instead of auto-committing
on:
pull_request:
types: [opened, synchronize, reopened]
branches:
- main
- release
concurrency:
group: format-check-${{ github.event.pull_request.number }}
cancel-in-progress: true
env:
PYTHON_VERSION: "3.14.3"
jobs:
format_check:
permissions:
contents: read
pull-requests: write
runs-on: ubuntu-latest
timeout-minutes: 60
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false
- uses: reviewdog/action-actionlint@0d952c597ef8459f634d7145b0b044a9699e5e43 # v1.71.0
- name: Setup Rust
uses: dtolnay/rust-toolchain@stable
with:
components: rustfmt
- name: Run cargo fmt
run: cargo fmt --all
- name: Install ruff
uses: astral-sh/ruff-action@4919ec5cf1f49eff0871dbcea0da843445b837e6 # v3.6.1
with:
version: "0.15.4"
args: "--version"
- name: Run ruff format
run: ruff format
- name: Run ruff check import sorting
run: ruff check --select I --fix
- uses: actions/setup-python@v6.2.0
with:
python-version: ${{ env.PYTHON_VERSION }}
- name: Run generate_opcode_metadata.py
run: python scripts/generate_opcode_metadata.py
- name: Check for formatting changes
run: |
if ! git diff --exit-code; then
echo "::error::Formatting changes detected. Please run 'cargo fmt --all', 'ruff format', and 'ruff check --select I --fix' locally."
exit 1
fi
- name: Post formatting suggestions
if: failure()
uses: reviewdog/action-suggester@v1
with:
tool_name: auto-format
github_token: ${{ secrets.GITHUB_TOKEN }}
level: warning
filter_mode: diff_context


@@ -12,44 +12,40 @@ on:
required: false
default: true
permissions:
contents: write
env:
CARGO_ARGS: --no-default-features --features stdlib,importlib,encodings,sqlite,ssl
X86_64_PC_WINDOWS_MSVC_OPENSSL_LIB_DIR: C:\Program Files\OpenSSL\lib\VC\x64\MD
X86_64_PC_WINDOWS_MSVC_OPENSSL_INCLUDE_DIR: C:\Program Files\OpenSSL\include
jobs:
build:
runs-on: ${{ matrix.platform.runner }}
runs-on: ${{ matrix.os }}
# Disable this scheduled job when running on a fork.
if: ${{ github.repository == 'RustPython/RustPython' || github.event_name != 'schedule' }}
strategy:
matrix:
platform:
- runner: ubuntu-latest
include:
- os: ubuntu-latest
target: x86_64-unknown-linux-gnu
# - runner: ubuntu-latest
# target: i686-unknown-linux-gnu
# - runner: ubuntu-latest
# target: aarch64-unknown-linux-gnu
# - runner: ubuntu-latest
# target: armv7-unknown-linux-gnueabi
# - runner: ubuntu-latest
# target: s390x-unknown-linux-gnu
# - runner: ubuntu-latest
# target: powerpc64le-unknown-linux-gnu
- runner: macos-latest
- os: macos-latest
target: aarch64-apple-darwin
# - runner: macos-latest
# target: x86_64-apple-darwin
- runner: windows-2025
- os: windows-2025
target: x86_64-pc-windows-msvc
# - runner: windows-2025
# target: i686-pc-windows-msvc
# - runner: windows-2025
# target: aarch64-pc-windows-msvc
# - os: ubuntu-latest
# target: i686-unknown-linux-gnu
# - os: ubuntu-latest
# target: aarch64-unknown-linux-gnu
# - os: ubuntu-latest
# target: armv7-unknown-linux-gnueabi
# - os: ubuntu-latest
# target: s390x-unknown-linux-gnu
# - os: ubuntu-latest
# target: powerpc64le-unknown-linux-gnu
# - os: macos-latest
# target: x86_64-apple-darwin
# - os: windows-2025
# target: i686-pc-windows-msvc
# - os: windows-2025
# target: aarch64-pc-windows-msvc
fail-fast: false
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
@@ -57,34 +53,32 @@ jobs:
persist-credentials: false
- uses: dtolnay/rust-toolchain@stable
- uses: cargo-bins/cargo-binstall@main
with:
target: ${{ matrix.target }}
- name: Set up Environment
shell: bash
run: rustup target add ${{ matrix.platform.target }}
- name: Set up MacOS Environment
run: brew install autoconf automake libtool
if: runner.os == 'macOS'
- name: Install macOS dependencies
uses: ./.github/actions/install-macos-deps
with:
autoconf: true
automake: true
libtool: true
- name: Build RustPython
run: cargo build --release --target=${{ matrix.platform.target }} --verbose --features=threading ${{ env.CARGO_ARGS }}
if: runner.os == 'macOS'
- name: Build RustPython
run: cargo build --release --target=${{ matrix.platform.target }} --verbose --features=threading ${{ env.CARGO_ARGS }},jit
if: runner.os != 'macOS'
run: cargo build --release --target=${{ matrix.target }} --verbose --no-default-features --features stdlib,stdio,importlib,encodings,sqlite,host_env,ssl-rustls,threading,jit
- name: Rename Binary
run: cp target/${{ matrix.platform.target }}/release/rustpython target/rustpython-release-${{ runner.os }}-${{ matrix.platform.target }}
run: cp target/${{ matrix.target }}/release/rustpython target/rustpython-release-${{ runner.os }}-${{ matrix.target }}
if: runner.os != 'Windows'
- name: Rename Binary
run: cp target/${{ matrix.platform.target }}/release/rustpython.exe target/rustpython-release-${{ runner.os }}-${{ matrix.platform.target }}.exe
run: cp target/${{ matrix.target }}/release/rustpython.exe target/rustpython-release-${{ runner.os }}-${{ matrix.target }}.exe
if: runner.os == 'Windows'
- name: Upload Binary Artifacts
uses: actions/upload-artifact@v7.0.0
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
with:
name: rustpython-release-${{ runner.os }}-${{ matrix.platform.target }}
path: target/rustpython-release-${{ runner.os }}-${{ matrix.platform.target }}*
name: rustpython-release-${{ runner.os }}-${{ matrix.target }}
path: target/rustpython-release-${{ runner.os }}-${{ matrix.target }}*
build-wasm:
runs-on: ubuntu-latest
@@ -106,16 +100,22 @@ jobs:
run: cp target/wasm32-wasip1/release/rustpython.wasm target/rustpython-release-wasm32-wasip1.wasm
- name: Upload Binary Artifacts
uses: actions/upload-artifact@v7.0.0
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
with:
name: rustpython-release-wasm32-wasip1
path: target/rustpython-release-wasm32-wasip1.wasm
- name: install wasm-pack
run: curl https://rustwasm.github.io/wasm-pack/installer/init.sh -sSf | sh
- uses: actions/setup-node@v6
- uses: mwilliamson/setup-wabt-action@v3
with: { wabt-version: "1.0.30" }
- uses: actions/setup-node@53b83947a5a98c8d113130e565377fae1a50d02f # v6.3.0
with:
package-manager-cache: false
- uses: mwilliamson/setup-wabt-action@febe2a12b7ccb999a6e5d953a8362a3b7ffcf148 # v3.2.0
with:
wabt-version: "1.0.30"
- name: build demo
run: |
npm install
@@ -123,6 +123,7 @@ jobs:
env:
NODE_OPTIONS: "--openssl-legacy-provider"
working-directory: ./wasm/demo
- name: build notebook demo
run: |
npm install
@@ -131,8 +132,10 @@ jobs:
env:
NODE_OPTIONS: "--openssl-legacy-provider"
working-directory: ./wasm/notebook
- name: Deploy demo to Github Pages
uses: peaceiris/actions-gh-pages@v4
if: ${{ github.repository == 'RustPython/RustPython' }}
uses: peaceiris/actions-gh-pages@4f9cc6602d3f66b9c108549d475ec49e8ef4d45e # v4.0.0
with:
deploy_key: ${{ secrets.ACTIONS_DEMO_DEPLOY_KEY }}
publish_dir: ./wasm/demo/dist
@@ -150,26 +153,21 @@ jobs:
persist-credentials: false
- name: Download Binary Artifacts
uses: actions/download-artifact@v8.0.1
uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1
with:
path: bin
pattern: rustpython-*
merge-multiple: true
- name: Create Lib Archive
run: |
zip -r bin/rustpython-lib.zip Lib/
run: zip -r bin/rustpython-lib.zip Lib/
- name: List Binaries
run: |
ls -lah bin/
file bin/*
- name: Create Release
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
tag: ${{ github.ref_name }}
run: ${{ github.run_number }}
PRE_RELEASE_INPUT: ${{ github.event.inputs.pre-release }}
run: |
if [[ "${PRE_RELEASE_INPUT}" == "false" ]]; then
RELEASE_TYPE_NAME=Release
@@ -188,3 +186,8 @@ jobs:
--generate-notes \
$PRERELEASE_ARG \
bin/rustpython-release-*
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
tag: ${{ github.ref_name }}
run: ${{ github.run_number }}
PRE_RELEASE_INPUT: ${{ github.event.inputs.pre-release }}


@@ -1,8 +1,6 @@
name: Update doc DB
permissions:
contents: write
pull-requests: write
permissions: {}
on:
workflow_dispatch:
@@ -22,6 +20,8 @@ defaults:
jobs:
generate:
permissions:
contents: read
runs-on: ${{ matrix.os }}
strategy:
matrix:
@@ -54,17 +54,19 @@ jobs:
merge:
runs-on: ubuntu-latest
needs: generate
permissions:
contents: write
pull-requests: write
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: true
ref: ${{ inputs.base-ref }}
token: ${{ secrets.AUTO_COMMIT_PAT }}
- name: Create update branch
run: git switch -c "update-doc-${PYTHON_VERSION}"
env:
PYTHON_VERSION: ${{ inputs.python-version }}
run: git switch -c "update-doc-${PYTHON_VERSION}"
- name: Download generated doc DBs
uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1
@@ -107,7 +109,7 @@ jobs:
- name: Commit, push and create PR
env:
GH_TOKEN: ${{ secrets.AUTO_COMMIT_PAT }}
GH_TOKEN: ${{ github.token }}
PYTHON_VERSION: ${{ inputs.python-version }}
BASE_REF: ${{ inputs.base-ref }}
run: |


@@ -58,7 +58,7 @@ jobs:
comment_repo: ""
steps:
- name: Setup Scripts
uses: github/gh-aw/actions/setup@08a903b1fb2e493a84a57577778fe5dd711f9468 # v0.58.3
uses: github/gh-aw/actions/setup@48d8fdfddc8cad854ac0c70ceb573f09fb8f9c9b # v0.62.5
with:
destination: /opt/gh-aw/actions
- name: Check workflow file timestamps
@@ -99,7 +99,7 @@ jobs:
secret_verification_result: ${{ steps.validate-secret.outputs.verification_result }}
steps:
- name: Setup Scripts
uses: github/gh-aw/actions/setup@08a903b1fb2e493a84a57577778fe5dd711f9468 # v0.58.3
uses: github/gh-aw/actions/setup@48d8fdfddc8cad854ac0c70ceb573f09fb8f9c9b # v0.62.5
with:
destination: /opt/gh-aw/actions
- name: Checkout repository
@@ -114,7 +114,7 @@ jobs:
run: bash /opt/gh-aw/actions/create_gh_aw_tmp_dir.sh
# Cache configuration from frontmatter processed below
- name: Cache (cpython-lib-${{ env.PYTHON_VERSION }})
uses: actions/cache@cdf6c1fa76f9f475f3d7449005a359c84ca0f306 # v5.0.3
uses: actions/cache@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
key: cpython-lib-${{ env.PYTHON_VERSION }}
path: cpython
@@ -804,7 +804,7 @@ jobs:
total_count: ${{ steps.missing_tool.outputs.total_count }}
steps:
- name: Setup Scripts
uses: github/gh-aw/actions/setup@08a903b1fb2e493a84a57577778fe5dd711f9468 # v0.58.3
uses: github/gh-aw/actions/setup@48d8fdfddc8cad854ac0c70ceb573f09fb8f9c9b # v0.62.5
with:
destination: /opt/gh-aw/actions
- name: Download agent output artifact
@@ -925,7 +925,7 @@ jobs:
success: ${{ steps.parse_results.outputs.success }}
steps:
- name: Setup Scripts
uses: github/gh-aw/actions/setup@08a903b1fb2e493a84a57577778fe5dd711f9468 # v0.58.3
uses: github/gh-aw/actions/setup@48d8fdfddc8cad854ac0c70ceb573f09fb8f9c9b # v0.62.5
with:
destination: /opt/gh-aw/actions
- name: Download agent artifacts
@@ -1037,7 +1037,7 @@ jobs:
process_safe_outputs_temporary_id_map: ${{ steps.process_safe_outputs.outputs.temporary_id_map }}
steps:
- name: Setup Scripts
uses: github/gh-aw/actions/setup@08a903b1fb2e493a84a57577778fe5dd711f9468 # v0.58.3
uses: github/gh-aw/actions/setup@48d8fdfddc8cad854ac0c70ceb573f09fb8f9c9b # v0.62.5
with:
destination: /opt/gh-aw/actions
- name: Download agent output artifact

14
.github/zizmor.yml vendored Normal file

@@ -0,0 +1,14 @@
rules:
unpinned-uses:
config:
policies:
# dtolnay/rust-toolchain is a trusted action that uses lightweight branch
# refs (@stable, @nightly, etc.) by design. Pinning to a hash would break
# the intended usage pattern.
# We can remove this once https://github.com/dtolnay/rust-toolchain/issues/180 is resolved
dtolnay/rust-toolchain: any
# dtolnay/rust-toolchain handles component installation, target addition, and
# override configuration beyond what a bare `rustup` invocation provides.
# See: https://github.com/zizmorcore/zizmor/issues/1817
superfluous-actions:
disable: true

71
.pre-commit-config.yaml Normal file

@@ -0,0 +1,71 @@
# NOTE: Reason for not using `prek.toml` is dependabot supports `pre-commit` as an ecosystem
# See: https://github.blog/changelog/2026-03-10-dependabot-now-supports-pre-commit-hooks/
fail_fast: false
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v6.0.0
hooks:
- id: check-merge-conflict
priority: 0
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.15.7
hooks:
- id: ruff-format
priority: 0
- id: ruff-check
args: [--select, I, --fix, --exit-non-zero-on-fix]
types_or: [python]
require_serial: true
priority: 1
- repo: local
hooks:
- id: redundant-test-patches
name: check redundant test patches
entry: scripts/check_redundant_patches.py
files: '^Lib/test/.*\.py$'
language: script
types: [python]
priority: 0
- repo: local
hooks:
- id: rustfmt
name: rustfmt
entry: rustfmt
language: system
types: [rust]
priority: 0
- id: generate-opcode-metadata
name: generate opcode metadata
entry: python scripts/generate_opcode_metadata.py
files: '^(crates/compiler-core/src/bytecode/instruction\.rs|scripts/generate_opcode_metadata\.py)$'
pass_filenames: false
language: system
require_serial: true
priority: 1 # so rustfmt runs first
- repo: https://github.com/streetsidesoftware/cspell-cli
rev: v9.7.0
hooks:
- id: cspell
types: [rust]
additional_dependencies:
- '@cspell/dict-en_us'
- '@cspell/dict-cpp'
- '@cspell/dict-python'
- '@cspell/dict-rust'
- '@cspell/dict-win32'
- '@cspell/dict-shell'
priority: 0
- repo: https://github.com/rbubley/mirrors-prettier
rev: v3.8.1
hooks:
- id: prettier
files: '^wasm/.*$'
priority: 0

676
Cargo.lock generated

File diff suppressed because it is too large


@@ -153,15 +153,23 @@ rustpython-vm = { path = "crates/vm", default-features = false, version = "0.5.0
rustpython-pylib = { path = "crates/pylib", version = "0.5.0" }
rustpython-stdlib = { path = "crates/stdlib", default-features = false, version = "0.5.0" }
rustpython-sre_engine = { path = "crates/sre_engine", version = "0.5.0" }
rustpython-unicode = { path = "crates/unicode", default-features = false, version = "0.5.0" }
rustpython-wtf8 = { path = "crates/wtf8", version = "0.5.0" }
rustpython-doc = { path = "crates/doc", version = "0.5.0" }
# Ruff tag 0.15.6 is based on commit e4c7f357777a2fdd34dbe6a98b1b7d3e7488f675
# Use RustPython-packaged Ruff crates from the published fork while keeping
# existing crate names in the codebase.
ruff_python_parser = { package = "rustpython-ruff_python_parser", version = "0.15.8" }
ruff_python_ast = { package = "rustpython-ruff_python_ast", version = "0.15.8" }
ruff_text_size = { package = "rustpython-ruff_text_size", version = "0.15.8" }
ruff_source_file = { package = "rustpython-ruff_source_file", version = "0.15.8" }
# To update ruff crates, comment out the above lines and uncomment the following lines to pull directly from the Ruff repository at the specified commit hash.
# Ruff tag 0.15.8 is based on commit c2a8815842f9dc5d24ec19385eae0f1a7188b0d9
# at the time of this capture. We use the commit hash to ensure reproducible builds.
ruff_python_parser = { git = "https://github.com/astral-sh/ruff.git", rev = "e4c7f357777a2fdd34dbe6a98b1b7d3e7488f675" }
ruff_python_ast = { git = "https://github.com/astral-sh/ruff.git", rev = "e4c7f357777a2fdd34dbe6a98b1b7d3e7488f675" }
ruff_text_size = { git = "https://github.com/astral-sh/ruff.git", rev = "e4c7f357777a2fdd34dbe6a98b1b7d3e7488f675" }
ruff_source_file = { git = "https://github.com/astral-sh/ruff.git", rev = "e4c7f357777a2fdd34dbe6a98b1b7d3e7488f675" }
# ruff_python_parser = { git = "https://github.com/astral-sh/ruff.git", rev = "c2a8815842f9dc5d24ec19385eae0f1a7188b0d9" }
# ruff_python_ast = { git = "https://github.com/astral-sh/ruff.git", rev = "c2a8815842f9dc5d24ec19385eae0f1a7188b0d9" }
# ruff_text_size = { git = "https://github.com/astral-sh/ruff.git", rev = "c2a8815842f9dc5d24ec19385eae0f1a7188b0d9" }
# ruff_source_file = { git = "https://github.com/astral-sh/ruff.git", rev = "c2a8815842f9dc5d24ec19385eae0f1a7188b0d9" }
phf = { version = "0.13.1", default-features = false, features = ["macros"]}
ahash = "0.8.12"
@@ -207,23 +215,19 @@ rand_core = { version = "0.9", features = ["os_rng"] }
rustix = { version = "1.1", features = ["event"] }
rustyline = "17.0.1"
serde = { package = "serde_core", version = "1.0.225", default-features = false, features = ["alloc"] }
schannel = "0.1.28"
schannel = "0.1.29"
scoped-tls = "1"
scopeguard = "1"
static_assertions = "1.1"
strum = "0.27"
strum = "0.28"
strum_macros = "0.28"
syn = "2"
thiserror = "2.0"
icu_properties = "2"
icu_normalizer = "2"
unicode-casing = "0.1.1"
unic-char-property = "0.9.0"
unic-normal = "0.9.0"
unic-ucd-age = "0.9.0"
unic-ucd-bidi = "0.9.0"
unic-ucd-category = "0.9.0"
unic-ucd-ident = "0.9.0"
unicode_names2 = "2.0.0"
unicode-bidi-mirroring = "0.4"
widestring = "1.2.0"
windows-sys = "0.61.2"
wasm-bindgen = "0.2.106"


@@ -163,7 +163,6 @@ class OSEINTRTest(EINTRBaseTest):
self.assertEqual(os.readinto(fd, buffer), len(expected))
self.assertEqual(buffer, expected)
@unittest.expectedFailure # TODO: RUSTPYTHON; InterruptedError: [Errno 4] Interrupted system call
def test_write(self):
rd, wr = os.pipe()
self.addCleanup(os.close, wr)


@@ -4813,9 +4813,9 @@ class _TestFinalize(BaseTestCase):
result = [obj for obj in iter(conn.recv, 'STOP')]
self.assertEqual(result, ['a', 'b', 'd10', 'd03', 'd02', 'd01', 'e'])
# TODO: RUSTPYTHON; SIGSEGV due to dict thread-safety issue under aggressive GC
@unittest.skip("TODO: RUSTPYTHON")
@support.requires_resource('cpu')
# TODO: RUSTPYTHON; dict iteration races with concurrent GC mutations
@unittest.expectedFailure
def test_thread_safety(self):
# bpo-24484: _run_finalizers() should be thread-safe
def cb():


@@ -475,8 +475,6 @@ class CmdLineTest(unittest.TestCase):
self.assertRegex(err.decode('ascii', 'ignore'), 'SyntaxError')
self.assertEqual(b'', out)
# TODO: RUSTPYTHON
@unittest.expectedFailure
def test_stdout_flush_at_shutdown(self):
# Issue #5319: if stdout.flush() fails at shutdown, an error should
# be printed out.


@@ -2486,7 +2486,6 @@ class TestSourcePositions(unittest.TestCase):
class TestStaticAttributes(unittest.TestCase):
@unittest.expectedFailure # TODO: RUSTPYTHON; AttributeError: type object 'C' has no attribute '__static_attributes__'
def test_basic(self):
class C:
def f(self):
@@ -2518,7 +2517,6 @@ class TestStaticAttributes(unittest.TestCase):
self.assertEqual(sorted(C.__static_attributes__), ['u', 'v', 'x', 'y', 'z'])
@unittest.expectedFailure # TODO: RUSTPYTHON; AttributeError: type object 'C' has no attribute '__static_attributes__'
def test_nested_class(self):
class C:
def f(self):
@@ -2533,7 +2531,6 @@ class TestStaticAttributes(unittest.TestCase):
self.assertEqual(sorted(C.__static_attributes__), ['x', 'y'])
self.assertEqual(sorted(C.D.__static_attributes__), ['y', 'z'])
@unittest.expectedFailure # TODO: RUSTPYTHON; AttributeError: type object 'C' has no attribute '__static_attributes__'
def test_subclass(self):
class C:
def f(self):
@@ -2593,7 +2590,6 @@ class TestExpressionStackSize(unittest.TestCase):
def test_set(self):
self.check_stack_size("{" + "x, " * self.N + "x}")
@unittest.expectedFailure # TODO: RUSTPYTHON; AssertionError: 202 not less than or equal to 7
def test_dict(self):
self.check_stack_size("{" + "x:x, " * self.N + "x:x}")


@@ -99,8 +99,6 @@ class ContextManagerTestCase(unittest.TestCase):
raise ZeroDivisionError()
self.assertEqual(state, [1, 42, 999])
# TODO: RUSTPYTHON
@unittest.expectedFailure
def test_contextmanager_traceback(self):
@contextmanager
def f():


@@ -252,7 +252,6 @@ class AsyncContextManagerTestCase(unittest.IsolatedAsyncioTestCase):
raise ZeroDivisionError(999)
self.assertEqual(state, [1, 42, 999])
@unittest.expectedFailure # TODO: RUSTPYTHON
async def test_contextmanager_except_stopiter(self):
@asynccontextmanager
async def woohoo():


@@ -4987,7 +4987,6 @@ class ClassPropertiesAndMethods(unittest.TestCase):
self.assertEqual(Y.__qualname__, 'Y')
self.assertEqual(Y.Inside.__qualname__, 'Y.Inside')
@unittest.expectedFailure # TODO: RUSTPYTHON
def test_qualname_dict(self):
ns = {'__qualname__': 'some.name'}
tp = type('Foo', (), ns)
@@ -5130,7 +5129,6 @@ class ClassPropertiesAndMethods(unittest.TestCase):
gc.collect()
self.assertEqual(Parent.__subclasses__(), [])
@unittest.expectedFailure # TODO: RUSTPYTHON
def test_instance_method_get_behavior(self):
# test case for gh-113157
@@ -5180,7 +5178,6 @@ class DictProxyTests(unittest.TestCase):
pass
self.C = C
@unittest.expectedFailure # TODO: RUSTPYTHON
@unittest.skipIf(hasattr(sys, 'gettrace') and sys.gettrace(),
'trace function introduces __local__')
def test_iter_keys(self):
@@ -5194,7 +5191,6 @@ class DictProxyTests(unittest.TestCase):
'__static_attributes__', '__weakref__',
'meth'])
@unittest.expectedFailure # TODO: RUSTPYTHON; AssertionError: 5 != 7
@unittest.skipIf(hasattr(sys, 'gettrace') and sys.gettrace(),
'trace function introduces __local__')
def test_iter_values(self):
@@ -5204,7 +5200,6 @@ class DictProxyTests(unittest.TestCase):
values = list(it)
self.assertEqual(len(values), 7)
@unittest.expectedFailure # TODO: RUSTPYTHON
@unittest.skipIf(hasattr(sys, 'gettrace') and sys.gettrace(),
'trace function introduces __local__')
def test_iter_items(self):


@@ -1134,7 +1134,6 @@ class DisTests(DisTestBase):
# Test that value is displayed for keyword argument names:
self.do_disassembly_test(wrap_func_w_kwargs, dis_kw_names)
@unittest.expectedFailure # TODO: RUSTPYTHON
def test_intrinsic_1(self):
# Test that argrepr is displayed for CALL_INTRINSIC_1
self.do_disassembly_test("from math import *", dis_intrinsic_1_2)


@@ -2519,7 +2519,6 @@ class SyntaxErrorTests(unittest.TestCase):
self.assertEqual(error, the_exception.text)
self.assertEqual("bad bad", the_exception.msg)
@unittest.expectedFailure # TODO: RUSTPYTHON
def test_incorrect_constructor(self):
args = ("bad.py", 1, 2)
self.assertRaises(TypeError, SyntaxError, "bad bad", args)

13
Lib/test/test_file.py vendored

@@ -126,7 +126,7 @@ class AutoFileTests:
# it must also return None if an exception was given
try:
1/0
except:
except ZeroDivisionError:
self.assertEqual(self.f.__exit__(*sys.exc_info()), None)
def testReadWhenWriting(self):
@@ -216,6 +216,16 @@ class OtherFileTests:
with self.assertWarnsRegex(RuntimeWarning, 'line buffering'):
self._checkBufferSize(1)
def testDefaultBufferSize(self):
with self.open(TESTFN, 'wb') as f:
blksize = f.raw._blksize
f.write(b"\0" * 5_000_000)
with self.open(TESTFN, 'rb') as f:
data = f.read1()
expected_size = max(min(blksize, 8192 * 1024), io.DEFAULT_BUFFER_SIZE)
self.assertEqual(len(data), expected_size)
def testTruncateOnWindows(self):
# SF bug <https://bugs.python.org/issue801631>
# "file.truncate fault on windows"
@@ -344,7 +354,6 @@ class OtherFileTests:
class COtherFileTests(OtherFileTests, unittest.TestCase):
open = io.open
class PyOtherFileTests(OtherFileTests, unittest.TestCase):
open = staticmethod(pyio.open)
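
The new `testDefaultBufferSize` above computes the length it expects from `read1()` as `max(min(blksize, 8192 * 1024), io.DEFAULT_BUFFER_SIZE)`. The sketch below isolates that expression so the clamping is easier to see. `expected_read1_size` and `MAX_BLKSIZE` are hypothetical names, and the 8 MiB cap is taken directly from the test rather than from any documented guarantee.

```python
import io

MAX_BLKSIZE = 8192 * 1024  # cap that appears in the test's expression


def expected_read1_size(blksize: int) -> int:
    """Cap blksize at MAX_BLKSIZE, then raise the result to at least io.DEFAULT_BUFFER_SIZE."""
    return max(min(blksize, MAX_BLKSIZE), io.DEFAULT_BUFFER_SIZE)


for blksize in (1, 4096, 1 << 20, 1 << 30):
    print(blksize, expected_read1_size(blksize))
```

The printed values depend on `io.DEFAULT_BUFFER_SIZE`, which varies between Python versions, so none are listed here.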


@@ -385,7 +385,6 @@ class PluralFormsTests:
x = ngettext(singular, plural, None)
self.assertEqual(x, tplural)
@unittest.expectedFailure # TODO: RUSTPYTHON
def test_plural_forms(self):
self._test_plural_forms(
self.ngettext, self.gettext,
@@ -396,7 +395,6 @@ class PluralFormsTests:
'%d file deleted', '%d files deleted',
'%d file deleted', '%d files deleted')
@unittest.expectedFailure # TODO: RUSTPYTHON
def test_plural_context_forms(self):
ngettext = partial(self.npgettext, 'With context')
gettext = partial(self.pgettext, 'With context')
@@ -409,7 +407,6 @@ class PluralFormsTests:
'%d file deleted', '%d files deleted',
'%d file deleted', '%d files deleted')
@unittest.expectedFailure # TODO: RUSTPYTHON
def test_plural_wrong_context_forms(self):
self._test_plural_forms(
partial(self.npgettext, 'Unknown context'),
@@ -442,7 +439,6 @@ class GNUTranslationsWithDomainPluralFormsTestCase(PluralFormsTests, GettextBase
self.pgettext = partial(gettext.dpgettext, 'gettext')
self.npgettext = partial(gettext.dnpgettext, 'gettext')
@unittest.expectedFailure # TODO: RUSTPYTHON
def test_plural_forms_wrong_domain(self):
self._test_plural_forms(
partial(gettext.dngettext, 'unknown'),
@@ -451,7 +447,6 @@ class GNUTranslationsWithDomainPluralFormsTestCase(PluralFormsTests, GettextBase
'There is %s file', 'There are %s files',
numbers_only=False)
@unittest.expectedFailure # TODO: RUSTPYTHON
def test_plural_context_forms_wrong_domain(self):
self._test_plural_forms(
partial(gettext.dnpgettext, 'unknown', 'With context'),
@@ -472,7 +467,6 @@ class GNUTranslationsClassPluralFormsTestCase(PluralFormsTests, GettextBaseTest)
self.pgettext = t.pgettext
self.npgettext = t.npgettext
@unittest.expectedFailure # TODO: RUSTPYTHON
def test_plural_forms_null_translations(self):
t = gettext.NullTranslations()
self._test_plural_forms(
@@ -481,7 +475,6 @@ class GNUTranslationsClassPluralFormsTestCase(PluralFormsTests, GettextBaseTest)
'There is %s file', 'There are %s files',
numbers_only=False)
@unittest.expectedFailure # TODO: RUSTPYTHON
def test_plural_context_forms_null_translations(self):
t = gettext.NullTranslations()
self._test_plural_forms(


@@ -961,7 +961,6 @@ class TestGettingSourceOfToplevelFrames(GetSourceBase):
class TestDecorators(GetSourceBase):
fodderModule = mod2
@unittest.expectedFailure # TODO: RUSTPYTHON; pass
def test_wrapped_decorator(self):
self.assertSourceEqual(mod2.wrapped, 14, 17)
@@ -1259,7 +1258,6 @@ class TestNoEOL(GetSourceBase):
class TestComplexDecorator(GetSourceBase):
fodderModule = mod2
@unittest.expectedFailure # TODO: RUSTPYTHON; return foo + bar()
def test_parens_in_decorator(self):
self.assertSourceEqual(self.fodderModule.complex_decorated, 273, 275)


@@ -2,13 +2,12 @@
"""
import os
import stat
import sys
import unittest
import socket
import shutil
import threading
from test.support import requires, bigmemtest
from test.support import requires, bigmemtest, requires_resource
from test.support import SHORT_TIMEOUT
from test.support import socket_helper
from test.support.os_helper import TESTFN, unlink
@@ -29,7 +28,7 @@ class LargeFileTest:
mode = 'w+b'
with self.open(TESTFN, mode) as f:
current_size = os.fstat(f.fileno())[stat.ST_SIZE]
current_size = os.fstat(f.fileno()).st_size
if current_size == size+1:
return
@@ -40,13 +39,13 @@ class LargeFileTest:
f.seek(size)
f.write(b'a')
f.flush()
self.assertEqual(os.fstat(f.fileno())[stat.ST_SIZE], size+1)
self.assertEqual(os.fstat(f.fileno()).st_size, size+1)
@classmethod
def tearDownClass(cls):
with cls.open(TESTFN, 'wb'):
pass
if not os.stat(TESTFN)[stat.ST_SIZE] == 0:
if not os.stat(TESTFN).st_size == 0:
raise cls.failureException('File was not truncated by opening '
'with mode "wb"')
unlink(TESTFN2)
@@ -67,7 +66,7 @@ class TestFileMethods(LargeFileTest):
self.assertEqual(f.tell(), size + 1)
def test_osstat(self):
self.assertEqual(os.stat(TESTFN)[stat.ST_SIZE], size+1)
self.assertEqual(os.stat(TESTFN).st_size, size+1)
def test_seek_read(self):
with self.open(TESTFN, 'rb') as f:
@@ -142,6 +141,9 @@ class TestFileMethods(LargeFileTest):
f.truncate(1)
self.assertEqual(f.tell(), 0) # else pointer moved
f.seek(0)
# Verify readall on a truncated file is well behaved. read()
# without a size can be unbounded, this should get just the byte
# that remains.
self.assertEqual(len(f.read()), 1) # else wasn't truncated
def test_seekable(self):
@@ -152,6 +154,22 @@ class TestFileMethods(LargeFileTest):
f.seek(pos)
self.assertTrue(f.seekable())
@bigmemtest(size=size, memuse=2, dry_run=False)
def test_seek_readall(self, _size):
# Seek which doesn't change position should readall successfully.
with self.open(TESTFN, 'rb') as f:
self.assertEqual(f.seek(0, os.SEEK_CUR), 0)
self.assertEqual(len(f.read()), size + 1)
# Seek which changes (or might change) position should readall
# successfully.
with self.open(TESTFN, 'rb') as f:
self.assertEqual(f.seek(20, os.SEEK_SET), 20)
self.assertEqual(len(f.read()), size - 19)
with self.open(TESTFN, 'rb') as f:
self.assertEqual(f.seek(-3, os.SEEK_END), size - 2)
self.assertEqual(len(f.read()), 3)
def skip_no_disk_space(path, required):
def decorator(fun):
@@ -173,6 +191,7 @@ class TestCopyfile(LargeFileTest, unittest.TestCase):
# Exact required disk space would be (size * 2), but let's give it a
# bit more tolerance.
@skip_no_disk_space(TESTFN, size * 2.5)
@requires_resource('cpu')
def test_it(self):
# Internally shutil.copyfile() can use "fast copy" methods like
# os.sendfile().
@@ -222,6 +241,7 @@ class TestSocketSendfile(LargeFileTest, unittest.TestCase):
# Exact required disk space would be (size * 2), but let's give it a
# bit more tolerance.
@skip_no_disk_space(TESTFN, size * 2.5)
@requires_resource('cpu')
def test_it(self):
port = socket_helper.find_unused_port()
with socket.create_server(("", port)) as sock:
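
Note on the hunks above: they replace tuple-style indexing (`os.stat(...)[stat.ST_SIZE]`) with the `st_size` attribute and add `test_seek_readall`. The sketch below reproduces the same arithmetic on a small temporary file. The value `size = 100` is an assumption for illustration; the real test uses a much larger file.

```python
import os
import stat
import tempfile

size = 100  # small stand-in for the large size used by the real test

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    # Seek past the end and write one byte: the file becomes size + 1 bytes.
    with open(path, 'w+b') as f:
        f.seek(size)
        f.write(b'a')

    st = os.stat(path)
    by_index = st[stat.ST_SIZE]  # old spelling, removed by the diff
    by_attr = st.st_size         # new spelling, added by the diff

    # After seeking to offset 20, read() returns the remaining size - 19 bytes.
    with open(path, 'rb') as f:
        pos_set = f.seek(20, os.SEEK_SET)
        rest_after_set = len(f.read())

    # Seeking 3 bytes back from the end lands on offset size - 2.
    with open(path, 'rb') as f:
        pos_end = f.seek(-3, os.SEEK_END)
        rest_after_end = len(f.read())
finally:
    os.unlink(path)

print(by_index, by_attr, pos_set, rest_after_set, pos_end, rest_after_end)
```

Both spellings read the same field, so the change is cosmetic apart from dropping the `import stat` dependency.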


@@ -49,7 +49,6 @@ class IntTestCase(unittest.TestCase, HelperMixin):
self.helper(expected)
n = n >> 1
@unittest.expectedFailure # TODO: RUSTPYTHON
def test_int64(self):
# Simulate int marshaling with TYPE_INT64.
maxint64 = (1 << 63) - 1
@@ -141,7 +140,6 @@ class CodeTestCase(unittest.TestCase):
self.assertEqual(co1.co_filename, "f1")
self.assertEqual(co2.co_filename, "f2")
@unittest.expectedFailure # TODO: RUSTPYTHON; TypeError: Unexpected keyword argument allow_code
def test_no_allow_code(self):
data = {'a': [({0},)]}
dump = marshal.dumps(data, allow_code=False)
@@ -234,14 +232,12 @@ class BufferTestCase(unittest.TestCase, HelperMixin):
new = marshal.loads(marshal.dumps(b))
self.assertEqual(type(new), bytes)
@unittest.expectedFailure # TODO: RUSTPYTHON
def test_memoryview(self):
b = memoryview(b"abc")
self.helper(b)
new = marshal.loads(marshal.dumps(b))
self.assertEqual(type(new), bytes)
@unittest.expectedFailure # TODO: RUSTPYTHON
def test_array(self):
a = array.array('B', b"abc")
new = marshal.loads(marshal.dumps(a))
@@ -274,7 +270,6 @@ class BugsTestCase(unittest.TestCase):
except Exception:
pass
@unittest.expectedFailure # TODO: RUSTPYTHON
def test_loads_recursion(self):
def run_tests(N, check):
# (((...None...),),)
@@ -295,7 +290,7 @@ class BugsTestCase(unittest.TestCase):
run_tests(2**20, check)
@unittest.skipIf(support.is_android, "TODO: RUSTPYTHON; segfault")
@unittest.expectedFailure # TODO: RUSTPYTHON; segfault
@unittest.skipIf(os.name == 'nt', "TODO: RUSTPYTHON; write depth limit is 2000 not 1000")
def test_recursion_limit(self):
# Create a deeply nested structure.
head = last = []
@@ -324,7 +319,6 @@ class BugsTestCase(unittest.TestCase):
last.append([0])
self.assertRaises(ValueError, marshal.dumps, head)
@unittest.expectedFailure # TODO: RUSTPYTHON
def test_exact_type_match(self):
# Former bug:
# >>> class Int(int): pass
@@ -348,7 +342,6 @@ class BugsTestCase(unittest.TestCase):
invalid_string = b'l\x02\x00\x00\x00\x00\x00\x00\x00'
self.assertRaises(ValueError, marshal.loads, invalid_string)
@unittest.expectedFailure # TODO: RUSTPYTHON
def test_multiple_dumps_and_loads(self):
# Issue 12291: marshal.load() should be callable multiple times
# with interleaved data written by non-marshal code
@@ -532,66 +525,56 @@ class InstancingTestCase(unittest.TestCase, HelperMixin):
else:
self.assertGreaterEqual(len(s2), len(s3))
@unittest.expectedFailure # TODO: RUSTPYTHON
def testInt(self):
intobj = 123321
self.helper(intobj)
self.helper3(intobj, simple=True)
@unittest.expectedFailure # TODO: RUSTPYTHON
def testFloat(self):
floatobj = 1.2345
self.helper(floatobj)
self.helper3(floatobj)
@unittest.expectedFailure # TODO: RUSTPYTHON
def testStr(self):
strobj = "abcde"*3
self.helper(strobj)
self.helper3(strobj)
@unittest.expectedFailure # TODO: RUSTPYTHON
def testBytes(self):
bytesobj = b"abcde"*3
self.helper(bytesobj)
self.helper3(bytesobj)
@unittest.expectedFailure # TODO: RUSTPYTHON
def testList(self):
for obj in self.keys:
listobj = [obj, obj]
self.helper(listobj)
self.helper3(listobj)
@unittest.expectedFailure # TODO: RUSTPYTHON
def testTuple(self):
for obj in self.keys:
tupleobj = (obj, obj)
self.helper(tupleobj)
self.helper3(tupleobj)
@unittest.expectedFailure # TODO: RUSTPYTHON
def testSet(self):
for obj in self.keys:
setobj = {(obj, 1), (obj, 2)}
self.helper(setobj)
self.helper3(setobj)
@unittest.expectedFailure # TODO: RUSTPYTHON
def testFrozenSet(self):
for obj in self.keys:
frozensetobj = frozenset({(obj, 1), (obj, 2)})
self.helper(frozensetobj)
self.helper3(frozensetobj)
@unittest.expectedFailure # TODO: RUSTPYTHON
def testDict(self):
for obj in self.keys:
dictobj = {"hello": obj, "goodbye": obj, obj: "hello"}
self.helper(dictobj)
self.helper3(dictobj)
@unittest.expectedFailure # TODO: RUSTPYTHON
def testModule(self):
with open(__file__, "rb") as f:
code = f.read()
@@ -651,7 +634,6 @@ class InterningTestCase(unittest.TestCase, HelperMixin):
self.assertNotEqual(id(s2), id(s))
class SliceTestCase(unittest.TestCase, HelperMixin):
@unittest.expectedFailure # TODO: RUSTPYTHON; NotImplementedError: TODO: not implemented yet or marshal unsupported type
def test_slice(self):
for obj in (
slice(None), slice(1), slice(1, 2), slice(1, 2, 3),
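
Note on the hunks above: most of the un-marked tests are round-trip checks. The sketch below shows the round-trip behaviour those tests expect, as observed on CPython; `roundtrip` is a hypothetical helper, and `allow_code` is left out because it only exists in newer CPython releases.

```python
import array
import marshal


def roundtrip(obj):
    # Hypothetical helper: serialize with marshal and read the value back.
    return marshal.loads(marshal.dumps(obj))


maxint64 = (1 << 63) - 1
nested = {'a': [({0},)]}

# Integers and nested containers come back equal to the original.
print(roundtrip(maxint64) == maxint64)
print(roundtrip(nested) == nested)

# Buffer-like objects are written as raw bytes, so they load as bytes.
print(type(roundtrip(memoryview(b"abc"))))
print(type(roundtrip(array.array('B', b"abc"))))
```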


@@ -867,7 +867,6 @@ class MmapTests(unittest.TestCase):
finally:
f.close()
@unittest.expectedFailure # TODO: RUSTPYTHON
@unittest.skipUnless(os.name == 'nt', 'requires Windows')
def test_resize_succeeds_with_error_for_second_named_mapping(self):
"""If a more than one mapping exists of the same name, none of them can


@@ -1,4 +1,5 @@
# Test the module type
import importlib.machinery
import unittest
import weakref
from test.support import gc_collect
@@ -29,7 +30,7 @@ class ModuleTests(unittest.TestCase):
self.fail("__name__ = %s" % repr(s))
except AttributeError:
pass
self.assertEqual(foo.__doc__, ModuleType.__doc__)
self.assertEqual(foo.__doc__, ModuleType.__doc__ or '')
def test_uninitialized_missing_getattr(self):
# Issue 8297
@@ -102,8 +103,7 @@ class ModuleTests(unittest.TestCase):
gc_collect()
self.assertEqual(f().__dict__["bar"], 4)
# TODO: RUSTPYTHON
@unittest.expectedFailure
@unittest.expectedFailure # TODO: RUSTPYTHON
def test_clear_dict_in_ref_cycle(self):
destroyed = []
m = ModuleType("foo")
@@ -152,15 +152,13 @@ a = A(destroyed)"""
if 'test.test_module.bad_getattr2' in sys.modules:
del sys.modules['test.test_module.bad_getattr2']
# TODO: RUSTPYTHON
@unittest.expectedFailure
@unittest.expectedFailure # TODO: RUSTPYTHON
def test_module_dir(self):
import test.test_module.good_getattr as gga
self.assertEqual(dir(gga), ['a', 'b', 'c'])
del sys.modules['test.test_module.good_getattr']
# TODO: RUSTPYTHON
@unittest.expectedFailure
@unittest.expectedFailure # TODO: RUSTPYTHON
def test_module_dir_errors(self):
import test.test_module.bad_getattr as bga
from test.test_module import bad_getattr2
@@ -270,11 +268,38 @@ a = A(destroyed)"""
self.assertEqual(r[-len(ends_with):], ends_with,
'{!r} does not end with {!r}'.format(r, ends_with))
# TODO: RUSTPYTHON
@unittest.expectedFailure
def test_module_repr_with_namespace_package(self):
m = ModuleType('foo')
loader = importlib.machinery.NamespaceLoader('foo', ['bar'], 'baz')
spec = importlib.machinery.ModuleSpec('foo', loader)
m.__loader__ = loader
m.__spec__ = spec
self.assertEqual(repr(m), "<module 'foo' (namespace) from ['bar']>")
def test_module_repr_with_namespace_package_and_custom_loader(self):
m = ModuleType('foo')
loader = BareLoader()
spec = importlib.machinery.ModuleSpec('foo', loader)
m.__loader__ = loader
m.__spec__ = spec
expected_repr_pattern = r"<module 'foo' \(<.*\.BareLoader object at .+>\)>"
self.assertRegex(repr(m), expected_repr_pattern)
self.assertNotIn('from', repr(m))
def test_module_repr_with_fake_namespace_package(self):
m = ModuleType('foo')
loader = BareLoader()
loader._path = ['spam']
spec = importlib.machinery.ModuleSpec('foo', loader)
m.__loader__ = loader
m.__spec__ = spec
expected_repr_pattern = r"<module 'foo' \(<.*\.BareLoader object at .+>\)>"
self.assertRegex(repr(m), expected_repr_pattern)
self.assertNotIn('from', repr(m))
def test_module_finalization_at_shutdown(self):
# Module globals and builtins should still be available during shutdown
rc, out, err = assert_python_ok("-c", "from test import final_a")
rc, out, err = assert_python_ok("-c", "from test.test_module import final_a")
self.assertFalse(err)
lines = out.splitlines()
self.assertEqual(set(lines), {

Lib/test/test_module/final_a.py (new vendored file, 19 lines added)

@@ -0,0 +1,19 @@
"""
Fodder for module finalization tests in test_module.
"""
import shutil
import test.test_module.final_b
x = 'a'
class C:
def __del__(self):
# Inspect module globals and builtins
print("x =", x)
print("final_b.x =", test.test_module.final_b.x)
print("shutil.rmtree =", getattr(shutil.rmtree, '__name__', None))
print("len =", getattr(len, '__name__', None))
c = C()
_underscored = C()

Lib/test/test_module/final_b.py (new vendored file, 19 lines added)

@@ -0,0 +1,19 @@
"""
Fodder for module finalization tests in test_module.
"""
import shutil
import test.test_module.final_a
x = 'b'
class C:
def __del__(self):
# Inspect module globals and builtins
print("x =", x)
print("final_a.x =", test.test_module.final_a.x)
print("shutil.rmtree =", getattr(shutil.rmtree, '__name__', None))
print("len =", getattr(len, '__name__', None))
c = C()
_underscored = C()


@@ -1261,7 +1261,6 @@ class TestLineAndInstructionEvents(CheckEvents):
('instruction', 'func2', 46),
('line', 'get_events', 11)])
@unittest.expectedFailure # TODO: RUSTPYTHON; - instruction offsets differ from CPython
def test_try_except(self):
def func3():

File diff suppressed because it is too large

Lib/test/test_optimizer.py (new vendored file, 90 lines added)

@@ -0,0 +1,90 @@
import unittest
import types
from test.support import import_helper
_testinternalcapi = import_helper.import_module("_testinternalcapi")
class TestRareEventCounters(unittest.TestCase):
def setUp(self):
_testinternalcapi.reset_rare_event_counters()
def test_set_class(self):
class A:
pass
class B:
pass
a = A()
orig_counter = _testinternalcapi.get_rare_event_counters()["set_class"]
a.__class__ = B
self.assertEqual(
orig_counter + 1,
_testinternalcapi.get_rare_event_counters()["set_class"]
)
def test_set_bases(self):
class A:
pass
class B:
pass
class C(B):
pass
orig_counter = _testinternalcapi.get_rare_event_counters()["set_bases"]
C.__bases__ = (A,)
self.assertEqual(
orig_counter + 1,
_testinternalcapi.get_rare_event_counters()["set_bases"]
)
def test_set_eval_frame_func(self):
orig_counter = _testinternalcapi.get_rare_event_counters()["set_eval_frame_func"]
_testinternalcapi.set_eval_frame_record([])
self.assertEqual(
orig_counter + 1,
_testinternalcapi.get_rare_event_counters()["set_eval_frame_func"]
)
_testinternalcapi.set_eval_frame_default()
def test_builtin_dict(self):
orig_counter = _testinternalcapi.get_rare_event_counters()["builtin_dict"]
if isinstance(__builtins__, types.ModuleType):
builtins = __builtins__.__dict__
else:
builtins = __builtins__
builtins["FOO"] = 42
self.assertEqual(
orig_counter + 1,
_testinternalcapi.get_rare_event_counters()["builtin_dict"]
)
del builtins["FOO"]
def test_func_modification(self):
def func(x=0):
pass
for attribute in (
"__code__",
"__defaults__",
"__kwdefaults__"
):
orig_counter = _testinternalcapi.get_rare_event_counters()["func_modification"]
setattr(func, attribute, getattr(func, attribute))
self.assertEqual(
orig_counter + 1,
_testinternalcapi.get_rare_event_counters()["func_modification"]
)
class TestOptimizerSymbols(unittest.TestCase):
@unittest.skipUnless(hasattr(_testinternalcapi, "uop_symbols_test"),
"requires _testinternalcapi.uop_symbols_test")
def test_optimizer_symbols(self):
_testinternalcapi.uop_symbols_test()
if __name__ == "__main__":
unittest.main()


@@ -132,7 +132,6 @@ class TestTranforms(BytecodeTestCase):
self.assertInBytecode(f, 'LOAD_CONST', None)
self.check_lnotab(f)
@unittest.expectedFailure # TODO: RUSTPYTHON; RETURN_VALUE
def test_while_one(self):
# Skip over: LOAD_CONST trueconst POP_JUMP_IF_FALSE xx
def f():
@@ -530,7 +529,6 @@ class TestTranforms(BytecodeTestCase):
self.assertEqual(len(returns), 1)
self.check_lnotab(f)
@unittest.expectedFailure # TODO: RUSTPYTHON; KeyError: 20
def test_elim_jump_to_return(self):
# JUMP_FORWARD to RETURN --> RETURN
def f(cond, true_value, false_value):
@@ -545,7 +543,6 @@ class TestTranforms(BytecodeTestCase):
self.assertEqual(len(returns), 2)
self.check_lnotab(f)
@unittest.expectedFailure # TODO: RUSTPYTHON; absolute jump encoding
def test_elim_jump_to_uncond_jump(self):
# POP_JUMP_IF_FALSE to JUMP_FORWARD --> POP_JUMP_IF_FALSE to non-jump
def f():
@@ -612,7 +609,6 @@ class TestTranforms(BytecodeTestCase):
print(i)
self.check_jump_targets(f)
@unittest.expectedFailure # TODO: RUSTPYTHON; 611 JUMP_BACKWARD 16
def test_elim_jump_after_return1(self):
# Eliminate dead code: jumps immediately after returns can't be reached
def f(cond1, cond2):
@@ -646,7 +642,6 @@ class TestTranforms(BytecodeTestCase):
self.assertEqual(count_instr_recursively(containtest, 'BUILD_LIST'), 0)
self.check_lnotab(containtest)
@unittest.expectedFailure # TODO: RUSTPYTHON; no BUILD_LIST to BUILD_TUPLE optimization
def test_iterate_literal_list(self):
def forloop():
for x in [a, b]:
@@ -863,7 +858,6 @@ class TestMarkingVariablesAsUnKnown(BytecodeTestCase):
self.addCleanup(sys.settrace, sys.gettrace())
sys.settrace(None)
@unittest.expectedFailure # TODO: RUSTPYTHON; BINARY_OP 0 (+)
def test_load_fast_known_simple(self):
def f():
x = 1

Lib/test/test_perfmaps.py (new vendored file, 23 lines added)

@@ -0,0 +1,23 @@
import os
import sys
import unittest
try:
from _testinternalcapi import perf_map_state_teardown, write_perf_map_entry
except ImportError:
raise unittest.SkipTest("requires _testinternalcapi")
if sys.platform != 'linux':
raise unittest.SkipTest('Linux only')
class TestPerfMapWriting(unittest.TestCase):
def test_write_perf_map_entry(self):
self.assertEqual(write_perf_map_entry(0x1234, 5678, "entry1"), 0)
self.assertEqual(write_perf_map_entry(0x2345, 6789, "entry2"), 0)
with open(f"/tmp/perf-{os.getpid()}.map") as f:
perf_file_contents = f.read()
self.assertIn("1234 162e entry1", perf_file_contents)
self.assertIn("2345 1a85 entry2", perf_file_contents)
perf_map_state_teardown()
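
Note on the file above: the two `assertIn` checks imply that each perf map line holds the address and the size in lowercase hexadecimal, followed by the entry name. A minimal sketch of that formatting, assuming nothing beyond what the assertions show; `format_perf_map_entry` is a hypothetical helper, not part of `_testinternalcapi`:

```python
def format_perf_map_entry(address, size, name):
    # Hypothetical helper: '<hex address> <hex size> <name>', no '0x' prefix.
    return f"{address:x} {size:x} {name}"


# Same inputs as the test: 5678 == 0x162e and 6789 == 0x1a85.
print(format_perf_map_entry(0x1234, 5678, "entry1"))
print(format_perf_map_entry(0x2345, 6789, "entry2"))
```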


@@ -692,7 +692,6 @@ class ScopeTests(unittest.TestCase):
self.assertEqual(c.dec(), 1)
self.assertEqual(c.dec(), 0)
@unittest.expectedFailure # TODO: RUSTPYTHON; figure out how to communicate that `y = 9` should be stored as a global rather than a STORE_NAME, even when the `global y` is in a nested subscope
def testGlobalInParallelNestedFunctions(self):
# A symbol table bug leaked the global statement from one
# function to other nested functions in the same block.


@@ -854,6 +854,7 @@ class StrTest(string_tests.StringLikeTest,
self.assertTrue('\U0001F46F'.isprintable())
self.assertFalse('\U000E0020'.isprintable())
@unittest.expectedFailure # TODO: RUSTPYTHON
@support.requires_resource('cpu')
def test_isprintable_invariant(self):
for codepoint in range(sys.maxunicode + 1):


@@ -1903,7 +1903,6 @@ class RunFuncTestCase(BaseTestCase):
res = subprocess.run(args)
self.assertEqual(res.returncode, 57)
@unittest.skipIf(mswindows, "TODO: RUSTPYTHON; empty env block fails nondeterministically")
@unittest.skipUnless(mswindows, "Maybe test trigger a leak on Ubuntu")
def test_run_with_an_empty_env(self):
# gh-105436: fix subprocess.run(..., env={}) broken on Windows


@@ -209,7 +209,6 @@ class TestSuper(unittest.TestCase):
self.assertIs(test_class, A)
@unittest.expectedFailure # TODO: RUSTPYTHON
def test___classcell___expected_behaviour(self):
# See issue #23722
class Meta(type):


@@ -878,7 +878,6 @@ class SysModuleTest(unittest.TestCase):
def test_sys_version_info_no_instantiation(self):
self.assert_raise_on_new_sys_type(sys.version_info)
@unittest.expectedFailure # TODO: RUSTPYTHON; TypeError not raised for getwindowsversion instantiation
def test_sys_getwindowsversion_no_instantiation(self):
# Skip if not being run on Windows.
test.support.get_attribute(sys, "getwindowsversion")


@@ -1420,8 +1420,6 @@ class JumpTestCase(unittest.TestCase):
output.append(6)
output.append(7)
# TODO: RUSTPYTHON
@unittest.expectedFailure
@async_jump_test(4, 5, [3, 5])
async def test_jump_out_of_async_for_block_forwards(output):
for i in [1]:
@@ -1430,8 +1428,6 @@ class JumpTestCase(unittest.TestCase):
output.append(4)
output.append(5)
# TODO: RUSTPYTHON
@unittest.expectedFailure
@async_jump_test(5, 2, [2, 4, 2, 4, 5, 6])
async def test_jump_out_of_async_for_block_backwards(output):
for i in [1]:
@@ -1539,8 +1535,6 @@ class JumpTestCase(unittest.TestCase):
output.append(2)
output.append(3)
# TODO: RUSTPYTHON
@unittest.expectedFailure
@async_jump_test(2, 3, [1, 3])
async def test_jump_forwards_out_of_async_with_block(output):
async with asynctracecontext(output, 1):
@@ -1553,8 +1547,6 @@ class JumpTestCase(unittest.TestCase):
with tracecontext(output, 2):
output.append(3)
# TODO: RUSTPYTHON
@unittest.expectedFailure
@async_jump_test(3, 1, [1, 2, 1, 2, 3, -2])
async def test_jump_backwards_out_of_async_with_block(output):
output.append(1)
@@ -1624,8 +1616,6 @@ class JumpTestCase(unittest.TestCase):
with tracecontext(output, 4):
output.append(5)
# TODO: RUSTPYTHON
@unittest.expectedFailure
@async_jump_test(2, 4, [1, 4, 5, -4])
async def test_jump_across_async_with(output):
output.append(1)
@@ -1643,8 +1633,6 @@ class JumpTestCase(unittest.TestCase):
output.append(5)
output.append(6)
# TODO: RUSTPYTHON
@unittest.expectedFailure
@async_jump_test(4, 5, [1, 3, 5, 6])
async def test_jump_out_of_async_with_block_within_for_block(output):
output.append(1)
@@ -1663,8 +1651,6 @@ class JumpTestCase(unittest.TestCase):
output.append(5)
output.append(6)
# TODO: RUSTPYTHON
@unittest.expectedFailure
@async_jump_test(4, 5, [1, 2, 3, 5, -2, 6])
async def test_jump_out_of_async_with_block_within_with_block(output):
output.append(1)
@@ -1684,8 +1670,6 @@ class JumpTestCase(unittest.TestCase):
output.append(6)
output.append(7)
# TODO: RUSTPYTHON
@unittest.expectedFailure
@async_jump_test(5, 6, [2, 4, 6, 7])
async def test_jump_out_of_async_with_block_within_finally_block(output):
try:
@@ -1719,8 +1703,6 @@ class JumpTestCase(unittest.TestCase):
output.append(4)
output.append(5)
# TODO: RUSTPYTHON
@unittest.expectedFailure
@async_jump_test(3, 5, [1, 2, 5])
async def test_jump_out_of_async_with_assignment(output):
output.append(1)
@@ -1768,8 +1750,6 @@ class JumpTestCase(unittest.TestCase):
output.append(7)
output.append(8)
# TODO: RUSTPYTHON
@unittest.expectedFailure
@async_jump_test(1, 7, [7, 8])
async def test_jump_over_async_for_block_before_else(output):
output.append(1)
@@ -2053,8 +2033,6 @@ class JumpTestCase(unittest.TestCase):
with tracecontext(output, 4):
output.append(5)
# TODO: RUSTPYTHON
@unittest.expectedFailure
@async_jump_test(3, 5, [1, 2, 5, -2])
async def test_jump_between_async_with_blocks(output):
output.append(1)
@@ -2063,8 +2041,6 @@ class JumpTestCase(unittest.TestCase):
async with asynctracecontext(output, 4):
output.append(5)
# TODO: RUSTPYTHON
@unittest.expectedFailure
@jump_test(5, 7, [2, 4], (ValueError, "after"))
def test_no_jump_over_return_out_of_finally_block(output):
try:


@@ -232,7 +232,6 @@ class UnicodeFunctionsTest(UnicodeDatabaseTest):
b = 'C\u0338' * 20 + '\xC7'
self.assertEqual(self.db.normalize('NFC', a), b)
@unittest.expectedFailure # TODO: RUSTPYTHON; ? +
def test_issue29456(self):
# Fix #29456
u1176_str_a = '\u1100\u1176\u11a8'
@@ -389,6 +388,7 @@ class NormalizationTest(unittest.TestCase):
data = [int(x, 16) for x in data.split(" ")]
return "".join([chr(x) for x in data])
@unittest.expectedFailure # TODO: RUSTPYTHON; AssertionError: False is not true : 13055
@requires_resource('network')
@requires_resource('cpu')
def test_normalization(self):


@@ -14,6 +14,7 @@ std = ["thiserror/std", "itertools/use_std"]
[dependencies]
rustpython-compiler-core = { workspace = true }
rustpython-unicode = { workspace = true, default-features = false }
rustpython-literal = {workspace = true }
rustpython-wtf8 = { workspace = true }
ruff_python_ast = { workspace = true }
@@ -29,7 +30,6 @@ num-traits = { workspace = true }
thiserror = { workspace = true }
malachite-bigint = { workspace = true }
memchr = { workspace = true }
unicode_names2 = { workspace = true }
[dev-dependencies]
ruff_python_parser = { workspace = true }

File diff suppressed because it is too large

File diff suppressed because it is too large


@@ -1,21 +1,23 @@
---
source: crates/codegen/src/compile.rs
assertion_line: 9100
assertion_line: 9553
expression: "compile_exec(\"\\\nif True and False and False:\n pass\n\")"
---
1 0 RESUME (0)
>> 1 LOAD_CONST (True)
2 POP_JUMP_IF_FALSE (9)
3 CACHE
1 LOAD_CONST (True)
2 POP_JUMP_IF_FALSE (11)
>> 3 CACHE
4 NOT_TAKEN
>> 5 LOAD_CONST (False)
6 POP_JUMP_IF_FALSE (5)
7 CACHE
5 LOAD_CONST (False)
6 POP_JUMP_IF_FALSE (7)
>> 7 CACHE
8 NOT_TAKEN
>> 9 LOAD_CONST (False)
10 POP_JUMP_IF_FALSE (1)
11 CACHE
9 LOAD_CONST (False)
10 POP_JUMP_IF_FALSE (3)
>> 11 CACHE
12 NOT_TAKEN
2 13 LOAD_CONST (None)
14 RETURN_VALUE
15 LOAD_CONST (None)
16 RETURN_VALUE


@@ -1,25 +1,27 @@
---
source: crates/codegen/src/compile.rs
assertion_line: 9110
assertion_line: 9563
expression: "compile_exec(\"\\\nif (True and False) or (False and True):\n pass\n\")"
---
1 0 RESUME (0)
>> 1 LOAD_CONST (True)
1 LOAD_CONST (True)
2 POP_JUMP_IF_FALSE (5)
3 CACHE
>> 3 CACHE
4 NOT_TAKEN
>> 5 LOAD_CONST (False)
6 POP_JUMP_IF_TRUE (9)
7 CACHE
>> 7 CACHE
8 NOT_TAKEN
>> 9 LOAD_CONST (False)
10 POP_JUMP_IF_FALSE (5)
10 POP_JUMP_IF_FALSE (7)
11 CACHE
12 NOT_TAKEN
13 LOAD_CONST (True)
14 POP_JUMP_IF_FALSE (1)
14 POP_JUMP_IF_FALSE (3)
15 CACHE
16 NOT_TAKEN
2 17 LOAD_CONST (None)
18 RETURN_VALUE
19 LOAD_CONST (None)
20 RETURN_VALUE


@@ -1,21 +1,23 @@
---
source: crates/codegen/src/compile.rs
assertion_line: 9090
assertion_line: 9543
expression: "compile_exec(\"\\\nif True or False or False:\n pass\n\")"
---
1 0 RESUME (0)
>> 1 LOAD_CONST (True)
1 LOAD_CONST (True)
2 POP_JUMP_IF_TRUE (9)
3 CACHE
>> 3 CACHE
4 NOT_TAKEN
>> 5 LOAD_CONST (False)
6 POP_JUMP_IF_TRUE (5)
7 CACHE
8 NOT_TAKEN
>> 9 LOAD_CONST (False)
10 POP_JUMP_IF_FALSE (1)
10 POP_JUMP_IF_FALSE (3)
11 CACHE
12 NOT_TAKEN
2 13 LOAD_CONST (None)
14 RETURN_VALUE
15 LOAD_CONST (None)
16 RETURN_VALUE


@@ -1,5 +1,6 @@
---
source: crates/codegen/src/compile.rs
assertion_line: 9688
expression: "compile_exec(\"\\\nx = Test() and False or False\n\")"
---
1 0 RESUME (0)
@@ -9,18 +10,26 @@ expression: "compile_exec(\"\\\nx = Test() and False or False\n\")"
4 CACHE
5 CACHE
6 CACHE
>> 7 COPY (1)
8 POP_JUMP_IF_FALSE (7)
7 COPY (1)
8 TO_BOOL
9 CACHE
10 NOT_TAKEN
11 POP_TOP
12 LOAD_CONST (False)
13 COPY (1)
14 POP_JUMP_IF_TRUE (3)
15 CACHE
16 NOT_TAKEN
17 POP_TOP
18 LOAD_CONST (False)
19 STORE_NAME (1, x)
20 LOAD_CONST (None)
21 RETURN_VALUE
10 CACHE
>> 11 CACHE
12 POP_JUMP_IF_FALSE (11)
13 CACHE
14 NOT_TAKEN
15 POP_TOP
16 LOAD_CONST (False)
17 COPY (1)
18 TO_BOOL
19 CACHE
20 CACHE
21 CACHE
22 POP_JUMP_IF_TRUE (3)
23 CACHE
24 NOT_TAKEN
25 POP_TOP
26 LOAD_CONST (False)
27 STORE_NAME (1, x)
28 LOAD_CONST (None)
29 RETURN_VALUE


@@ -1,6 +1,5 @@
---
source: crates/codegen/src/compile.rs
assertion_line: 9089
expression: "compile_exec(\"\\\nasync def test():\n for stop_exc in (StopIteration('spam'), StopAsyncIteration('ham')):\n with self.subTest(type=type(stop_exc)):\n try:\n async with egg():\n raise stop_exc\n except Exception as ex:\n self.assertIs(ex, stop_exc)\n else:\n self.fail(f'{stop_exc} was suppressed')\n\")"
---
1 0 RESUME (0)
@@ -23,7 +22,7 @@ expression: "compile_exec(\"\\\nasync def test():\n for stop_exc in (StopIter
15 CACHE
16 CACHE
17 CACHE
>> 18 LOAD_CONST ("ham")
18 LOAD_CONST ("ham")
19 CALL (1)
20 CACHE
21 CACHE
@@ -32,15 +31,15 @@ expression: "compile_exec(\"\\\nasync def test():\n for stop_exc in (StopIter
24 GET_ITER
25 FOR_ITER (71)
26 CACHE
>> 27 STORE_FAST (0, stop_exc)
27 STORE_FAST (0, stop_exc)
3 >> 28 LOAD_GLOBAL (4, self)
3 28 LOAD_GLOBAL (4, self)
29 CACHE
30 CACHE
31 CACHE
32 CACHE
>> 32 CACHE
33 LOAD_ATTR (7, subTest, method=true)
>> 34 CACHE
34 CACHE
35 CACHE
36 CACHE
37 CACHE
@@ -53,8 +52,8 @@ expression: "compile_exec(\"\\\nasync def test():\n for stop_exc in (StopIter
44 CACHE
45 CACHE
46 CACHE
>> 47 CACHE
48 LOAD_FAST (0, stop_exc)
47 CACHE
>> 48 LOAD_FAST (0, stop_exc)
49 CALL (1)
50 CACHE
51 CACHE
@@ -67,8 +66,8 @@ expression: "compile_exec(\"\\\nasync def test():\n for stop_exc in (StopIter
58 COPY (1)
59 LOAD_SPECIAL (__exit__)
60 SWAP (2)
61 LOAD_SPECIAL (__enter__)
62 PUSH_NULL
61 SWAP (3)
62 LOAD_SPECIAL (__enter__)
63 CALL (0)
64 CACHE
65 CACHE
@@ -89,8 +88,8 @@ expression: "compile_exec(\"\\\nasync def test():\n for stop_exc in (StopIter
78 COPY (1)
79 LOAD_SPECIAL (__aexit__)
80 SWAP (2)
81 LOAD_SPECIAL (__aenter__)
82 PUSH_NULL
81 SWAP (3)
82 LOAD_SPECIAL (__aenter__)
83 CALL (0)
84 CACHE
85 CACHE
@@ -115,162 +114,141 @@ expression: "compile_exec(\"\\\nasync def test():\n for stop_exc in (StopIter
5 102 CLEANUP_THROW
103 JUMP_BACKWARD_NO_INTERRUPT(10)
6 104 NOP
5 105 PUSH_NULL
106 LOAD_CONST (None)
104 PUSH_EXC_INFO
105 WITH_EXCEPT_START
106 GET_AWAITABLE (2)
107 LOAD_CONST (None)
108 LOAD_CONST (None)
109 CALL (3)
110 CACHE
111 CACHE
112 CACHE
113 GET_AWAITABLE (2)
114 LOAD_CONST (None)
115 SEND (4)
108 SEND (4)
109 CACHE
110 YIELD_VALUE (1)
111 RESUME (3)
112 JUMP_BACKWARD_NO_INTERRUPT(5)
113 CLEANUP_THROW
114 END_SEND
115 TO_BOOL
116 CACHE
117 YIELD_VALUE (1)
118 RESUME (3)
119 JUMP_BACKWARD_NO_INTERRUPT(5)
120 CLEANUP_THROW
121 END_SEND
122 POP_TOP
123 JUMP_FORWARD (27)
124 PUSH_EXC_INFO
125 WITH_EXCEPT_START
126 GET_AWAITABLE (2)
127 LOAD_CONST (None)
128 SEND (4)
129 CACHE
130 YIELD_VALUE (1)
131 RESUME (3)
132 JUMP_BACKWARD_NO_INTERRUPT(5)
133 CLEANUP_THROW
134 END_SEND
135 TO_BOOL
117 CACHE
118 CACHE
119 POP_JUMP_IF_TRUE (2)
120 CACHE
121 NOT_TAKEN
122 RERAISE (2)
123 POP_TOP
124 POP_EXCEPT
125 POP_TOP
126 POP_TOP
127 POP_TOP
128 JUMP_FORWARD (48)
129 COPY (3)
130 POP_EXCEPT
131 RERAISE (1)
132 PUSH_EXC_INFO
7 133 LOAD_GLOBAL (12, Exception)
134 CACHE
135 CACHE
136 CACHE
137 CACHE
138 CACHE
139 POP_JUMP_IF_TRUE (2)
138 CHECK_EXC_MATCH
139 POP_JUMP_IF_FALSE (32)
140 CACHE
141 NOT_TAKEN
142 RERAISE (2)
143 POP_TOP
144 POP_EXCEPT
145 POP_TOP
146 POP_TOP
147 JUMP_FORWARD (3)
148 COPY (3)
149 POP_EXCEPT
150 RERAISE (1)
151 JUMP_FORWARD (47)
152 PUSH_EXC_INFO
142 STORE_FAST (1, ex)
7 153 LOAD_GLOBAL (12, Exception)
8 143 LOAD_GLOBAL (4, self)
144 CACHE
145 CACHE
146 CACHE
147 CACHE
148 LOAD_ATTR (15, assertIs, method=true)
149 CACHE
150 CACHE
151 CACHE
152 CACHE
153 CACHE
154 CACHE
155 CACHE
156 CACHE
157 CACHE
158 CHECK_EXC_MATCH
159 POP_JUMP_IF_FALSE (34)
158 LOAD_FAST_LOAD_FAST (ex, stop_exc)
159 CALL (2)
160 CACHE
161 NOT_TAKEN
162 STORE_FAST (1, ex)
161 CACHE
162 CACHE
163 POP_TOP
164 POP_EXCEPT
165 LOAD_CONST (None)
166 STORE_FAST (1, ex)
167 DELETE_FAST (1, ex)
168 JUMP_FORWARD (32)
169 LOAD_CONST (None)
170 STORE_FAST (1, ex)
171 DELETE_FAST (1, ex)
172 RERAISE (1)
173 RERAISE (0)
174 COPY (3)
175 POP_EXCEPT
176 RERAISE (1)
8 163 LOAD_GLOBAL (4, self)
164 CACHE
165 CACHE
166 CACHE
167 CACHE
168 LOAD_ATTR (15, assertIs, method=true)
169 CACHE
170 CACHE
171 CACHE
172 CACHE
173 CACHE
174 CACHE
175 CACHE
176 CACHE
177 CACHE
178 LOAD_FAST (1, ex)
179 LOAD_FAST (0, stop_exc)
180 CALL (2)
10 177 LOAD_GLOBAL (4, self)
178 CACHE
179 CACHE
180 CACHE
181 CACHE
182 CACHE
182 LOAD_ATTR (17, fail, method=true)
183 CACHE
184 POP_TOP
185 JUMP_FORWARD (4)
186 LOAD_CONST (None)
187 STORE_FAST (1, ex)
188 DELETE_FAST (1, ex)
189 RERAISE (1)
190 POP_EXCEPT
191 LOAD_CONST (None)
192 STORE_FAST (1, ex)
193 DELETE_FAST (1, ex)
194 JUMP_FORWARD (28)
195 RERAISE (0)
196 COPY (3)
197 POP_EXCEPT
198 RERAISE (1)
184 CACHE
185 CACHE
186 CACHE
>> 187 CACHE
188 CACHE
189 CACHE
190 CACHE
191 CACHE
192 LOAD_FAST_BORROW (0, stop_exc)
193 FORMAT_SIMPLE
194 LOAD_CONST (" was suppressed")
195 BUILD_STRING (2)
196 CALL (1)
197 CACHE
198 CACHE
199 CACHE
200 POP_TOP
201 NOP
10 199 LOAD_GLOBAL (4, self)
200 CACHE
201 CACHE
202 CACHE
203 CACHE
204 LOAD_ATTR (17, fail, method=true)
205 CACHE
3 202 LOAD_CONST (None)
203 LOAD_CONST (None)
>> 204 LOAD_CONST (None)
205 CALL (3)
206 CACHE
207 CACHE
208 CACHE
209 CACHE
210 CACHE
209 POP_TOP
210 JUMP_BACKWARD (187)
211 CACHE
212 CACHE
213 CACHE
214 LOAD_FAST_BORROW (0, stop_exc)
215 FORMAT_SIMPLE
216 LOAD_CONST (" was suppressed")
217 BUILD_STRING (2)
218 CALL (1)
212 PUSH_EXC_INFO
213 WITH_EXCEPT_START
214 TO_BOOL
215 CACHE
216 CACHE
217 CACHE
218 POP_JUMP_IF_TRUE (2)
219 CACHE
220 CACHE
221 CACHE
220 NOT_TAKEN
221 RERAISE (2)
222 POP_TOP
223 NOP
223 POP_EXCEPT
224 POP_TOP
225 POP_TOP
226 POP_TOP
227 JUMP_BACKWARD (204)
228 CACHE
229 COPY (3)
230 POP_EXCEPT
231 RERAISE (1)
3 224 PUSH_NULL
225 LOAD_CONST (None)
226 LOAD_CONST (None)
227 LOAD_CONST (None)
228 CALL (3)
>> 229 CACHE
230 CACHE
231 CACHE
232 POP_TOP
233 JUMP_FORWARD (18)
234 PUSH_EXC_INFO
235 WITH_EXCEPT_START
236 TO_BOOL
237 CACHE
238 CACHE
239 CACHE
240 POP_JUMP_IF_TRUE (2)
241 CACHE
242 NOT_TAKEN
243 RERAISE (2)
244 POP_TOP
245 POP_EXCEPT
246 POP_TOP
247 POP_TOP
248 JUMP_FORWARD (3)
249 COPY (3)
250 POP_EXCEPT
251 RERAISE (1)
252 JUMP_BACKWARD (229)
253 CACHE
2 232 CALL_INTRINSIC_1 (StopIterationError)
233 RERAISE (1)
2 MAKE_FUNCTION
3 STORE_NAME (0, test)


@@ -113,7 +113,9 @@ impl StringParser {
let name_and_ending = self.skip_bytes(close_idx + 1);
let name = &name_and_ending[..name_and_ending.len() - 1];
unicode_names2::character(name).ok_or_else(|| unreachable!())
rustpython_unicode::data::lookup(name)
.and_then(char::from_u32)
.ok_or_else(|| unreachable!())
}
/// Parse an escaped character, returning the new character.
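
Note on the hunk above: the new code resolves a `\N{...}` name in two steps, name to code point (`rustpython_unicode::data::lookup`) and code point to character (`char::from_u32`). The sketch below mirrors those two steps with Python's standard `unicodedata` module. It is an analogue for illustration only and says nothing about the `rustpython_unicode` API; `resolve_named_escape` is a hypothetical helper.

```python
import unicodedata


def resolve_named_escape(name):
    # Step 1: name -> code point (unicodedata.lookup raises KeyError if unknown).
    try:
        code_point = ord(unicodedata.lookup(name))
    except KeyError:
        return None
    # Step 2: code point -> character.
    return chr(code_point)


print(resolve_named_escape("LATIN SMALL LETTER A"))
print(resolve_named_escape("NOT A REAL CHARACTER NAME"))
```

Unlike the Rust hunk, which treats a failed lookup as unreachable at that point, this sketch returns `None` for unknown names so that it can be exercised in isolation.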


@@ -54,6 +54,9 @@ pub struct SymbolTable {
/// Whether this type param scope can see the parent class scope
pub can_see_class_scope: bool,
/// Whether this scope contains yield/yield from (is a generator function)
pub is_generator: bool,
/// Whether this comprehension scope should be inlined (PEP 709)
/// True for list/set/dict comprehensions in non-generator expressions
pub comp_inlined: bool,
@@ -89,6 +92,7 @@ impl SymbolTable {
needs_class_closure: false,
needs_classdict: false,
can_see_class_scope: false,
is_generator: false,
comp_inlined: false,
annotation_block: None,
has_conditional_annotations: false,
@@ -292,6 +296,20 @@ fn drop_class_free(symbol_table: &mut SymbolTable, newfree: &mut IndexSet<String
symbol_table.needs_classdict = true;
}
// Classes with function definitions need __classdict__ for PEP 649
// (but not when `from __future__ import annotations` is active)
if !symbol_table.needs_classdict && !symbol_table.future_annotations {
let has_functions = symbol_table.sub_tables.iter().any(|t| {
matches!(
t.typ,
CompilerScope::Function | CompilerScope::AsyncFunction
)
});
if has_functions {
symbol_table.needs_classdict = true;
}
}
// Check if __conditional_annotations__ is in the free variables collected from children
// Remove it from free set - it's handled specially in class scope
if newfree.shift_remove("__conditional_annotations__") {
@@ -299,6 +317,88 @@ fn drop_class_free(symbol_table: &mut SymbolTable, newfree: &mut IndexSet<String
}
}
/// Check if an expression contains an `await` node (shallow, not into nested scopes).
fn expr_contains_await(expr: &ast::Expr) -> bool {
use ast::visitor::Visitor;
struct AwaitFinder(bool);
impl ast::visitor::Visitor<'_> for AwaitFinder {
fn visit_expr(&mut self, expr: &ast::Expr) {
if !self.0 {
if matches!(expr, ast::Expr::Await(_)) {
self.0 = true;
} else {
ast::visitor::walk_expr(self, expr);
}
}
}
}
let mut finder = AwaitFinder(false);
finder.visit_expr(expr);
finder.0
}
/// PEP 709: Merge symbols from an inlined comprehension into the parent scope.
/// Matches symtable.c inline_comprehension().
fn inline_comprehension(
parent_symbols: &mut SymbolMap,
comp: &SymbolTable,
comp_free: &mut IndexSet<String>,
inlined_cells: &mut IndexSet<String>,
parent_type: CompilerScope,
) {
for (name, sub_symbol) in &comp.symbols {
// Skip the .0 parameter
if sub_symbol.flags.contains(SymbolFlags::PARAMETER) {
continue;
}
// Track inlined cells
if sub_symbol.scope == SymbolScope::Cell
|| sub_symbol.flags.contains(SymbolFlags::COMP_CELL)
{
inlined_cells.insert(name.clone());
}
// Handle __class__ in ClassBlock
let scope = if sub_symbol.scope == SymbolScope::Free
&& parent_type == CompilerScope::Class
&& name == "__class__"
{
comp_free.swap_remove(name);
SymbolScope::GlobalImplicit
} else {
sub_symbol.scope
};
if let Some(existing) = parent_symbols.get(name) {
// Name exists in parent
if existing.is_bound() && parent_type != CompilerScope::Class {
// Check if the name is free in any child of the comprehension
let is_free_in_child = comp.sub_tables.iter().any(|child| {
child
.symbols
.get(name)
.is_some_and(|s| s.scope == SymbolScope::Free)
});
if !is_free_in_child {
comp_free.swap_remove(name);
}
}
} else {
// Name doesn't exist in parent, copy from comprehension.
// Reset scope to Unknown so analyze_symbol will resolve it
// in the parent's context.
let mut symbol = sub_symbol.clone();
symbol.scope = if sub_symbol.is_bound() {
SymbolScope::Unknown
} else {
scope
};
parent_symbols.insert(name.clone(), symbol);
}
}
}
type SymbolMap = IndexMap<String, Symbol>;
mod stack {
@@ -392,14 +492,9 @@ impl SymbolTableAnalyzer {
let symbols = core::mem::take(&mut symbol_table.symbols);
let sub_tables = &mut *symbol_table.sub_tables;
// Collect free variables from all child scopes
let mut newfree = IndexSet::default();
let annotation_block = &mut symbol_table.annotation_block;
// PEP 649: Determine class_entry to pass to children
// If current scope is a class with annotation block that can_see_class_scope,
// we need to pass class symbols to the annotation scope
let is_class = symbol_table.typ == CompilerScope::Class;
// Clone class symbols if needed for child scopes with can_see_class_scope
@@ -418,12 +513,16 @@ impl SymbolTableAnalyzer {
None
};
// Collect (child_free, is_inlined) pairs from child scopes.
// We need to process inlined comprehensions after the closure
// when we have access to symbol_table.symbols.
let mut child_frees: Vec<(IndexSet<String>, bool)> = Vec::new();
let mut annotation_free: Option<IndexSet<String>> = None;
let mut info = (symbols, symbol_table.typ);
self.tables.with_append(&mut info, |list| {
let inner_scope = unsafe { &mut *(list as *mut _ as *mut Self) };
// Analyze sub scopes and collect their free variables
for sub_table in sub_tables.iter_mut() {
// Pass class_entry to sub-scopes that can see the class scope
let child_class_entry = if sub_table.can_see_class_scope {
if is_class {
class_symbols_clone.as_ref()
@@ -434,12 +533,10 @@ impl SymbolTableAnalyzer {
None
};
let child_free = inner_scope.analyze_symbol_table(sub_table, child_class_entry)?;
// Propagate child's free variables to this scope
newfree.extend(child_free);
child_frees.push((child_free, sub_table.comp_inlined));
}
// PEP 649: Analyze annotation block if present
if let Some(annotation_table) = annotation_block {
// Pass class symbols to annotation scope if can_see_class_scope
let ann_class_entry = if annotation_table.can_see_class_scope {
if is_class {
class_symbols_clone.as_ref()
@@ -451,59 +548,59 @@ impl SymbolTableAnalyzer {
};
let child_free =
inner_scope.analyze_symbol_table(annotation_table, ann_class_entry)?;
// Propagate annotation's free variables to this scope
newfree.extend(child_free);
annotation_free = Some(child_free);
}
Ok(())
})?;
symbol_table.symbols = info.0;
// PEP 709: Merge symbols from inlined comprehensions into parent scope
// Only merge symbols that are actually bound in the comprehension,
// not references to outer scope variables (Free symbols).
const BOUND_FLAGS: SymbolFlags = SymbolFlags::ASSIGNED
.union(SymbolFlags::PARAMETER)
.union(SymbolFlags::ITER)
.union(SymbolFlags::ASSIGNED_IN_COMPREHENSION);
for sub_table in sub_tables.iter() {
if sub_table.comp_inlined {
for (name, sub_symbol) in &sub_table.symbols {
// Skip the .0 parameter - it's internal to the comprehension
if name == ".0" {
continue;
}
// Only merge symbols that are bound in the comprehension
// Skip Free references to outer scope variables
if !sub_symbol.flags.intersects(BOUND_FLAGS) {
continue;
}
// If the symbol doesn't exist in parent, add it
if !symbol_table.symbols.contains_key(name) {
let mut symbol = sub_symbol.clone();
// Mark as local in parent scope
symbol.scope = SymbolScope::Local;
symbol_table.symbols.insert(name.clone(), symbol);
}
}
// PEP 709: Process inlined comprehensions.
// Merge symbols from inlined comps into parent scope without bail-out.
let mut inlined_cells: IndexSet<String> = IndexSet::default();
let mut newfree = IndexSet::default();
for (idx, (mut child_free, is_inlined)) in child_frees.into_iter().enumerate() {
if is_inlined {
inline_comprehension(
&mut symbol_table.symbols,
&sub_tables[idx],
&mut child_free,
&mut inlined_cells,
symbol_table.typ,
);
}
newfree.extend(child_free);
}
if let Some(ann_free) = annotation_free {
// Propagate annotation-scope free names to this scope so
// implicit class-scope cells (__classdict__/__conditional_annotations__)
// can be materialized by drop_class_free when needed.
newfree.extend(ann_free);
}
let sub_tables = &*symbol_table.sub_tables;
// Analyze symbols in current scope
for symbol in symbol_table.symbols.values_mut() {
self.analyze_symbol(symbol, symbol_table.typ, sub_tables, class_entry)?;
// Collect free variables from this scope
// These will be propagated to the parent scope
if symbol.scope == SymbolScope::Free || symbol.flags.contains(SymbolFlags::FREE_CLASS) {
newfree.insert(symbol.name.clone());
}
}
// PEP 709: Promote LOCAL to CELL and set COMP_CELL for inlined cell vars
for symbol in symbol_table.symbols.values_mut() {
if inlined_cells.contains(&symbol.name) {
if symbol.scope == SymbolScope::Local {
symbol.scope = SymbolScope::Cell;
}
symbol.flags.insert(SymbolFlags::COMP_CELL);
}
}
// Handle class-specific implicit cells
// This removes __class__ and __classdict__ from newfree if present
// and sets the corresponding flags on the symbol table
if symbol_table.typ == CompilerScope::Class {
drop_class_free(symbol_table, &mut newfree);
}
@@ -665,6 +762,12 @@ impl SymbolTableAnalyzer {
if let Some(decl_depth) = decl_depth {
// decl_depth is the number of tables between the current one and
// the one that declared the cell var
// For implicit class scope variables (__classdict__, __conditional_annotations__),
// only propagate free to annotation/type-param scopes, not regular functions.
// Regular method functions don't need these in their freevars.
let is_class_implicit =
name == "__classdict__" || name == "__conditional_annotations__";
for (table, typ) in self.tables.iter_mut().rev().take(decl_depth) {
if let CompilerScope::Class = typ {
if let Some(free_class) = table.get_mut(name) {
@@ -675,10 +778,19 @@ impl SymbolTableAnalyzer {
symbol.scope = SymbolScope::Free;
table.insert(name.to_owned(), symbol);
}
} else if is_class_implicit
&& matches!(
typ,
CompilerScope::Function
| CompilerScope::AsyncFunction
| CompilerScope::Lambda
)
{
// Skip: don't add __classdict__/__conditional_annotations__
// as free vars in regular functions — only annotation/type scopes need them
} else if !table.contains_key(name) {
let mut symbol = Symbol::new(name);
symbol.scope = SymbolScope::Free;
// symbol.is_referenced = true;
table.insert(name.to_owned(), symbol);
}
}
@@ -694,6 +806,11 @@ impl SymbolTableAnalyzer {
st_typ: CompilerScope,
) -> Option<SymbolScope> {
sub_tables.iter().find_map(|st| {
// PEP 709: For inlined comprehensions, check their children
// instead of the comp itself (its symbols are merged into parent).
if st.comp_inlined {
return self.found_in_inner_scope(&st.sub_tables, name, st_typ);
}
let sym = st.symbols.get(name)?;
if sym.scope == SymbolScope::Free || sym.flags.contains(SymbolFlags::FREE_CLASS) {
if st_typ == CompilerScope::Class && name != "__class__" {
@@ -918,6 +1035,7 @@ impl SymbolTableBuilder {
.and_then(|t| t.mangled_names.clone())
.filter(|_| typ != CompilerScope::Class);
let mut table = SymbolTable::new(name.to_owned(), typ, line_number, is_nested);
table.future_annotations = self.future_annotations;
table.mangled_names = inherited_mangled_names;
self.tables.push(table);
// Save parent's varnames and start fresh for the new scope
@@ -1128,20 +1246,30 @@ impl SymbolTableBuilder {
}
fn scan_annotation(&mut self, annotation: &ast::Expr) -> SymbolTableResult {
self.scan_annotation_inner(annotation, false)
}
/// Scan an annotation from an AnnAssign statement (can be conditional)
fn scan_ann_assign_annotation(&mut self, annotation: &ast::Expr) -> SymbolTableResult {
self.scan_annotation_inner(annotation, true)
}
fn scan_annotation_inner(
&mut self,
annotation: &ast::Expr,
is_ann_assign: bool,
) -> SymbolTableResult {
let current_scope = self.tables.last().map(|t| t.typ);
// PEP 649: Check if this is a conditional annotation
// Module-level: always conditional (module may be partially executed)
// Class-level: conditional only when inside if/for/while/etc.
if !self.future_annotations {
// PEP 649: Only AnnAssign annotations can be conditional.
// Function parameter/return annotations are never conditional.
if is_ann_assign && !self.future_annotations {
let is_conditional = matches!(current_scope, Some(CompilerScope::Module))
|| (matches!(current_scope, Some(CompilerScope::Class))
&& self.in_conditional_block);
if is_conditional && !self.tables.last().unwrap().has_conditional_annotations {
self.tables.last_mut().unwrap().has_conditional_annotations = true;
// Register __conditional_annotations__ as both Assigned and Used so that
// it becomes a Cell variable in class scope (children reference it as Free)
self.register_name(
"__conditional_annotations__",
SymbolUsage::Assigned,
@@ -1473,7 +1601,7 @@ impl SymbolTableBuilder {
// sub_tables that cause mismatch in the annotation scope's sub_table index.
let is_simple_name = *simple && matches!(&**target, Expr::Name(_));
if is_simple_name {
self.scan_annotation(annotation)?;
self.scan_ann_assign_annotation(annotation)?;
} else {
// Still validate annotation for forbidden expressions
// (yield, await, named) even for non-simple targets.
@@ -1729,6 +1857,7 @@ impl SymbolTableBuilder {
node_index: _,
range: _,
}) => {
self.tables.last_mut().unwrap().is_generator = true;
if let Some(expression) = value {
self.scan_expression(expression, context)?;
}
@@ -1738,6 +1867,7 @@ impl SymbolTableBuilder {
node_index: _,
range: _,
}) => {
self.tables.last_mut().unwrap().is_generator = true;
self.scan_expression(value, context)?;
}
Expr::UnaryOp(ExprUnaryOp {
@@ -2036,14 +2166,31 @@ impl SymbolTableBuilder {
CompilerScope::Comprehension,
self.line_index_start(range),
);
// Generator expressions need the is_generator flag
self.tables.last_mut().unwrap().is_generator = is_generator;
// PEP 709: inlined comprehensions are not yet implemented in the
// compiler (is_inlined_comprehension_context always returns false),
// so do NOT mark comp_inlined here. Setting it would cause the
// symbol-table analyzer to merge comprehension-local symbols into
// the parent scope, while the compiler still emits a separate code
// object — leading to the merged symbols being missing from the
// comprehension's own symbol table lookup.
// PEP 709: Mark non-generator comprehensions for inlining.
// Only in function-like scopes for now. Module/class scope inlining
// needs more work (Cell name resolution, __class__ handling).
// Also excluded: generator expressions, async comprehensions,
// and annotation scopes nested in classes (can_see_class_scope).
let element_has_await = expr_contains_await(elt1) || elt2.is_some_and(expr_contains_await);
if !is_generator && !has_async_gen && !element_has_await {
let parent = self.tables.iter().rev().nth(1);
let parent_can_see_class = parent.is_some_and(|t| t.can_see_class_scope);
let parent_is_func = parent.is_some_and(|t| {
matches!(
t.typ,
CompilerScope::Function
| CompilerScope::AsyncFunction
| CompilerScope::Lambda
| CompilerScope::Comprehension
)
});
if !parent_can_see_class && parent_is_func {
self.tables.last_mut().unwrap().comp_inlined = true;
}
}
// Register the passed argument to the generator function as the name ".0"
self.register_name(".0", SymbolUsage::Parameter, range)?;
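The inlining decision in the last hunk can be read as a single predicate over the comprehension and its parent scope. The sketch below restates it with hypothetical types (`Scope`, `Parent`, `should_inline`) so that it compiles on its own. It is an illustration of the condition shown in the diff, not the builder's actual code:

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Scope {
    Module,
    Class,
    Function,
    AsyncFunction,
    Lambda,
    Comprehension,
}

/// Hypothetical view of the enclosing symbol table.
struct Parent {
    typ: Scope,
    can_see_class_scope: bool,
}

/// True when a comprehension would be marked `comp_inlined`.
fn should_inline(
    is_generator: bool,
    has_async_gen: bool,
    element_has_await: bool,
    parent: Option<&Parent>,
) -> bool {
    // Generator expressions and anything async keep their own code object.
    if is_generator || has_async_gen || element_has_await {
        return false;
    }
    let parent_can_see_class = parent.is_some_and(|p| p.can_see_class_scope);
    // Only function-like parents are eligible for now.
    let parent_is_func = parent.is_some_and(|p| {
        matches!(
            p.typ,
            Scope::Function | Scope::AsyncFunction | Scope::Lambda | Scope::Comprehension
        )
    });
    !parent_can_see_class && parent_is_func
}
```

Module-level and class-level comprehensions fall through to `false`, which matches the comment that those scopes need more work before they can be inlined.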


@@ -16,6 +16,7 @@ wasm_js = ["getrandom/wasm_js"]
[dependencies]
rustpython-literal = { workspace = true }
rustpython-unicode = { workspace = true, default-features = false }
rustpython-wtf8 = { workspace = true }
ascii = { workspace = true }
@@ -29,7 +30,6 @@ malachite-q = { workspace = true }
malachite-base = { workspace = true }
num-traits = { workspace = true }
parking_lot = { workspace = true, optional = true }
unicode_names2 = { workspace = true }
radium = { workspace = true }
lock_api = "0.4"


@@ -414,7 +414,7 @@ pub mod errors {
let mut out = String::with_capacity(num_chars * 4);
for c in err_str.code_points() {
let c_u32 = c.to_u32();
if let Some(c_name) = c.to_char().and_then(unicode_names2::name) {
if let Some(c_name) = rustpython_unicode::data::name(c_u32) {
write!(out, "\\N{{{c_name}}}").unwrap();
} else if c_u32 >= 0x10000 {
write!(out, "\\U{c_u32:08x}").unwrap();


@@ -19,7 +19,7 @@ itertools = { workspace = true }
malachite-bigint = { workspace = true }
num-complex = { workspace = true }
lz4_flex = "0.12"
lz4_flex = "0.13"
[lints]
workspace = true


@@ -3,7 +3,7 @@
use crate::{
marshal::MarshalError,
varint::{read_varint, read_varint_with_start, write_varint, write_varint_with_start},
varint::{read_varint, read_varint_with_start, write_varint_be, write_varint_with_start},
{OneIndexed, SourceLocation},
};
use alloc::{borrow::ToOwned, boxed::Box, collections::BTreeSet, fmt, string::String, vec::Vec};
@@ -27,7 +27,7 @@ pub use crate::bytecode::{
BinaryOperator, BuildSliceArgCount, CommonConstant, ComparisonOperator, ConvertValueOparg,
IntrinsicFunction1, IntrinsicFunction2, Invert, Label, LoadAttr, LoadSuperAttr,
MakeFunctionFlag, MakeFunctionFlags, NameIdx, OpArg, OpArgByte, OpArgState, OpArgType,
RaiseKind, ResumeType, SpecialMethod, UnpackExArgs,
RaiseKind, SpecialMethod, UnpackExArgs,
},
};
@@ -71,9 +71,9 @@ pub fn encode_exception_table(entries: &[ExceptionTableEntry]) -> alloc::boxed::
let depth_lasti = ((entry.depth as u32) << 1) | (entry.push_lasti as u32);
write_varint_with_start(&mut data, entry.start);
write_varint(&mut data, size);
write_varint(&mut data, entry.target);
write_varint(&mut data, depth_lasti);
write_varint_be(&mut data, size);
write_varint_be(&mut data, entry.target);
write_varint_be(&mut data, depth_lasti);
}
data.into_boxed_slice()
}
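The switch to `write_varint_be` suggests the table is meant to follow CPython's exception-table layout. The sketch below assumes that layout: six payload bits per byte, most significant group first, `0x40` as the continuation bit, and `0x80` marking the first byte of an entry. The function names are hypothetical and this is not the crate's implementation:

```rust
const CONTINUATION_BIT: u8 = 0x40;
const START_BIT: u8 = 0x80;

/// Append `value` as big-endian 6-bit groups. `first_byte_flag` is OR-ed
/// into the first emitted byte (pass `START_BIT` for an entry's start field).
fn write_varint_be_sketch(buf: &mut Vec<u8>, value: u32, first_byte_flag: u8) {
    // Collect groups least significant first; a u32 needs at most six.
    let mut groups = [0u8; 6];
    let mut n = 0;
    let mut rest = value;
    loop {
        groups[n] = (rest & 0x3f) as u8;
        n += 1;
        rest >>= 6;
        if rest == 0 {
            break;
        }
    }
    // Emit most significant group first.
    for i in (0..n).rev() {
        let mut byte = groups[i];
        if i != 0 {
            byte |= CONTINUATION_BIT; // more groups follow
        }
        if i == n - 1 {
            byte |= first_byte_flag;
        }
        buf.push(byte);
    }
}

/// Read one varint at `pos`, advancing it. Returns `None` on truncated input.
fn read_varint_be_sketch(bytes: &[u8], pos: &mut usize) -> Option<u32> {
    let mut byte = *bytes.get(*pos)?;
    *pos += 1;
    let mut value = u32::from(byte & 0x3f);
    while byte & CONTINUATION_BIT != 0 {
        byte = *bytes.get(*pos)?;
        *pos += 1;
        value = (value << 6) | u32::from(byte & 0x3f);
    }
    Some(value)
}
```

Under these assumptions, a reader can resynchronise on any byte with `0x80` set, which is why only the `start` field carries that flag.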
@@ -135,6 +135,72 @@ pub fn decode_exception_table(table: &[u8]) -> Vec<ExceptionTableEntry> {
entries
}
/// Parse linetable to build a boolean mask indicating which code units
/// have NO_LOCATION (line == -1). Returns a Vec<bool> of length `num_units`.
pub fn build_no_location_mask(linetable: &[u8], num_units: usize) -> Vec<bool> {
let mut mask = Vec::new();
mask.resize(num_units, false);
let mut pos = 0;
let mut unit_idx = 0;
while pos < linetable.len() && unit_idx < num_units {
let header = linetable[pos];
pos += 1;
let code = (header >> 3) & 0xf;
let length = ((header & 7) + 1) as usize;
let is_no_location = code == PyCodeLocationInfoKind::None as u8;
// Skip payload bytes based on location kind
match code {
0..=9 => pos += 1, // Short forms: 1 byte payload
10..=12 => pos += 2, // OneLine forms: 2 bytes payload
13 => {
// NoColumns: signed varint (line delta)
while pos < linetable.len() {
let b = linetable[pos];
pos += 1;
if b & 0x40 == 0 {
break;
}
}
}
14 => {
// Long form: signed varint (line delta) + 3 unsigned varints
// line_delta
while pos < linetable.len() {
let b = linetable[pos];
pos += 1;
if b & 0x40 == 0 {
break;
}
}
// end_line_delta, col+1, end_col+1
for _ in 0..3 {
while pos < linetable.len() {
let b = linetable[pos];
pos += 1;
if b & 0x40 == 0 {
break;
}
}
}
}
15 => {} // None: no payload
_ => {}
}
for _ in 0..length {
if unit_idx < num_units {
mask[unit_idx] = is_no_location;
unit_idx += 1;
}
}
}
mask
}
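The two building blocks of `build_no_location_mask` are the header split and the varint skip. A small sketch of both in isolation, using hypothetical helper names (`decode_header`, `skip_varint`) and the same bit layout as the function above. The repeated `while` loops in the function could plausibly be folded into a helper like this, though that is a suggestion rather than something the change does:

```rust
/// Split a linetable entry header into (location kind code, code units covered).
fn decode_header(header: u8) -> (u8, usize) {
    let code = (header >> 3) & 0xf; // bits 3..=6
    let length = ((header & 7) + 1) as usize; // low 3 bits store length - 1
    (code, length)
}

/// Skip one varint (6 payload bits per byte, 0x40 = continuation) starting
/// at `pos` and return the position just past it. Stops at end of input.
fn skip_varint(table: &[u8], mut pos: usize) -> usize {
    while pos < table.len() {
        let b = table[pos];
        pos += 1;
        if (b & 0x40) == 0 {
            break;
        }
    }
    pos
}
```

For example, header `0xF8` decodes to kind 15 (no location) covering one code unit, so the mask would get a single `true` entry for it.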
/// CPython 3.11+ linetable location info codes
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
#[repr(u8)]
@@ -204,7 +270,7 @@ impl PyCodeLocationInfoKind {
}
}
pub trait Constant: Sized {
pub trait Constant: Sized + Clone {
type Name: AsRef<str>;
/// Transforms the given Constant to a BorrowedConstant
@@ -226,6 +292,8 @@ impl Constant for ConstantData {
Self::Bytes { value } => Bytes { value },
Self::Code { code } => Code { code },
Self::Tuple { elements } => Tuple { elements },
Self::Slice { elements } => Slice { elements },
Self::Frozenset { elements } => Frozenset { elements },
Self::None => None,
Self::Ellipsis => Ellipsis,
}
@@ -334,6 +402,12 @@ impl<T> IndexMut<oparg::VarNum> for [T] {
}
}
/// Per-slot kind flags for localsplus (co_localspluskinds).
pub const CO_FAST_HIDDEN: u8 = 0x10;
pub const CO_FAST_LOCAL: u8 = 0x20;
pub const CO_FAST_CELL: u8 = 0x40;
pub const CO_FAST_FREE: u8 = 0x80;
/// Primary container of a single code object. Each python function has
/// a code object. Also a module has a code object.
#[derive(Clone)]
@@ -352,12 +426,14 @@ pub struct CodeObject<C: Constant = ConstantData> {
pub obj_name: C::Name,
/// Qualified name of the object (like CPython's co_qualname)
pub qualname: C::Name,
pub cell2arg: Option<Box<[i32]>>,
pub constants: Constants<C>,
pub names: Box<[C::Name]>,
pub varnames: Box<[C::Name]>,
pub cellvars: Box<[C::Name]>,
pub freevars: Box<[C::Name]>,
/// Per-slot kind flags: CO_FAST_LOCAL, CO_FAST_CELL, CO_FAST_FREE, CO_FAST_HIDDEN.
/// Length = nlocalsplus (nlocals + ncells + nfrees).
pub localspluskinds: Box<[u8]>,
/// Line number table (CPython 3.11+ format)
pub linetable: Box<[u8]>,
/// Exception handling table
@@ -559,6 +635,14 @@ impl Deref for CodeUnits {
}
impl CodeUnits {
/// Disable adaptive specialization by setting all counters to unreachable.
/// Used for CPython-compiled bytecode where specialization may not be safe.
pub fn disable_specialization(&self) {
for counter in self.adaptive_counters.iter() {
counter.store(UNREACHABLE_BACKOFF, Ordering::Relaxed);
}
}
/// Replace the opcode at `index` in-place without changing the arg byte.
/// Uses atomic Release store to ensure prior cache writes are visible
/// to threads that subsequently read the new opcode with Acquire.
@@ -767,14 +851,37 @@ impl CodeUnits {
/// ```
#[derive(Debug, Clone)]
pub enum ConstantData {
Tuple { elements: Vec<ConstantData> },
Integer { value: BigInt },
Float { value: f64 },
Complex { value: Complex64 },
Boolean { value: bool },
Str { value: Wtf8Buf },
Bytes { value: Vec<u8> },
Code { code: Box<CodeObject> },
Tuple {
elements: Vec<ConstantData>,
},
Integer {
value: BigInt,
},
Float {
value: f64,
},
Complex {
value: Complex64,
},
Boolean {
value: bool,
},
Str {
value: Wtf8Buf,
},
Bytes {
value: Vec<u8>,
},
Code {
code: Box<CodeObject>,
},
/// Constant slice(start, stop, step)
Slice {
elements: Box<[ConstantData; 3]>,
},
Frozenset {
elements: Vec<ConstantData>,
},
None,
Ellipsis,
}
@@ -796,6 +903,8 @@ impl PartialEq for ConstantData {
(Bytes { value: a }, Bytes { value: b }) => a == b,
(Code { code: a }, Code { code: b }) => core::ptr::eq(a.as_ref(), b.as_ref()),
(Tuple { elements: a }, Tuple { elements: b }) => a == b,
(Slice { elements: a }, Slice { elements: b }) => a == b,
(Frozenset { elements: a }, Frozenset { elements: b }) => a == b,
(None, None) => true,
(Ellipsis, Ellipsis) => true,
_ => false,
@@ -822,6 +931,8 @@ impl hash::Hash for ConstantData {
Bytes { value } => value.hash(state),
Code { code } => core::ptr::hash(code.as_ref(), state),
Tuple { elements } => elements.hash(state),
Slice { elements } => elements.hash(state),
Frozenset { elements } => elements.hash(state),
None => {}
Ellipsis => {}
}
@@ -838,6 +949,8 @@ pub enum BorrowedConstant<'a, C: Constant> {
Bytes { value: &'a [u8] },
Code { code: &'a CodeObject<C> },
Tuple { elements: &'a [C] },
Slice { elements: &'a [C; 3] },
Frozenset { elements: &'a [C] },
None,
Ellipsis,
}
@@ -875,6 +988,28 @@ impl<C: Constant> BorrowedConstant<'_, C> {
}
write!(f, ")")
}
BorrowedConstant::Slice { elements } => {
write!(f, "slice(")?;
elements[0].borrow_constant().fmt_display(f)?;
write!(f, ", ")?;
elements[1].borrow_constant().fmt_display(f)?;
write!(f, ", ")?;
elements[2].borrow_constant().fmt_display(f)?;
write!(f, ")")
}
BorrowedConstant::Frozenset { elements } => {
write!(f, "frozenset({{")?;
let mut first = true;
for c in *elements {
if first {
first = false
} else {
write!(f, ", ")?;
}
c.borrow_constant().fmt_display(f)?;
}
write!(f, "}})")
}
BorrowedConstant::None => write!(f, "None"),
BorrowedConstant::Ellipsis => write!(f, "..."),
}
@@ -905,6 +1040,15 @@ impl<C: Constant> BorrowedConstant<'_, C> {
.map(|c| c.borrow_constant().to_owned())
.collect(),
},
BorrowedConstant::Slice { elements } => Slice {
elements: Box::new(elements.each_ref().map(|c| c.borrow_constant().to_owned())),
},
BorrowedConstant::Frozenset { elements } => Frozenset {
elements: elements
.iter()
.map(|c| c.borrow_constant().to_owned())
.collect(),
},
BorrowedConstant::None => None,
BorrowedConstant::Ellipsis => Ellipsis,
}
@@ -1025,7 +1169,7 @@ impl<C: Constant> CodeObject<C> {
}
// arrow and offset
let arrow = if label_targets.contains(&Label::new(offset as u32)) {
let arrow = if label_targets.contains(&Label::from_u32(offset as u32)) {
">>"
} else {
" "
@@ -1080,7 +1224,7 @@ impl<C: Constant> CodeObject<C> {
kwonlyarg_count: self.kwonlyarg_count,
first_line_number: self.first_line_number,
max_stackdepth: self.max_stackdepth,
cell2arg: self.cell2arg,
localspluskinds: self.localspluskinds,
linetable: self.linetable,
exceptiontable: self.exceptiontable,
}
@@ -1112,7 +1256,7 @@ impl<C: Constant> CodeObject<C> {
kwonlyarg_count: self.kwonlyarg_count,
first_line_number: self.first_line_number,
max_stackdepth: self.max_stackdepth,
cell2arg: self.cell2arg.clone(),
localspluskinds: self.localspluskinds.clone(),
linetable: self.linetable.clone(),
exceptiontable: self.exceptiontable.clone(),
}
@@ -1141,7 +1285,8 @@ pub trait InstrDisplayContext {
fn get_varname(&self, var_num: oparg::VarNum) -> &str;
fn get_cell_name(&self, i: usize) -> &str;
/// Get name for a localsplus index (used by DEREF instructions).
fn get_localsplus_name(&self, var_num: oparg::VarNum) -> &str;
}
impl<C: Constant> InstrDisplayContext for CodeObject<C> {
@@ -1159,11 +1304,18 @@ impl<C: Constant> InstrDisplayContext for CodeObject<C> {
self.varnames[var_num].as_ref()
}
fn get_cell_name(&self, i: usize) -> &str {
self.cellvars
.get(i)
.unwrap_or_else(|| &self.freevars[i - self.cellvars.len()])
.as_ref()
fn get_localsplus_name(&self, var_num: oparg::VarNum) -> &str {
let idx = var_num.as_usize();
let nlocals = self.varnames.len();
if idx < nlocals {
self.varnames[idx].as_ref()
} else {
let cell_idx = idx - nlocals;
self.cellvars
.get(cell_idx)
.unwrap_or_else(|| &self.freevars[cell_idx - self.cellvars.len()])
.as_ref()
}
}
}
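`get_localsplus_name` treats the slot index as an offset into three arrays laid end to end: `varnames`, then `cellvars`, then `freevars`. A minimal sketch of that mapping over plain string slices, with a hypothetical free function that returns `None` instead of panicking on an out-of-range index. It assumes the three arrays do not overlap, which may not hold if a cell variable shares a slot with a local:

```rust
/// Look up the name for a localsplus slot, assuming the layout
/// [varnames..., cellvars..., freevars...].
fn localsplus_name<'a>(
    varnames: &'a [&'a str],
    cellvars: &'a [&'a str],
    freevars: &'a [&'a str],
    idx: usize,
) -> Option<&'a str> {
    varnames
        .iter()
        .chain(cellvars)
        .chain(freevars)
        .nth(idx)
        .copied()
}
```

With two locals, one cell and one free variable, index 2 resolves to the cell and index 3 to the free variable.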


@@ -130,7 +130,7 @@ pub enum Instruction {
namei: Arg<NameIdx>,
} = 61,
DeleteDeref {
i: Arg<NameIdx>,
i: Arg<oparg::VarNum>,
} = 62,
DeleteFast {
var_num: Arg<oparg::VarNum>,
@@ -189,7 +189,7 @@ pub enum Instruction {
consti: Arg<oparg::ConstIdx>,
} = 82,
LoadDeref {
i: Arg<NameIdx>,
i: Arg<oparg::VarNum>,
} = 83,
LoadFast {
var_num: Arg<oparg::VarNum>,
@@ -210,7 +210,7 @@ pub enum Instruction {
var_nums: Arg<oparg::VarNums>,
} = 89,
LoadFromDictOrDeref {
i: Arg<NameIdx>,
i: Arg<oparg::VarNum>,
} = 90,
LoadFromDictOrGlobals {
i: Arg<NameIdx>,
@@ -231,7 +231,7 @@ pub enum Instruction {
namei: Arg<LoadSuperAttr>,
} = 96,
MakeCell {
i: Arg<NameIdx>,
i: Arg<oparg::VarNum>,
} = 97,
MapAdd {
i: Arg<u32>,
@@ -273,7 +273,7 @@ pub enum Instruction {
namei: Arg<NameIdx>,
} = 110,
StoreDeref {
i: Arg<NameIdx>,
i: Arg<oparg::VarNum>,
} = 111,
StoreFast {
var_num: Arg<oparg::VarNum>,
@@ -304,7 +304,7 @@ pub enum Instruction {
} = 120,
// CPython 3.14 RESUME (128)
Resume {
context: Arg<oparg::ResumeType>,
context: Arg<oparg::ResumeContext>,
} = 128,
// CPython 3.14 specialized opcodes (129-211)
BinaryOpAddFloat = 129, // Placeholder
@@ -1020,7 +1020,7 @@ impl InstructionMetadata for Instruction {
Self::LoadLocals => (1, 0),
Self::LoadName { .. } => (1, 0),
Self::LoadSmallInt { .. } => (1, 0),
Self::LoadSpecial { .. } => (1, 1),
Self::LoadSpecial { .. } => (2, 1),
Self::LoadSuperAttr { .. } => (1 + (oparg & 1), 3),
Self::LoadSuperAttrAttr => (1, 3),
Self::LoadSuperAttrMethod => (2, 3),
@@ -1085,7 +1085,7 @@ impl InstructionMetadata for Instruction {
Self::UnpackSequenceList => (oparg, 1),
Self::UnpackSequenceTuple => (oparg, 1),
Self::UnpackSequenceTwoTuple => (2, 1),
Self::WithExceptStart => (6, 5),
Self::WithExceptStart => (7, 6),
Self::YieldValue { .. } => (1, 1),
};
@@ -1128,7 +1128,7 @@ impl InstructionMetadata for Instruction {
let varname = |var_num: oparg::VarNum| ctx.get_varname(var_num);
let name = |i: u32| ctx.get_name(i as usize);
let cell_name = |i: u32| ctx.get_cell_name(i as usize);
let cell_name = |i: oparg::VarNum| ctx.get_localsplus_name(i);
let fmt_const = |op: &str,
arg: OpArg,


@@ -276,48 +276,6 @@ impl fmt::Display for ConvertValueOparg {
}
}
/// Resume type for the RESUME instruction
#[derive(Copy, Clone, Debug, Hash, PartialEq, Eq)]
pub enum ResumeType {
AtFuncStart,
AfterYield,
AfterYieldFrom,
AfterAwait,
Other(u32),
}
impl From<u32> for ResumeType {
fn from(value: u32) -> Self {
match value {
0 => Self::AtFuncStart,
1 => Self::AfterYield,
2 => Self::AfterYieldFrom,
3 => Self::AfterAwait,
_ => Self::Other(value),
}
}
}
impl From<ResumeType> for u32 {
fn from(typ: ResumeType) -> Self {
match typ {
ResumeType::AtFuncStart => 0,
ResumeType::AfterYield => 1,
ResumeType::AfterYieldFrom => 2,
ResumeType::AfterAwait => 3,
ResumeType::Other(v) => v,
}
}
}
impl core::fmt::Display for ResumeType {
fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result {
u32::from(*self).fmt(f)
}
}
impl OpArgType for ResumeType {}
pub type NameIdx = u32;
impl OpArgType for u32 {}
@@ -382,16 +340,20 @@ oparg_enum!(
);
bitflagset::bitflag! {
/// `SET_FUNCTION_ATTRIBUTE` flags.
/// Bitmask: Defaults=0x01, KwOnly=0x02, Annotations=0x04,
/// Closure=0x08, Annotate=0x10, TypeParams=0x20.
/// Stored as bit position (0-5) by `bitflag!` macro.
#[derive(Debug, Copy, Clone, PartialEq, Eq, Hash)]
#[repr(u8)]
pub enum MakeFunctionFlag {
Closure = 0,
Annotations = 1,
KwOnlyDefaults = 2,
Defaults = 3,
TypeParams = 4,
Defaults = 0,
KwOnlyDefaults = 1,
Annotations = 2,
Closure = 3,
/// PEP 649: __annotate__ function closure (instead of __annotations__ dict)
Annotate = 5,
Annotate = 4,
TypeParams = 5,
}
}
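The enum stores each flag as a bit position, while the oparg on the wire is the corresponding power of two. A standalone sketch of that conversion, using the new discriminants from this hunk. The `Flag` enum and the `to_mask`/`from_mask` helpers are hypothetical and only mirror the idea:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
enum Flag {
    Defaults = 0,
    KwOnlyDefaults = 1,
    Annotations = 2,
    Closure = 3,
    Annotate = 4,
    TypeParams = 5,
}

/// Bit position -> single-bit mask.
fn to_mask(flag: Flag) -> u32 {
    1u32 << (flag as u32)
}

/// Single-bit mask -> flag. Rejects zero, multi-bit values and unknown bits.
fn from_mask(value: u32) -> Option<Flag> {
    const ALL: [Flag; 6] = [
        Flag::Defaults,
        Flag::KwOnlyDefaults,
        Flag::Annotations,
        Flag::Closure,
        Flag::Annotate,
        Flag::TypeParams,
    ];
    if !value.is_power_of_two() {
        return None;
    }
    ALL.get(value.trailing_zeros() as usize).copied()
}
```

Keeping the discriminants equal to the bit positions means encoding is a single shift, and decoding only has to validate that exactly one known bit is set.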
@@ -403,33 +365,86 @@ bitflagset::bitflagset! {
impl TryFrom<u32> for MakeFunctionFlag {
type Error = MarshalError;
/// Decode from CPython-compatible power-of-two value
fn try_from(value: u32) -> Result<Self, Self::Error> {
Self::try_from(value as u8).map_err(|_| MarshalError::InvalidBytecode)
match value {
0x01 => Ok(Self::Defaults),
0x02 => Ok(Self::KwOnlyDefaults),
0x04 => Ok(Self::Annotations),
0x08 => Ok(Self::Closure),
0x10 => Ok(Self::Annotate),
0x20 => Ok(Self::TypeParams),
_ => Err(MarshalError::InvalidBytecode),
}
}
}
impl From<MakeFunctionFlag> for u32 {
/// Encode as CPython-compatible power-of-two value
fn from(flag: MakeFunctionFlag) -> Self {
flag as u32
1u32 << (flag as u32)
}
}
impl OpArgType for MakeFunctionFlag {}
oparg_enum!(
/// The possible comparison operators.
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
pub enum ComparisonOperator {
// be intentional with bits so that we can do eval_ord with just a bitwise and
// bits: | Equal | Greater | Less |
Less = 0b001,
Greater = 0b010,
NotEqual = 0b011,
Equal = 0b100,
LessOrEqual = 0b101,
GreaterOrEqual = 0b110,
/// `COMPARE_OP` arg is `(cmp_index << 5) | mask`. Only the upper
/// 3 bits identify the comparison; the lower 5 bits are an inline
/// cache mask for adaptive specialization.
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
pub enum ComparisonOperator {
Less,
LessOrEqual,
Equal,
NotEqual,
Greater,
GreaterOrEqual,
}
impl TryFrom<u8> for ComparisonOperator {
type Error = MarshalError;
fn try_from(value: u8) -> Result<Self, Self::Error> {
Self::try_from(value as u32)
}
);
}
impl TryFrom<u32> for ComparisonOperator {
type Error = MarshalError;
/// Decode from `COMPARE_OP` arg: `(cmp_index << 5) | mask`.
fn try_from(value: u32) -> Result<Self, Self::Error> {
match value >> 5 {
0 => Ok(Self::Less),
1 => Ok(Self::LessOrEqual),
2 => Ok(Self::Equal),
3 => Ok(Self::NotEqual),
4 => Ok(Self::Greater),
5 => Ok(Self::GreaterOrEqual),
_ => Err(MarshalError::InvalidBytecode),
}
}
}
impl From<ComparisonOperator> for u8 {
/// Encode as `cmp_index << 5` (mask bits zero).
fn from(value: ComparisonOperator) -> Self {
match value {
ComparisonOperator::Less => 0,
ComparisonOperator::LessOrEqual => 1 << 5,
ComparisonOperator::Equal => 2 << 5,
ComparisonOperator::NotEqual => 3 << 5,
ComparisonOperator::Greater => 4 << 5,
ComparisonOperator::GreaterOrEqual => 5 << 5,
}
}
}
impl From<ComparisonOperator> for u32 {
fn from(value: ComparisonOperator) -> Self {
Self::from(u8::from(value))
}
}
impl OpArgType for ComparisonOperator {}
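The new `COMPARE_OP` layout packs the comparison index into the upper three bits and leaves the lower five for the cache mask. A small self-contained sketch of encoding and decoding under that layout. The `Cmp` enum and the `encode`/`decode` functions are hypothetical and only mirror the conversions in this hunk:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Cmp {
    Less,
    LessOrEqual,
    Equal,
    NotEqual,
    Greater,
    GreaterOrEqual,
}

/// Build an oparg as `(cmp_index << 5) | mask`; only the low 5 mask bits are kept.
fn encode(op: Cmp, mask: u32) -> u32 {
    ((op as u32) << 5) | (mask & 0x1f)
}

/// Recover the comparison from an oparg, ignoring the mask bits.
fn decode(oparg: u32) -> Option<Cmp> {
    match oparg >> 5 {
        0 => Some(Cmp::Less),
        1 => Some(Cmp::LessOrEqual),
        2 => Some(Cmp::Equal),
        3 => Some(Cmp::NotEqual),
        4 => Some(Cmp::Greater),
        5 => Some(Cmp::GreaterOrEqual),
        _ => None,
    }
}
```

Because decoding shifts the mask away, an oparg with any mask bits set still maps back to the same operator.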
oparg_enum!(
/// The possible Binary operators
@@ -630,6 +645,10 @@ oparg_enum!(
BuiltinAll = 3,
/// Built-in `any` function
BuiltinAny = 4,
/// Built-in `list` type
BuiltinList = 5,
/// Built-in `set` type
BuiltinSet = 6,
}
);
@@ -641,6 +660,8 @@ impl fmt::Display for CommonConstant {
Self::BuiltinTuple => "tuple",
Self::BuiltinAll => "all",
Self::BuiltinAny => "any",
Self::BuiltinList => "list",
Self::BuiltinSet => "set",
};
write!(f, "{name}")
}
@@ -699,14 +720,8 @@ macro_rules! newtype_oparg {
impl $name {
/// Creates a new [`$name`] instance.
#[must_use]
pub const fn new(value: u32) -> Self {
Self(value)
}
/// Alias to [`$name::new`].
#[must_use]
pub const fn from_u32(value: u32) -> Self {
Self::new(value)
Self(value)
}
/// Returns the oparg as a `u32` value.
@@ -786,15 +801,119 @@ newtype_oparg!(
pub struct Label(u32)
);
newtype_oparg!(
/// Context for [`Instruction::Resume`].
///
/// The oparg consists of two parts:
/// 1. [`ResumeContext::location`]: Indicates where the instruction occurs.
/// 2. [`ResumeContext::is_exception_depth1`]: Whether the instruction is at except-depth 1.
#[derive(Clone, Copy)]
#[repr(transparent)]
pub struct ResumeContext(u32)
);
impl ResumeContext {
/// [CPython `RESUME_OPARG_LOCATION_MASK`](https://github.com/python/cpython/blob/v3.14.3/Include/internal/pycore_opcode_utils.h#L84)
pub const LOCATION_MASK: u32 = 0x3;
/// [CPython `RESUME_OPARG_DEPTH1_MASK`](https://github.com/python/cpython/blob/v3.14.3/Include/internal/pycore_opcode_utils.h#L85)
pub const DEPTH1_MASK: u32 = 0x4;
#[must_use]
pub const fn new(location: ResumeLocation, is_exception_depth1: bool) -> Self {
let value = if is_exception_depth1 {
Self::DEPTH1_MASK
} else {
0
};
Self::from_u32(location.as_u32() | value)
}
/// Resume location is determined by [`Self::LOCATION_MASK`].
#[must_use]
pub fn location(&self) -> ResumeLocation {
// SAFETY: `LOCATION_MASK` is 0x3, so the masked value is always in 0..=3, and each of those values maps to a `ResumeLocation` variant.
unsafe { ResumeLocation::try_from(self.as_u32() & Self::LOCATION_MASK).unwrap_unchecked() }
}
/// True if the bit at [`Self::DEPTH1_MASK`] is on.
#[must_use]
pub const fn is_exception_depth1(&self) -> bool {
(self.as_u32() & Self::DEPTH1_MASK) != 0
}
}
#[derive(Copy, Clone)]
pub enum ResumeLocation {
/// At the start of a function, which is neither a generator, coroutine nor an async generator.
AtFuncStart,
/// After a `yield` expression.
AfterYield,
/// After a `yield from` expression.
AfterYieldFrom,
/// After an `await` expression.
AfterAwait,
}
impl From<ResumeLocation> for ResumeContext {
fn from(location: ResumeLocation) -> Self {
Self::new(location, false)
}
}
impl TryFrom<u32> for ResumeLocation {
type Error = MarshalError;
fn try_from(value: u32) -> Result<Self, Self::Error> {
Ok(match value {
0 => Self::AtFuncStart,
1 => Self::AfterYield,
2 => Self::AfterYieldFrom,
3 => Self::AfterAwait,
_ => return Err(Self::Error::InvalidBytecode),
})
}
}
impl ResumeLocation {
#[must_use]
pub const fn as_u8(&self) -> u8 {
match self {
Self::AtFuncStart => 0,
Self::AfterYield => 1,
Self::AfterYieldFrom => 2,
Self::AfterAwait => 3,
}
}
#[must_use]
pub const fn as_u32(&self) -> u32 {
self.as_u8() as u32
}
}
impl From<ResumeLocation> for u8 {
fn from(location: ResumeLocation) -> Self {
location.as_u8()
}
}
impl From<ResumeLocation> for u32 {
fn from(location: ResumeLocation) -> Self {
location.as_u32()
}
}
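
Review note: the bit layout of the `Resume` oparg described above (location in the low two bits, depth-1 flag in the third) can be illustrated with a standalone sketch. The mask values are taken from the diff; the free functions `pack` and `unpack` are hypothetical and operate on plain integers rather than on `ResumeContext`.

```rust
/// Low two bits: resume location (0..=3).
pub const LOCATION_MASK: u32 = 0x3;
/// Third bit: set when the instruction is at except-depth 1.
pub const DEPTH1_MASK: u32 = 0x4;

/// Pack a location index and the depth-1 flag into one oparg.
/// Hypothetical helper; bits of `location` above the mask are dropped.
pub fn pack(location: u32, is_exception_depth1: bool) -> u32 {
    let depth = if is_exception_depth1 { DEPTH1_MASK } else { 0 };
    (location & LOCATION_MASK) | depth
}

/// Split a packed oparg back into (location, is_exception_depth1).
pub fn unpack(oparg: u32) -> (u32, bool) {
    (oparg & LOCATION_MASK, (oparg & DEPTH1_MASK) != 0)
}
```

Because the two masks do not overlap, packing and unpacking are independent of each other.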
impl VarNums {
#[must_use]
pub const fn idx_1(self) -> VarNum {
VarNum::new(self.0 >> 4)
VarNum::from_u32(self.0 >> 4)
}
#[must_use]
pub const fn idx_2(self) -> VarNum {
VarNum::new(self.0 & 15)
VarNum::from_u32(self.0 & 15)
}
#[must_use]
@@ -805,8 +924,8 @@ impl VarNums {
impl LoadAttr {
#[must_use]
pub fn builder() -> LoadAttrBuilder {
LoadAttrBuilder::default()
pub const fn new(name_idx: u32, is_method: bool) -> Self {
Self::from_u32((name_idx << 1) | (is_method as u32))
}
#[must_use]
@@ -820,36 +939,10 @@ impl LoadAttr {
}
}
#[derive(Clone, Copy, Default)]
pub struct LoadAttrBuilder {
name_idx: u32,
is_method: bool,
}
impl LoadAttrBuilder {
#[must_use]
pub const fn build(self) -> LoadAttr {
let value = (self.name_idx << 1) | (self.is_method as u32);
LoadAttr::new(value)
}
#[must_use]
pub const fn name_idx(mut self, value: u32) -> Self {
self.name_idx = value;
self
}
#[must_use]
pub const fn is_method(mut self, value: bool) -> Self {
self.is_method = value;
self
}
}
impl LoadSuperAttr {
#[must_use]
pub fn builder() -> LoadSuperAttrBuilder {
LoadSuperAttrBuilder::default()
pub const fn new(name_idx: u32, is_load_method: bool, has_class: bool) -> Self {
Self::from_u32((name_idx << 2) | (is_load_method as u32) | ((has_class as u32) << 1))
}
#[must_use]
@@ -867,43 +960,3 @@ impl LoadSuperAttr {
(self.0 & 2) == 2
}
}
#[derive(Clone, Copy, Default)]
pub struct LoadSuperAttrBuilder {
name_idx: u32,
is_load_method: bool,
has_class: bool,
}
impl LoadSuperAttrBuilder {
#[must_use]
pub const fn build(self) -> LoadSuperAttr {
let value =
(self.name_idx << 2) | ((self.has_class as u32) << 1) | (self.is_load_method as u32);
LoadSuperAttr::new(value)
}
#[must_use]
pub const fn name_idx(mut self, value: u32) -> Self {
self.name_idx = value;
self
}
#[must_use]
pub const fn is_load_method(mut self, value: bool) -> Self {
self.is_load_method = value;
self
}
#[must_use]
pub const fn has_class(mut self, value: bool) -> Self {
self.has_class = value;
self
}
}
impl From<LoadSuperAttrBuilder> for LoadSuperAttr {
fn from(builder: LoadSuperAttrBuilder) -> Self {
builder.build()
}
}
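
Review note: with the builders removed, `LoadAttr::new` and `LoadSuperAttr::new` pack their flags directly into the low bits of the oparg. The sketch below mirrors that packing on plain `u32` values; the function names are hypothetical and exist only for illustration.

```rust
/// Flag in bit 0, name index in the bits above it.
pub const fn pack_load_attr(name_idx: u32, is_method: bool) -> u32 {
    (name_idx << 1) | (is_method as u32)
}

/// `is_load_method` in bit 0, `has_class` in bit 1, name index above both.
pub const fn pack_load_super_attr(name_idx: u32, is_load_method: bool, has_class: bool) -> u32 {
    (name_idx << 2) | ((has_class as u32) << 1) | (is_load_method as u32)
}

/// Recover (name_idx, is_load_method, has_class) from a packed oparg.
pub const fn unpack_load_super_attr(oparg: u32) -> (u32, bool, bool) {
    (oparg >> 2, (oparg & 1) == 1, (oparg & 2) == 2)
}
```

Note that a name index using the top one or two bits of a `u32` would lose them in the shift; the sketch does not guard against that.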

File diff suppressed because it is too large


@@ -1,12 +1,14 @@
//! Variable-length integer encoding utilities.
//!
//! Uses 6-bit chunks with a continuation bit (0x40) to encode integers.
//! Used for exception tables and line number tables.
//! Two encodings are used:
//! - **Little-endian** (low bits first): linetable
//! - **Big-endian** (high bits first): exception tables
//!
//! Both use 6-bit chunks with 0x40 as the continuation bit.
use alloc::vec::Vec;
/// Write a variable-length unsigned integer using 6-bit chunks.
/// Returns the number of bytes written.
/// Write a little-endian varint (used by linetable).
#[inline]
pub fn write_varint(buf: &mut Vec<u8>, mut val: u32) -> usize {
let start_len = buf.len();
@@ -18,12 +20,10 @@ pub fn write_varint(buf: &mut Vec<u8>, mut val: u32) -> usize {
buf.len() - start_len
}
/// Write a variable-length signed integer.
/// Returns the number of bytes written.
/// Write a little-endian signed varint.
#[inline]
pub fn write_signed_varint(buf: &mut Vec<u8>, val: i32) -> usize {
let uval = if val < 0 {
// (0 - val as u32) handles INT_MIN correctly
((0u32.wrapping_sub(val as u32)) << 1) | 1
} else {
(val as u32) << 1
@@ -31,70 +31,72 @@ pub fn write_signed_varint(buf: &mut Vec<u8>, val: i32) -> usize {
write_varint(buf, uval)
}
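
Review note: the sign handling in `write_signed_varint` moves the sign into the lowest bit before the value is handed to the unsigned writer. A minimal sketch of just that step, using a hypothetical helper name `fold_sign`:

```rust
/// Map a signed value to an unsigned one: magnitude shifted left by one,
/// with the lowest bit set for negative inputs.
pub fn fold_sign(val: i32) -> u32 {
    if val < 0 {
        // Wrapping subtraction computes the magnitude without overflowing.
        (0u32.wrapping_sub(val as u32) << 1) | 1
    } else {
        (val as u32) << 1
    }
}
```

Small magnitudes of either sign therefore stay small after folding, which keeps the encoded varint short.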
/// Write a variable-length unsigned integer with a start marker (0x80 bit).
/// Used for exception table entries where each entry starts with the marker.
/// Write a big-endian varint (used by exception tables).
pub fn write_varint_be(buf: &mut Vec<u8>, val: u32) -> usize {
let start_len = buf.len();
if val >= 1 << 30 {
buf.push(0x40 | ((val >> 30) & 0x3f) as u8);
}
if val >= 1 << 24 {
buf.push(0x40 | ((val >> 24) & 0x3f) as u8);
}
if val >= 1 << 18 {
buf.push(0x40 | ((val >> 18) & 0x3f) as u8);
}
if val >= 1 << 12 {
buf.push(0x40 | ((val >> 12) & 0x3f) as u8);
}
if val >= 1 << 6 {
buf.push(0x40 | ((val >> 6) & 0x3f) as u8);
}
buf.push((val & 0x3f) as u8);
buf.len() - start_len
}
/// Write a big-endian varint with the start marker (0x80) on the first byte.
pub fn write_varint_with_start(data: &mut Vec<u8>, val: u32) {
let start_pos = data.len();
write_varint(data, val);
// Set start bit on first byte
write_varint_be(data, val);
if let Some(first) = data.get_mut(start_pos) {
*first |= 0x80;
}
}
/// Read a variable-length unsigned integer that starts with a start marker (0x80 bit).
/// Returns None if not at a valid start byte or end of data.
/// Read a big-endian varint with start marker (0x80).
pub fn read_varint_with_start(data: &[u8], pos: &mut usize) -> Option<u32> {
if *pos >= data.len() {
return None;
}
let first = data[*pos];
if first & 0x80 == 0 {
return None; // Not a start byte
return None;
}
*pos += 1;
let mut val = (first & 0x3f) as u32;
let mut shift = 6;
let mut has_continuation = first & 0x40 != 0;
while has_continuation && *pos < data.len() {
let byte = data[*pos];
if byte & 0x80 != 0 {
break; // Next entry start
}
let mut cont = first & 0x40 != 0;
while cont && *pos < data.len() {
let b = data[*pos];
*pos += 1;
val |= ((byte & 0x3f) as u32) << shift;
shift += 6;
has_continuation = byte & 0x40 != 0;
val = (val << 6) | (b & 0x3f) as u32;
cont = b & 0x40 != 0;
}
Some(val)
}
/// Read a variable-length unsigned integer.
/// Returns None if end of data or malformed.
/// Read a big-endian varint (no start marker).
pub fn read_varint(data: &[u8], pos: &mut usize) -> Option<u32> {
if *pos >= data.len() {
return None;
}
let mut val = 0u32;
let mut shift = 0;
loop {
if *pos >= data.len() {
return None;
}
let byte = data[*pos];
if byte & 0x80 != 0 && shift > 0 {
break; // Next entry start
}
let first = data[*pos];
*pos += 1;
let mut val = (first & 0x3f) as u32;
let mut cont = first & 0x40 != 0;
while cont && *pos < data.len() {
let b = data[*pos];
*pos += 1;
val |= ((byte & 0x3f) as u32) << shift;
shift += 6;
if byte & 0x40 == 0 {
break;
}
val = (val << 6) | (b & 0x3f) as u32;
cont = b & 0x40 != 0;
}
Some(val)
}
@@ -104,37 +106,39 @@ mod tests {
use super::*;
#[test]
fn test_write_read_varint() {
fn test_le_varint_roundtrip() {
// Little-endian is only used internally in linetable,
// no read function needed outside of linetable parsing.
let mut buf = Vec::new();
write_varint(&mut buf, 0);
write_varint(&mut buf, 63);
write_varint(&mut buf, 64);
write_varint(&mut buf, 4095);
// Values: 0, 63, 64, 4095
assert_eq!(buf.len(), 1 + 1 + 2 + 2);
}
#[test]
fn test_write_read_signed_varint() {
let mut buf = Vec::new();
write_signed_varint(&mut buf, 0);
write_signed_varint(&mut buf, 1);
write_signed_varint(&mut buf, -1);
write_signed_varint(&mut buf, i32::MIN);
assert!(!buf.is_empty());
fn test_be_varint_roundtrip() {
for &val in &[0u32, 1, 63, 64, 127, 128, 4095, 4096, 1_000_000] {
let mut buf = Vec::new();
write_varint_be(&mut buf, val);
let mut pos = 0;
assert_eq!(read_varint(&buf, &mut pos), Some(val), "val={val}");
assert_eq!(pos, buf.len());
}
}
#[test]
fn test_varint_with_start() {
fn test_be_varint_with_start() {
let mut buf = Vec::new();
write_varint_with_start(&mut buf, 42);
write_varint_with_start(&mut buf, 100);
write_varint_with_start(&mut buf, 71);
let mut pos = 0;
assert_eq!(read_varint_with_start(&buf, &mut pos), Some(42));
assert_eq!(read_varint_with_start(&buf, &mut pos), Some(100));
assert_eq!(read_varint_with_start(&buf, &mut pos), Some(71));
assert_eq!(read_varint_with_start(&buf, &mut pos), None);
}
}
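
Review note: the big-endian encoding introduced here (6-bit chunks, most significant first, 0x40 as the continuation bit) can also be written as a loop. The sketch below is an assumption-laden illustration, not the code in this diff: `write_be` and `read_be` are hypothetical names, and unlike `read_varint` above, `read_be` returns `None` when the data ends in the middle of a value.

```rust
/// Append `val` as 6-bit chunks, most significant chunk first.
/// Every byte except the last carries the 0x40 continuation bit.
pub fn write_be(buf: &mut Vec<u8>, val: u32) {
    // Find the highest non-empty 6-bit chunk (a u32 needs at most six).
    let mut shift: u32 = 30;
    while shift > 0 && (val >> shift) == 0 {
        shift -= 6;
    }
    // Emit the leading chunks with the continuation bit set.
    while shift > 0 {
        buf.push(0x40 | ((val >> shift) & 0x3f) as u8);
        shift -= 6;
    }
    // The final chunk has no continuation bit.
    buf.push((val & 0x3f) as u8);
}

/// Read one value starting at `*pos`, advancing `*pos` past it.
/// Returns `None` on empty or truncated input (a choice made for this sketch).
pub fn read_be(data: &[u8], pos: &mut usize) -> Option<u32> {
    let first = *data.get(*pos)?;
    *pos += 1;
    let mut val = u32::from(first & 0x3f);
    let mut cont = first & 0x40 != 0;
    while cont {
        let b = *data.get(*pos)?;
        *pos += 1;
        val = (val << 6) | u32::from(b & 0x3f);
        cont = b & 0x40 != 0;
    }
    Some(val)
}
```

For example, 64 encodes to the two bytes `0x41 0x00`: the first byte holds chunk value 1 plus the continuation bit, the second holds chunk value 0.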


@@ -1,7 +1,7 @@
[package]
name = "rustpython-compiler-source"
description = "(DEPRECATED) RustPython Source and Index"
version = "0.5.0+deprecated"
version = "0.4.1+deprecated"
authors.workspace = true
edition.workspace = true
rust-version.workspace = true


@@ -1,5 +1,6 @@
[package]
name = "rustpython-doc"
description = "Python __doc__ database for RustPython"
version.workspace = true
authors.workspace = true
edition.workspace = true


@@ -17,9 +17,9 @@ num-traits = { workspace = true }
thiserror = { workspace = true }
libffi = { workspace = true }
cranelift = "0.129.1"
cranelift-jit = "0.129.1"
cranelift-module = "0.129.1"
cranelift = "0.130.0"
cranelift-jit = "0.130.0"
cranelift-module = "0.130.0"
[dev-dependencies]
rustpython-derive = { workspace = true }


@@ -162,7 +162,7 @@ impl<'a, 'b> FunctionCompiler<'a, 'b> {
let target = after
.checked_add(u32::from(arg))
.ok_or(JitCompileError::BadBytecode)?;
Ok(Label::new(target))
Ok(Label::from_u32(target))
}
fn jump_target_backward(
@@ -177,7 +177,7 @@ impl<'a, 'b> FunctionCompiler<'a, 'b> {
let target = after
.checked_sub(u32::from(arg))
.ok_or(JitCompileError::BadBytecode)?;
Ok(Label::new(target))
Ok(Label::from_u32(target))
}
fn instruction_target(
@@ -232,7 +232,7 @@ impl<'a, 'b> FunctionCompiler<'a, 'b> {
let mut in_unreachable_code = false;
for (offset, &raw_instr) in clean_instructions.iter().enumerate() {
let label = Label::new(offset as u32);
let label = Label::from_u32(offset as u32);
let (instruction, arg) = arg_state.get(raw_instr);
// If this is a label that some earlier jump can target,
@@ -624,7 +624,10 @@ impl<'a, 'b> FunctionCompiler<'a, 'b> {
_ => Err(JitCompileError::NotSupported),
}
}
Instruction::ExtendedArg | Instruction::Cache => Ok(()),
Instruction::ExtendedArg
| Instruction::Cache
| Instruction::MakeCell { .. }
| Instruction::CopyFreeVars { .. } => Ok(()),
Instruction::JumpBackward { .. }
| Instruction::JumpBackwardNoInterrupt { .. }
@@ -733,6 +736,28 @@ impl<'a, 'b> FunctionCompiler<'a, 'b> {
let val = self.stack.pop().ok_or(JitCompileError::BadBytecode)?;
self.store_variable(var_num.get(arg), val)
}
Instruction::StoreFastLoadFast { var_nums } => {
let oparg = var_nums.get(arg);
let (store_idx, load_idx) = oparg.indexes();
let val = self.stack.pop().ok_or(JitCompileError::BadBytecode)?;
self.store_variable(store_idx, val)?;
let local = self.variables[load_idx]
.as_ref()
.ok_or(JitCompileError::BadBytecode)?;
self.stack.push(JitValue::from_type_and_value(
local.ty.clone(),
self.builder.use_var(local.var),
));
Ok(())
}
Instruction::StoreFastStoreFast { var_nums } => {
let oparg = var_nums.get(arg);
let (idx1, idx2) = oparg.indexes();
let val1 = self.stack.pop().ok_or(JitCompileError::BadBytecode)?;
self.store_variable(idx1, val1)?;
let val2 = self.stack.pop().ok_or(JitCompileError::BadBytecode)?;
self.store_variable(idx2, val2)
}
Instruction::Swap { i: index } => {
let len = self.stack.len();
let i = len - 1;


@@ -42,6 +42,7 @@ impl Function {
}
}
#[allow(dead_code)]
#[derive(Debug, Clone)]
enum StackValue {
String(String),
@@ -49,6 +50,8 @@ enum StackValue {
Map(HashMap<Wtf8Buf, StackValue>),
Code(Box<CodeObject>),
Function(Function),
Slice(Box<[StackValue; 3]>),
Frozenset(Vec<StackValue>),
}
impl From<ConstantData> for StackValue {
@@ -59,6 +62,13 @@ impl From<ConstantData> for StackValue {
}
ConstantData::None => StackValue::None,
ConstantData::Code { code } => StackValue::Code(code),
ConstantData::Slice { elements } => {
let [start, stop, step] = *elements;
StackValue::Slice(Box::new([start.into(), stop.into(), step.into()]))
}
ConstantData::Frozenset { elements } => {
StackValue::Frozenset(elements.into_iter().map(Into::into).collect())
}
c => unimplemented!("constant {:?} isn't yet supported in py_function!", c),
}
}


@@ -9,13 +9,13 @@ license = { workspace = true }
rust-version = { workspace = true }
[dependencies]
rustpython-unicode = { workspace = true, default-features = false }
rustpython-wtf8 = { workspace = true }
hexf-parse = "0.2.1"
is-macro.workspace = true
lexical-parse-float = { version = "1.0.6", features = ["format"] }
num-traits = { workspace = true }
unic-ucd-category = { workspace = true }
[dev-dependencies]
rand = { workspace = true }


@@ -1,15 +0,0 @@
use unic_ucd_category::GeneralCategory;
/// According to Python, the following categories aren't printable:
/// * Cc (Other, Control)
/// * Cf (Other, Format)
/// * Cs (Other, Surrogate)
/// * Co (Other, Private Use)
/// * Cn (Other, Not Assigned)
/// * Zl Separator, Line ('\u2028', LINE SEPARATOR)
/// * Zp Separator, Paragraph ('\u2029', PARAGRAPH SEPARATOR)
/// * Zs (Separator, Space) other than ASCII space('\x20').
pub fn is_printable(c: char) -> bool {
let cat = GeneralCategory::of(c);
!(cat.is_other() || cat.is_separator())
}


@@ -204,7 +204,7 @@ impl UnicodeEscape<'_> {
'\\' | '\t' | '\r' | '\n' => 2,
ch if ch < ' ' || ch as u32 == 0x7f => 4, // \xHH
ch if ch.is_ascii() => 1,
ch if crate::char::is_printable(ch) => {
ch if rustpython_unicode::classify::is_repr_printable(ch as u32) => {
// max = std::cmp::max(ch, max);
ch.len_utf8()
}
@@ -238,7 +238,9 @@ impl UnicodeEscape<'_> {
ch if ch.is_ascii() => {
write!(formatter, "\\x{:02x}", ch as u8)
}
ch if crate::char::is_printable(ch) => formatter.write_char(ch),
ch if rustpython_unicode::classify::is_repr_printable(ch as u32) => {
formatter.write_char(ch)
}
'\0'..='\u{ff}' => {
write!(formatter, "\\x{:02x}", ch as u32)
}


@@ -2,7 +2,6 @@
extern crate alloc;
pub mod char;
pub mod complex;
pub mod escape;
pub mod float;


@@ -2,7 +2,7 @@
name = "rustpython-pylib"
description = "A subset of the Python standard library for use with RustPython"
license-file = "Lib/PSF-LICENSE"
include = ["Cargo.toml", "src/**/*.rs", "Lib/", "!Lib/**/test/", "!Lib/**/*.pyc"]
include = ["Cargo.toml", "build.rs", "src/**/*.rs", "Lib/", "!Lib/**/test/", "!Lib/**/*.pyc"]
authors = ["CPython Developers"]
version.workspace = true
edition.workspace = true


@@ -15,6 +15,7 @@ name = "benches"
harness = false
[dependencies]
rustpython-unicode = { workspace = true, default-features = false }
rustpython-wtf8 = { workspace = true }
num_enum = { workspace = true }
bitflags = { workspace = true }


@@ -1,14 +1,10 @@
// good luck to those that follow; here be dragons
use crate::string::{
is_digit, is_linebreak, is_loc_word, is_space, is_uni_digit, is_uni_linebreak, is_uni_space,
is_uni_word, is_word, lower_ascii, lower_locate, lower_unicode, upper_locate, upper_unicode,
};
use super::{MAXREPEAT, SreAtCode, SreCatCode, SreInfo, SreOpcode, StrDrive, StringCursor};
use alloc::{vec, vec::Vec};
use core::{convert::TryFrom, ptr::null};
use optional::Optioned;
use rustpython_unicode::regex as unicode_regex;
#[derive(Debug, Clone, Copy)]
pub struct Request<'a, S> {
@@ -659,10 +655,10 @@ fn _match<S: StrDrive>(req: &Request<'_, S>, state: &mut State, mut ctx: MatchCo
}
SreOpcode::IN => general_op_in!(charset),
SreOpcode::IN_IGNORE => {
general_op_in!(|set, c| charset(set, lower_ascii(c)))
general_op_in!(|set, c| charset(set, unicode_regex::lower_ascii(c)))
}
SreOpcode::IN_UNI_IGNORE => {
general_op_in!(|set, c| charset(set, lower_unicode(c)))
general_op_in!(|set, c| charset(set, unicode_regex::lower_unicode(c)))
}
SreOpcode::IN_LOC_IGNORE => general_op_in!(charset_loc_ignore),
SreOpcode::MARK => {
@@ -803,25 +799,31 @@ fn _match<S: StrDrive>(req: &Request<'_, S>, state: &mut State, mut ctx: MatchCo
SreOpcode::LITERAL => general_op_literal!(|code, c| code == c),
SreOpcode::NOT_LITERAL => general_op_literal!(|code, c| code != c),
SreOpcode::LITERAL_IGNORE => {
general_op_literal!(|code, c| code == lower_ascii(c))
general_op_literal!(|code, c| code == unicode_regex::lower_ascii(c))
}
SreOpcode::NOT_LITERAL_IGNORE => {
general_op_literal!(|code, c| code != lower_ascii(c))
general_op_literal!(|code, c| code != unicode_regex::lower_ascii(c))
}
SreOpcode::LITERAL_UNI_IGNORE => {
general_op_literal!(|code, c| code == lower_unicode(c))
general_op_literal!(|code, c| code == unicode_regex::lower_unicode(c))
}
SreOpcode::NOT_LITERAL_UNI_IGNORE => {
general_op_literal!(|code, c| code != lower_unicode(c))
general_op_literal!(|code, c| code != unicode_regex::lower_unicode(c))
}
SreOpcode::LITERAL_LOC_IGNORE => general_op_literal!(char_loc_ignore),
SreOpcode::NOT_LITERAL_LOC_IGNORE => {
general_op_literal!(|code, c| !char_loc_ignore(code, c))
}
SreOpcode::GROUPREF => general_op_groupref!(|x| x),
SreOpcode::GROUPREF_IGNORE => general_op_groupref!(lower_ascii),
SreOpcode::GROUPREF_LOC_IGNORE => general_op_groupref!(lower_locate),
SreOpcode::GROUPREF_UNI_IGNORE => general_op_groupref!(lower_unicode),
SreOpcode::GROUPREF_IGNORE => {
general_op_groupref!(unicode_regex::lower_ascii)
}
SreOpcode::GROUPREF_LOC_IGNORE => {
general_op_groupref!(unicode_regex::lower_locale)
}
SreOpcode::GROUPREF_UNI_IGNORE => {
general_op_groupref!(unicode_regex::lower_unicode)
}
SreOpcode::GROUPREF_EXISTS => {
let (group_start, group_end) =
state.marks.get(ctx.peek_code(req, 1) as usize);
@@ -1125,7 +1127,7 @@ impl MatchContext {
}
fn at_linebreak<S: StrDrive>(&self, req: &Request<'_, S>) -> bool {
!self.at_end(req) && is_linebreak(self.peek_char::<S>())
!self.at_end(req) && unicode_regex::is_linebreak(self.peek_char::<S>())
}
fn at_boundary<S: StrDrive, F: FnMut(u32) -> bool>(
@@ -1192,54 +1194,56 @@ impl MatchContext {
fn at<S: StrDrive>(req: &Request<'_, S>, ctx: &MatchContext, at_code: SreAtCode) -> bool {
match at_code {
SreAtCode::BEGINNING | SreAtCode::BEGINNING_STRING => ctx.at_beginning(),
SreAtCode::BEGINNING_LINE => ctx.at_beginning() || is_linebreak(ctx.back_peek_char::<S>()),
SreAtCode::BOUNDARY => ctx.at_boundary(req, is_word),
SreAtCode::NON_BOUNDARY => ctx.at_non_boundary(req, is_word),
SreAtCode::BEGINNING_LINE => {
ctx.at_beginning() || unicode_regex::is_linebreak(ctx.back_peek_char::<S>())
}
SreAtCode::BOUNDARY => ctx.at_boundary(req, unicode_regex::is_word),
SreAtCode::NON_BOUNDARY => ctx.at_non_boundary(req, unicode_regex::is_word),
SreAtCode::END => {
(ctx.remaining_chars(req) == 1 && ctx.at_linebreak(req)) || ctx.at_end(req)
}
SreAtCode::END_LINE => ctx.at_linebreak(req) || ctx.at_end(req),
SreAtCode::END_STRING => ctx.at_end(req),
SreAtCode::LOC_BOUNDARY => ctx.at_boundary(req, is_loc_word),
SreAtCode::LOC_NON_BOUNDARY => ctx.at_non_boundary(req, is_loc_word),
SreAtCode::UNI_BOUNDARY => ctx.at_boundary(req, is_uni_word),
SreAtCode::UNI_NON_BOUNDARY => ctx.at_non_boundary(req, is_uni_word),
SreAtCode::LOC_BOUNDARY => ctx.at_boundary(req, unicode_regex::is_locale_word),
SreAtCode::LOC_NON_BOUNDARY => ctx.at_non_boundary(req, unicode_regex::is_locale_word),
SreAtCode::UNI_BOUNDARY => ctx.at_boundary(req, unicode_regex::is_unicode_word),
SreAtCode::UNI_NON_BOUNDARY => ctx.at_non_boundary(req, unicode_regex::is_unicode_word),
}
}
fn char_loc_ignore(code: u32, c: u32) -> bool {
code == c || code == lower_locate(c) || code == upper_locate(c)
code == c || code == unicode_regex::lower_locale(c) || code == unicode_regex::upper_locale(c)
}
fn charset_loc_ignore(set: &[u32], c: u32) -> bool {
let lo = lower_locate(c);
let lo = unicode_regex::lower_locale(c);
if charset(set, c) {
return true;
}
let up = upper_locate(c);
let up = unicode_regex::upper_locale(c);
up != lo && charset(set, up)
}
fn category(cat_code: SreCatCode, c: u32) -> bool {
match cat_code {
SreCatCode::DIGIT => is_digit(c),
SreCatCode::NOT_DIGIT => !is_digit(c),
SreCatCode::SPACE => is_space(c),
SreCatCode::NOT_SPACE => !is_space(c),
SreCatCode::WORD => is_word(c),
SreCatCode::NOT_WORD => !is_word(c),
SreCatCode::LINEBREAK => is_linebreak(c),
SreCatCode::NOT_LINEBREAK => !is_linebreak(c),
SreCatCode::LOC_WORD => is_loc_word(c),
SreCatCode::LOC_NOT_WORD => !is_loc_word(c),
SreCatCode::UNI_DIGIT => is_uni_digit(c),
SreCatCode::UNI_NOT_DIGIT => !is_uni_digit(c),
SreCatCode::UNI_SPACE => is_uni_space(c),
SreCatCode::UNI_NOT_SPACE => !is_uni_space(c),
SreCatCode::UNI_WORD => is_uni_word(c),
SreCatCode::UNI_NOT_WORD => !is_uni_word(c),
SreCatCode::UNI_LINEBREAK => is_uni_linebreak(c),
SreCatCode::UNI_NOT_LINEBREAK => !is_uni_linebreak(c),
SreCatCode::DIGIT => unicode_regex::is_digit(c),
SreCatCode::NOT_DIGIT => !unicode_regex::is_digit(c),
SreCatCode::SPACE => unicode_regex::is_space(c),
SreCatCode::NOT_SPACE => !unicode_regex::is_space(c),
SreCatCode::WORD => unicode_regex::is_word(c),
SreCatCode::NOT_WORD => !unicode_regex::is_word(c),
SreCatCode::LINEBREAK => unicode_regex::is_linebreak(c),
SreCatCode::NOT_LINEBREAK => !unicode_regex::is_linebreak(c),
SreCatCode::LOC_WORD => unicode_regex::is_locale_word(c),
SreCatCode::LOC_NOT_WORD => !unicode_regex::is_locale_word(c),
SreCatCode::UNI_DIGIT => unicode_regex::is_unicode_digit(c),
SreCatCode::UNI_NOT_DIGIT => !unicode_regex::is_unicode_digit(c),
SreCatCode::UNI_SPACE => unicode_regex::is_unicode_space(c),
SreCatCode::UNI_NOT_SPACE => !unicode_regex::is_unicode_space(c),
SreCatCode::UNI_WORD => unicode_regex::is_unicode_word(c),
SreCatCode::UNI_NOT_WORD => !unicode_regex::is_unicode_word(c),
SreCatCode::UNI_LINEBREAK => unicode_regex::is_unicode_linebreak(c),
SreCatCode::UNI_NOT_LINEBREAK => !unicode_regex::is_unicode_linebreak(c),
}
}
@@ -1320,7 +1324,7 @@ fn charset(set: &[u32], ch: u32) -> bool {
if set[i + 1] <= ch && ch <= set[i + 2] {
return ok;
}
let ch = upper_unicode(ch);
let ch = unicode_regex::upper_unicode(ch);
if set[i + 1] <= ch && ch <= set[i + 2] {
return ok;
}
@@ -1368,10 +1372,14 @@ fn _count<S: StrDrive>(
general_count_literal(req, ctx, end, |code, c| code != c);
}
SreOpcode::LITERAL_IGNORE => {
general_count_literal(req, ctx, end, |code, c| code == lower_ascii(c));
general_count_literal(req, ctx, end, |code, c| {
code == unicode_regex::lower_ascii(c)
});
}
SreOpcode::NOT_LITERAL_IGNORE => {
general_count_literal(req, ctx, end, |code, c| code != lower_ascii(c));
general_count_literal(req, ctx, end, |code, c| {
code != unicode_regex::lower_ascii(c)
});
}
SreOpcode::LITERAL_LOC_IGNORE => {
general_count_literal(req, ctx, end, char_loc_ignore);
@@ -1380,10 +1388,14 @@ fn _count<S: StrDrive>(
general_count_literal(req, ctx, end, |code, c| !char_loc_ignore(code, c));
}
SreOpcode::LITERAL_UNI_IGNORE => {
general_count_literal(req, ctx, end, |code, c| code == lower_unicode(c));
general_count_literal(req, ctx, end, |code, c| {
code == unicode_regex::lower_unicode(c)
});
}
SreOpcode::NOT_LITERAL_UNI_IGNORE => {
general_count_literal(req, ctx, end, |code, c| code != lower_unicode(c));
general_count_literal(req, ctx, end, |code, c| {
code != unicode_regex::lower_unicode(c)
});
}
_ => {
/* General case */


@@ -331,136 +331,3 @@ const fn utf8_is_cont_byte(byte: u8) -> bool {
/// Mask of the value bits of a continuation byte.
const CONT_MASK: u8 = 0b0011_1111;
const fn is_py_ascii_whitespace(b: u8) -> bool {
matches!(b, b'\t' | b'\n' | b'\x0C' | b'\r' | b' ' | b'\x0B')
}
#[inline]
pub(crate) fn is_word(ch: u32) -> bool {
ch == '_' as u32
|| u8::try_from(ch)
.map(|x| x.is_ascii_alphanumeric())
.unwrap_or(false)
}
#[inline]
pub(crate) fn is_space(ch: u32) -> bool {
u8::try_from(ch)
.map(is_py_ascii_whitespace)
.unwrap_or(false)
}
#[inline]
pub(crate) fn is_digit(ch: u32) -> bool {
u8::try_from(ch)
.map(|x| x.is_ascii_digit())
.unwrap_or(false)
}
#[inline]
pub(crate) fn is_loc_alnum(ch: u32) -> bool {
// FIXME: Ignore the locales
u8::try_from(ch)
.map(|x| x.is_ascii_alphanumeric())
.unwrap_or(false)
}
#[inline]
pub(crate) fn is_loc_word(ch: u32) -> bool {
ch == '_' as u32 || is_loc_alnum(ch)
}
#[inline]
pub(crate) const fn is_linebreak(ch: u32) -> bool {
ch == '\n' as u32
}
#[inline]
pub fn lower_ascii(ch: u32) -> u32 {
u8::try_from(ch)
.map(|x| x.to_ascii_lowercase() as u32)
.unwrap_or(ch)
}
#[inline]
pub(crate) fn lower_locate(ch: u32) -> u32 {
// FIXME: Ignore the locales
lower_ascii(ch)
}
#[inline]
pub(crate) fn upper_locate(ch: u32) -> u32 {
// FIXME: Ignore the locales
u8::try_from(ch)
.map(|x| x.to_ascii_uppercase() as u32)
.unwrap_or(ch)
}
#[inline]
pub(crate) fn is_uni_digit(ch: u32) -> bool {
// TODO: check with cpython
char::try_from(ch)
.map(|x| x.is_ascii_digit())
.unwrap_or(false)
}
#[inline]
pub(crate) fn is_uni_space(ch: u32) -> bool {
// TODO: check with cpython
is_space(ch)
|| matches!(
ch,
0x0009
| 0x000A
| 0x000B
| 0x000C
| 0x000D
| 0x001C
| 0x001D
| 0x001E
| 0x001F
| 0x0020
| 0x0085
| 0x00A0
| 0x1680
| 0x2000
| 0x2001
| 0x2002
| 0x2003
| 0x2004
| 0x2005
| 0x2006
| 0x2007
| 0x2008
| 0x2009
| 0x200A
| 0x2028
| 0x2029
| 0x202F
| 0x205F
| 0x3000
)
}
#[inline]
pub(crate) const fn is_uni_linebreak(ch: u32) -> bool {
matches!(
ch,
0x000A | 0x000B | 0x000C | 0x000D | 0x001C | 0x001D | 0x001E | 0x0085 | 0x2028 | 0x2029
)
}
#[inline]
pub(crate) fn is_uni_alnum(ch: u32) -> bool {
// TODO: check with cpython
char::try_from(ch)
.map(|x| x.is_alphanumeric())
.unwrap_or(false)
}
#[inline]
pub(crate) fn is_uni_word(ch: u32) -> bool {
ch == '_' as u32 || is_uni_alnum(ch)
}
#[inline]
pub fn lower_unicode(ch: u32) -> u32 {
// TODO: check with cpython
char::try_from(ch)
.map(|x| x.to_lowercase().next().unwrap() as u32)
.unwrap_or(ch)
}
#[inline]
pub fn upper_unicode(ch: u32) -> u32 {
// TODO: check with cpython
char::try_from(ch)
.map(|x| x.to_uppercase().next().unwrap() as u32)
.unwrap_or(ch)
}


@@ -28,6 +28,7 @@ flame-it = ["flame"]
[dependencies]
# rustpython crates
rustpython-derive = { workspace = true }
rustpython-unicode = { workspace = true, features = ["casefold"] }
rustpython-vm = { workspace = true, default-features = false, features = ["compiler"]}
rustpython-common = { workspace = true }
@@ -76,16 +77,6 @@ pbkdf2 = { version = "0.12", features = ["hmac"] }
constant_time_eq = { workspace = true }
## unicode stuff
unicode_names2 = { workspace = true }
# update version all at the same time
unic-char-property = { workspace = true }
unic-normal = { workspace = true }
unic-ucd-bidi = { workspace = true }
unic-ucd-category = { workspace = true }
unic-ucd-age = { workspace = true }
ucd = "0.1.1"
unicode-bidi-mirroring = { workspace = true }
# compression
adler32 = "1.2.0"
crc32fast = "1.3.2"
@@ -134,12 +125,12 @@ x509-parser = { version = "0.18", optional = true }
der = { version = "0.7", features = ["alloc", "oid"], optional = true }
pem-rfc7468 = { version = "1.0", features = ["alloc"], optional = true }
webpki-roots = { version = "1.0", optional = true }
aws-lc-rs = { version = "1.16.0", optional = true }
aws-lc-rs = { version = "1.16.2", optional = true }
oid-registry = { version = "0.8", features = ["x509", "pkcs1", "nist_algs"], optional = true }
pkcs8 = { version = "0.10", features = ["encryption", "pkcs5", "pem"], optional = true }
[target.'cfg(not(any(target_os = "android", target_arch = "wasm32")))'.dependencies]
libsqlite3-sys = { version = "0.36", features = ["bundled"], optional = true }
libsqlite3-sys = { version = "0.37", features = ["bundled"], optional = true }
liblzma = "0.4"
liblzma-sys = "0.4"


@@ -1204,7 +1204,7 @@ mod mmap {
// Check if this is a Named mmap - these cannot be resized
if let Some(MmapObj::Named(_)) = mmap_guard.as_ref() {
return Err(vm.new_system_error("mmap: cannot resize a named memory mapping"));
return Err(vm.new_os_error("mmap: cannot resize a named memory mapping"));
}
let is_anonymous = handle == INVALID_HANDLE_VALUE as isize;


@@ -6,55 +6,30 @@
pub(crate) use unicodedata::module_def;
use crate::vm::{
PyObject, PyResult, VirtualMachine, builtins::PyStr, convert::TryFromBorrowedObject,
};
enum NormalizeForm {
Nfc,
Nfkc,
Nfd,
Nfkd,
}
impl<'a> TryFromBorrowedObject<'a> for NormalizeForm {
fn try_from_borrowed_object(vm: &VirtualMachine, obj: &'a PyObject) -> PyResult<Self> {
obj.try_value_with(
|form: &PyStr| match form.as_bytes() {
b"NFC" => Ok(Self::Nfc),
b"NFKC" => Ok(Self::Nfkc),
b"NFD" => Ok(Self::Nfd),
b"NFKD" => Ok(Self::Nfkd),
_ => Err(vm.new_value_error("invalid normalization form")),
},
vm,
)
}
}
#[pymodule]
mod unicodedata {
use super::NormalizeForm::*;
use crate::vm::{
Py, PyObjectRef, PyPayload, PyRef, PyResult, VirtualMachine,
builtins::{PyModule, PyStrRef},
function::OptionalArg,
};
use itertools::Itertools;
use rustpython_common::wtf8::{CodePoint, Wtf8Buf};
use ucd::{Codepoint, DecompositionType, EastAsianWidth, Number, NumericType};
use unic_char_property::EnumeratedCharProperty;
use unic_normal::StrNormalForm;
use unic_ucd_age::{Age, UNICODE_VERSION, UnicodeVersion};
use unic_ucd_bidi::BidiClass;
use unic_ucd_category::GeneralCategory;
use unicode_bidi_mirroring::is_mirroring;
use rustpython_unicode::{NormalizeForm, UNICODE_VERSION, UnicodeVersion, data};
fn parse_normalize_form(form: PyStrRef, vm: &VirtualMachine) -> PyResult<NormalizeForm> {
form.to_str()
.ok_or_else(|| vm.new_value_error("invalid normalization form"))?
.parse()
.map_err(|()| vm.new_value_error("invalid normalization form"))
}
pub(crate) fn module_exec(vm: &VirtualMachine, module: &Py<PyModule>) -> PyResult<()> {
__module_exec(vm, module);
// Add UCD methods as module-level functions
let ucd: PyObjectRef = Ucd::new(UNICODE_VERSION).into_ref(&vm.ctx).into();
let ucd: PyObjectRef = PyUcd::new(data::Ucd::default()).into_ref(&vm.ctx).into();
for attr in [
"category",
@@ -80,56 +55,40 @@ mod unicodedata {
#[pyattr]
#[pyclass(name = "UCD")]
#[derive(Debug, PyPayload)]
pub(super) struct Ucd {
unic_version: UnicodeVersion,
}
pub(super) struct PyUcd(data::Ucd);
impl Ucd {
pub const fn new(unic_version: UnicodeVersion) -> Self {
Self { unic_version }
impl PyUcd {
pub const fn new(ucd: data::Ucd) -> Self {
Self(ucd)
}
fn check_age(&self, c: CodePoint) -> bool {
c.to_char()
.is_none_or(|c| Age::of(c).is_some_and(|age| age.actual() <= self.unic_version))
}
fn extract_char(
&self,
character: PyStrRef,
vm: &VirtualMachine,
) -> PyResult<Option<CodePoint>> {
let c = character
fn extract_char(character: PyStrRef, vm: &VirtualMachine) -> PyResult<CodePoint> {
character
.as_wtf8()
.code_points()
.exactly_one()
.map_err(|_| vm.new_type_error("argument must be an unicode character, not str"))?;
Ok(self.check_age(c).then_some(c))
.map_err(|_| vm.new_type_error("argument must be a Unicode character, not str"))
}
}
#[pyclass(flags(DISALLOW_INSTANTIATION))]
impl Ucd {
impl PyUcd {
#[pymethod]
fn category(&self, character: PyStrRef, vm: &VirtualMachine) -> PyResult<String> {
Ok(self
.extract_char(character, vm)?
.map_or(GeneralCategory::Unassigned, |c| {
c.to_char()
.map_or(GeneralCategory::Surrogate, GeneralCategory::of)
})
.abbr_name()
.0
.category(Self::extract_char(character, vm)?.to_u32())
.to_owned())
}
#[pymethod]
fn lookup(&self, name: PyStrRef, vm: &VirtualMachine) -> PyResult<String> {
if let Some(name_str) = name.to_str()
&& let Some(character) = unicode_names2::character(name_str)
&& self.check_age(character.into())
&& let Some(character) = self.0.lookup(name_str)
{
return Ok(character.to_string());
return Ok(char::from_u32(character)
.expect("unicode_names2 only returns Unicode scalar values")
.to_string());
}
Err(vm.new_key_error(
vm.ctx
@@ -145,13 +104,8 @@ mod unicodedata {
default: OptionalArg<PyObjectRef>,
vm: &VirtualMachine,
) -> PyResult {
let c = self.extract_char(character, vm)?;
if let Some(c) = c
&& self.check_age(c)
&& let Some(name) = c.to_char().and_then(unicode_names2::name)
{
return Ok(vm.ctx.new_str(name.to_string()).into());
if let Some(name) = self.0.name(Self::extract_char(character, vm)?.to_u32()) {
return Ok(vm.ctx.new_str(name).into());
}
default.ok_or_else(|| vm.new_value_error("no such name"))
}
@@ -162,14 +116,9 @@ mod unicodedata {
character: PyStrRef,
vm: &VirtualMachine,
) -> PyResult<&'static str> {
let bidi = match self.extract_char(character, vm)? {
Some(c) => c
.to_char()
.map_or(BidiClass::LeftToRight, BidiClass::of)
.abbr_name(),
None => "",
};
Ok(bidi)
Ok(self
.0
.bidirectional(Self::extract_char(character, vm)?.to_u32()))
}
/// NOTE: This function uses 9.0.0 database instead of 3.2.0
@@ -180,76 +129,51 @@ mod unicodedata {
vm: &VirtualMachine,
) -> PyResult<&'static str> {
Ok(self
.extract_char(character, vm)?
.and_then(|c| c.to_char())
.map_or(EastAsianWidth::Neutral, |c| c.east_asian_width())
.abbr_name())
.0
.east_asian_width(Self::extract_char(character, vm)?.to_u32()))
}
#[pymethod]
fn normalize(&self, form: super::NormalizeForm, unistr: PyStrRef) -> PyResult<Wtf8Buf> {
let text = unistr.as_wtf8();
let normalized_text = match form {
Nfc => text.map_utf8(|s| s.nfc()).collect(),
Nfkc => text.map_utf8(|s| s.nfkc()).collect(),
Nfd => text.map_utf8(|s| s.nfd()).collect(),
Nfkd => text.map_utf8(|s| s.nfkd()).collect(),
};
Ok(normalized_text)
fn normalize(
&self,
form: PyStrRef,
unistr: PyStrRef,
vm: &VirtualMachine,
) -> PyResult<Wtf8Buf> {
Ok(self
.0
.normalize(parse_normalize_form(form, vm)?, unistr.as_wtf8()))
}
#[pymethod]
fn is_normalized(&self, form: super::NormalizeForm, unistr: PyStrRef) -> PyResult<bool> {
let text = unistr.as_wtf8();
let normalized: Wtf8Buf = match form {
Nfc => text.map_utf8(|s| s.nfc()).collect(),
Nfkc => text.map_utf8(|s| s.nfkc()).collect(),
Nfd => text.map_utf8(|s| s.nfd()).collect(),
Nfkd => text.map_utf8(|s| s.nfkd()).collect(),
};
Ok(text == &*normalized)
fn is_normalized(
&self,
form: PyStrRef,
unistr: PyStrRef,
vm: &VirtualMachine,
) -> PyResult<bool> {
Ok(self
.0
.is_normalized(parse_normalize_form(form, vm)?, unistr.as_wtf8()))
}
#[pymethod]
fn mirrored(&self, character: PyStrRef, vm: &VirtualMachine) -> PyResult<i32> {
match self.extract_char(character, vm)? {
Some(c) => {
if let Some(ch) = c.to_char() {
// Check if the character is mirrored in bidirectional text using Unicode standard
Ok(if is_mirroring(ch) { 1 } else { 0 })
} else {
Ok(0)
}
}
None => Ok(0),
}
Ok(self.0.mirrored(Self::extract_char(character, vm)?.to_u32()) as i32)
}
#[pymethod]
fn combining(&self, character: PyStrRef, vm: &VirtualMachine) -> PyResult<i32> {
fn combining(&self, character: PyStrRef, vm: &VirtualMachine) -> PyResult<u8> {
Ok(self
.extract_char(character, vm)?
.and_then(|c| c.to_char())
.map_or(0, |ch| ch.canonical_combining_class() as i32))
.0
.combining(Self::extract_char(character, vm)?.to_u32()))
}
#[pymethod]
fn decomposition(&self, character: PyStrRef, vm: &VirtualMachine) -> PyResult<String> {
let ch = match self.extract_char(character, vm)?.and_then(|c| c.to_char()) {
Some(ch) => ch,
None => return Ok(String::new()),
};
let chars: Vec<char> = ch.decomposition_map().collect();
// If decomposition maps to just the character itself, there's no decomposition
if chars.len() == 1 && chars[0] == ch {
return Ok(String::new());
}
let hex_parts = chars.iter().map(|c| format!("{:04X}", *c as u32)).join(" ");
let tag = match ch.decomposition_type() {
Some(DecompositionType::Canonical) | None => return Ok(hex_parts),
Some(dt) => decomposition_type_tag(dt),
};
Ok(format!("<{tag}> {hex_parts}"))
Ok(self
.0
.decomposition(Self::extract_char(character, vm)?.to_u32()))
}
#[pymethod]
@@ -259,15 +183,8 @@ mod unicodedata {
default: OptionalArg<PyObjectRef>,
vm: &VirtualMachine,
) -> PyResult {
let ch = self.extract_char(character, vm)?.and_then(|c| c.to_char());
if let Some(ch) = ch
&& matches!(
ch.numeric_type(),
Some(NumericType::Decimal) | Some(NumericType::Digit)
)
&& let Some(Number::Integer(n)) = ch.numeric_value()
{
return Ok(vm.ctx.new_int(n).into());
if let Some(value) = self.0.digit(Self::extract_char(character, vm)?.to_u32()) {
return Ok(vm.ctx.new_int(value).into());
}
default.ok_or_else(|| vm.new_value_error("not a digit"))
}
@@ -279,12 +196,8 @@ mod unicodedata {
default: OptionalArg<PyObjectRef>,
vm: &VirtualMachine,
) -> PyResult {
let ch = self.extract_char(character, vm)?.and_then(|c| c.to_char());
if let Some(ch) = ch
&& ch.numeric_type() == Some(NumericType::Decimal)
&& let Some(Number::Integer(n)) = ch.numeric_value()
{
return Ok(vm.ctx.new_int(n).into());
if let Some(value) = self.0.decimal(Self::extract_char(character, vm)?.to_u32()) {
return Ok(vm.ctx.new_int(value).into());
}
default.ok_or_else(|| vm.new_value_error("not a decimal"))
}
@@ -296,75 +209,29 @@ mod unicodedata {
default: OptionalArg<PyObjectRef>,
vm: &VirtualMachine,
) -> PyResult {
let ch = self.extract_char(character, vm)?.and_then(|c| c.to_char());
if let Some(ch) = ch {
match ch.numeric_value() {
Some(Number::Integer(n)) => {
return Ok(vm.ctx.new_float(n as f64).into());
}
Some(Number::Rational(num, den)) => {
return Ok(vm.ctx.new_float(num as f64 / den as f64).into());
}
None => {}
}
if let Some(value) = self.0.numeric(Self::extract_char(character, vm)?.to_u32()) {
let value = match value {
data::NumericValue::Integer(n) => n as f64,
data::NumericValue::Rational(num, den) => num as f64 / den as f64,
};
return Ok(vm.ctx.new_float(value).into());
}
default.ok_or_else(|| vm.new_value_error("not a numeric character"))
}
#[pygetset]
fn unidata_version(&self) -> String {
self.unic_version.to_string()
}
}
fn decomposition_type_tag(dt: DecompositionType) -> &'static str {
match dt {
DecompositionType::Canonical => "canonical",
DecompositionType::Compat => "compat",
DecompositionType::Circle => "circle",
DecompositionType::Final => "final",
DecompositionType::Font => "font",
DecompositionType::Fraction => "fraction",
DecompositionType::Initial => "initial",
DecompositionType::Isolated => "isolated",
DecompositionType::Medial => "medial",
DecompositionType::Narrow => "narrow",
DecompositionType::Nobreak => "noBreak",
DecompositionType::Small => "small",
DecompositionType::Square => "square",
DecompositionType::Sub => "sub",
DecompositionType::Super => "super",
DecompositionType::Vertical => "vertical",
DecompositionType::Wide => "wide",
}
}
trait EastAsianWidthAbbrName {
fn abbr_name(&self) -> &'static str;
}
impl EastAsianWidthAbbrName for EastAsianWidth {
fn abbr_name(&self) -> &'static str {
match self {
Self::Narrow => "Na",
Self::Wide => "W",
Self::Neutral => "N",
Self::Ambiguous => "A",
Self::FullWidth => "F",
Self::HalfWidth => "H",
}
self.0.unicode_version().to_string()
}
}
#[pyattr]
fn ucd_3_2_0(vm: &VirtualMachine) -> PyRef<Ucd> {
Ucd {
unic_version: UnicodeVersion {
major: 3,
minor: 2,
micro: 0,
},
}
fn ucd_3_2_0(vm: &VirtualMachine) -> PyRef<PyUcd> {
PyUcd::new(data::Ucd::new(UnicodeVersion {
major: 3,
minor: 2,
micro: 0,
}))
.into_ref(&vm.ctx)
}
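
Note on the `numeric` binding above: it collapses the crate's `NumericValue` into a Python float. A minimal standalone sketch of that conversion follows. The enum is redeclared locally to mirror `data::NumericValue` from this diff, and the `to_f64` helper name is hypothetical (the diff performs the match inline).

```rust
/// Local copy of the shape of `data::NumericValue` from this diff.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum NumericValue {
    Integer(i64),
    Rational(i64, i64),
}

/// Hypothetical helper: collapse either variant to an `f64`,
/// the same way the `numeric` pymethod does before calling `new_float`.
pub fn to_f64(value: NumericValue) -> f64 {
    match value {
        NumericValue::Integer(n) => n as f64,
        // Rational values are divided as floats, so 1/4 becomes 0.25.
        NumericValue::Rational(num, den) => num as f64 / den as f64,
    }
}
```

Keeping the rational form in the shared crate and converting only at the Python boundary leaves room for callers that need the exact numerator and denominator.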

crates/unicode/Cargo.toml Normal file

@@ -0,0 +1,29 @@
[package]
name = "rustpython-unicode"
description = "Shared Unicode semantics and data for RustPython and related Python tooling."
version.workspace = true
authors.workspace = true
edition.workspace = true
rust-version.workspace = true
repository.workspace = true
license.workspace = true
[features]
default = ["std", "casefold"]
std = []
casefold = ["std", "dep:caseless"]
[dependencies]
rustpython-wtf8 = { workspace = true }
icu_normalizer = { workspace = true }
icu_properties = { workspace = true }
itertools = { workspace = true }
unicode-casing = { workspace = true }
unicode_names2 = { version = "2.0.0", default-features = false, features = ["no_std"] }
unic-ucd-age = { workspace = true }
ucd = "0.1.1"
caseless = { version = "0.2.2", optional = true }
[lints]
workspace = true

crates/unicode/src/case.rs Normal file

@@ -0,0 +1,111 @@
#[cfg(feature = "casefold")]
use alloc::string::String;
#[cfg(feature = "casefold")]
use rustpython_wtf8::Wtf8Chunk;
use rustpython_wtf8::{Wtf8, Wtf8Buf};
use unicode_casing::CharExt;
use crate::char_from_codepoint;
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CaseMapping {
len: u8,
codepoints: [u32; 3],
}
impl CaseMapping {
pub const fn identity(cp: u32) -> Self {
Self {
len: 1,
codepoints: [cp, 0, 0],
}
}
pub const fn first(self) -> Option<u32> {
if self.len == 0 {
None
} else {
Some(self.codepoints[0])
}
}
pub fn iter(self) -> impl Iterator<Item = u32> {
self.codepoints.into_iter().take(usize::from(self.len))
}
}
fn mapping_from_chars(chars: impl Iterator<Item = char>) -> CaseMapping {
let mut codepoints = [0; 3];
let mut len = 0;
for ch in chars.take(codepoints.len()) {
codepoints[len] = ch as u32;
len += 1;
}
CaseMapping {
len: len as u8,
codepoints,
}
}
#[cfg(feature = "casefold")]
fn mapping_from_string(text: String) -> CaseMapping {
mapping_from_chars(text.chars())
}
pub fn to_lowercase(cp: u32) -> CaseMapping {
char_from_codepoint(cp).map_or_else(
|| CaseMapping::identity(cp),
|ch| mapping_from_chars(ch.to_lowercase()),
)
}
pub fn to_uppercase(cp: u32) -> CaseMapping {
char_from_codepoint(cp).map_or_else(
|| CaseMapping::identity(cp),
|ch| mapping_from_chars(ch.to_uppercase()),
)
}
pub fn to_titlecase(cp: u32) -> CaseMapping {
char_from_codepoint(cp).map_or_else(
|| CaseMapping::identity(cp),
|ch| mapping_from_chars(ch.to_titlecase()),
)
}
pub fn to_lowercase_wtf8(text: &Wtf8) -> Wtf8Buf {
text.map_utf8(|s| s.chars().flat_map(char::to_lowercase))
.collect()
}
pub fn to_uppercase_wtf8(text: &Wtf8) -> Wtf8Buf {
text.map_utf8(|s| s.chars().flat_map(char::to_uppercase))
.collect()
}
#[cfg(feature = "casefold")]
pub fn casefold(cp: u32) -> CaseMapping {
char_from_codepoint(cp).map_or_else(
|| CaseMapping::identity(cp),
|ch| {
let mut buf = [0; 4];
mapping_from_string(caseless::default_case_fold_str(ch.encode_utf8(&mut buf)))
},
)
}
#[cfg(feature = "casefold")]
pub fn casefold_str(text: &str) -> String {
caseless::default_case_fold_str(text)
}
#[cfg(feature = "casefold")]
pub fn casefold_wtf8(text: &Wtf8) -> Wtf8Buf {
text.chunks()
.map(|chunk| match chunk {
Wtf8Chunk::Utf8(s) => Wtf8Buf::from_string(casefold_str(s)),
Wtf8Chunk::Surrogate(c) => Wtf8Buf::from(c),
})
.collect()
}
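
Note on `CaseMapping`: a full case mapping can expand one code point into several (for example `ß` uppercases to `SS`), so the struct stores up to three code points inline instead of allocating. A minimal sketch using only the standard library follows. It redeclares a simplified `CaseMapping` locally and uses `char::from_u32` directly where the diff calls `char_from_codepoint`.

```rust
/// Up to three code points stored inline, so no heap allocation is needed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CaseMapping {
    len: u8,
    codepoints: [u32; 3],
}

impl CaseMapping {
    /// Iterate over only the populated slots.
    pub fn iter(self) -> impl Iterator<Item = u32> {
        IntoIterator::into_iter(self.codepoints).take(usize::from(self.len))
    }
}

/// Uppercase a raw code point. Values that are not Unicode scalar values
/// (such as lone surrogates) map to themselves.
pub fn to_uppercase(cp: u32) -> CaseMapping {
    let mut codepoints = [0u32; 3];
    let mut len = 0usize;
    match char::from_u32(cp) {
        Some(ch) => {
            for upper in ch.to_uppercase().take(codepoints.len()) {
                codepoints[len] = upper as u32;
                len += 1;
            }
        }
        None => {
            codepoints[0] = cp;
            len = 1;
        }
    }
    CaseMapping {
        len: len as u8,
        codepoints,
    }
}
```

Taking raw `u32` code points rather than `char` lets callers pass surrogate code points from WTF-8 strings without a separate code path.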


@@ -0,0 +1,67 @@
use icu_properties::props::{BidiClass, EnumeratedProperty, GeneralCategory};
use ucd::{Codepoint, NumericType};
use crate::{char_from_codepoint, is_surrogate};
pub fn general_category(cp: u32) -> GeneralCategory {
if is_surrogate(cp) {
GeneralCategory::Surrogate
} else {
char_from_codepoint(cp).map_or(GeneralCategory::Unassigned, GeneralCategory::for_char)
}
}
pub fn is_alpha(cp: u32) -> bool {
char_from_codepoint(cp).is_some_and(char::is_alphabetic)
}
pub fn is_alnum(cp: u32) -> bool {
char_from_codepoint(cp).is_some_and(char::is_alphanumeric)
}
pub fn is_decimal(cp: u32) -> bool {
matches!(general_category(cp), GeneralCategory::DecimalNumber)
}
pub fn is_digit(cp: u32) -> bool {
char_from_codepoint(cp).is_some_and(|ch| {
matches!(
ch.numeric_type(),
Some(NumericType::Decimal) | Some(NumericType::Digit)
)
})
}
pub fn is_numeric(cp: u32) -> bool {
char_from_codepoint(cp).is_some_and(|ch| ch.numeric_value().is_some())
}
pub fn is_space(cp: u32) -> bool {
char_from_codepoint(cp).is_some_and(|ch| {
matches!(general_category(cp), GeneralCategory::SpaceSeparator)
|| matches!(
BidiClass::for_char(ch),
BidiClass::WhiteSpace | BidiClass::ParagraphSeparator | BidiClass::SegmentSeparator
)
})
}
/// Python's `str.isprintable()` semantics, which treat ASCII space as printable.
pub fn is_printable(cp: u32) -> bool {
cp == '\u{0020}' as u32 || is_repr_printable(cp)
}
/// Repr/escape printable semantics, which exclude all Unicode space separators.
pub fn is_repr_printable(cp: u32) -> bool {
!matches!(
general_category(cp),
GeneralCategory::SpaceSeparator
| GeneralCategory::LineSeparator
| GeneralCategory::ParagraphSeparator
| GeneralCategory::Control
| GeneralCategory::Format
| GeneralCategory::Surrogate
| GeneralCategory::PrivateUse
| GeneralCategory::Unassigned
)
}
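
Note on `general_category`: the surrogate check has to come before the `char` conversion, because `char::from_u32` returns `None` both for surrogates and for values above U+10FFFF. A minimal sketch of that three-way split follows. The `CodePointKind` enum and `classify_codepoint` function are hypothetical names used only for illustration.

```rust
/// Hypothetical three-way classification of a raw `u32` code point.
#[derive(Debug, PartialEq, Eq)]
pub enum CodePointKind {
    Scalar(char),
    Surrogate,
    OutOfRange,
}

pub fn classify_codepoint(cp: u32) -> CodePointKind {
    // Check surrogates first: `char::from_u32` cannot tell them apart
    // from values that lie outside the Unicode range entirely.
    if matches!(cp, 0xD800..=0xDFFF) {
        CodePointKind::Surrogate
    } else {
        match char::from_u32(cp) {
            Some(ch) => CodePointKind::Scalar(ch),
            None => CodePointKind::OutOfRange,
        }
    }
}
```

In the diff, the surrogate case maps to `GeneralCategory::Surrogate` and the out-of-range case to `GeneralCategory::Unassigned`.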

crates/unicode/src/data.rs Normal file

@@ -0,0 +1,230 @@
use alloc::{format, string::String, vec::Vec};
use icu_properties::{
CodePointSetData,
props::{
BidiClass, BidiMirrored, CanonicalCombiningClass, EastAsianWidth, EnumeratedProperty,
NamedEnumeratedProperty,
},
};
use itertools::Itertools;
use ucd::{Codepoint, DecompositionType, Number, NumericType};
use unic_ucd_age::{Age, UNICODE_VERSION, UnicodeVersion};
use crate::{char_from_codepoint, classify, is_surrogate};
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum NumericValue {
Integer(i64),
Rational(i64, i64),
}
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Ucd {
unic_version: UnicodeVersion,
}
impl Default for Ucd {
fn default() -> Self {
Self::new(UNICODE_VERSION)
}
}
impl Ucd {
pub const fn new(unic_version: UnicodeVersion) -> Self {
Self { unic_version }
}
pub const fn unicode_version(&self) -> UnicodeVersion {
self.unic_version
}
pub fn category(&self, cp: u32) -> &'static str {
if self.contains(cp) {
category(cp)
} else {
"Cn"
}
}
pub fn lookup(&self, name: &str) -> Option<u32> {
let cp = lookup(name)?;
self.contains(cp).then_some(cp)
}
pub fn name(&self, cp: u32) -> Option<String> {
self.contains(cp).then(|| name(cp)).flatten()
}
pub fn bidirectional(&self, cp: u32) -> &'static str {
if self.contains(cp) {
bidirectional(cp)
} else {
""
}
}
pub fn east_asian_width(&self, cp: u32) -> &'static str {
if self.contains(cp) {
east_asian_width(cp)
} else {
"N"
}
}
pub fn normalize(
&self,
form: crate::NormalizeForm,
text: &rustpython_wtf8::Wtf8,
) -> rustpython_wtf8::Wtf8Buf {
crate::normalize::normalize(form, text)
}
pub fn is_normalized(&self, form: crate::NormalizeForm, text: &rustpython_wtf8::Wtf8) -> bool {
crate::normalize::is_normalized(form, text)
}
pub fn mirrored(&self, cp: u32) -> bool {
self.contains(cp) && mirrored(cp)
}
pub fn combining(&self, cp: u32) -> u8 {
if self.contains(cp) { combining(cp) } else { 0 }
}
pub fn decomposition(&self, cp: u32) -> String {
if self.contains(cp) {
decomposition(cp)
} else {
String::new()
}
}
pub fn digit(&self, cp: u32) -> Option<u32> {
self.contains(cp).then(|| digit(cp)).flatten()
}
pub fn decimal(&self, cp: u32) -> Option<u32> {
self.contains(cp).then(|| decimal(cp)).flatten()
}
pub fn numeric(&self, cp: u32) -> Option<NumericValue> {
self.contains(cp).then(|| numeric(cp)).flatten()
}
fn contains(&self, cp: u32) -> bool {
is_assigned_in_version(cp, self.unic_version)
}
}
pub fn is_assigned_in_version(cp: u32, version: UnicodeVersion) -> bool {
if is_surrogate(cp) {
true
} else {
char_from_codepoint(cp)
.is_some_and(|ch| Age::of(ch).is_some_and(|age| age.actual() <= version))
}
}
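
Note on version gating: every `Ucd` query first asks whether the code point was assigned by the database version in use, then runs the real lookup. A minimal sketch of both halves follows. The `Version` struct is a local stand-in for `UnicodeVersion` (its field types are an assumption), and `is_assigned_in` and `gated` are hypothetical names.

```rust
/// Local stand-in for `UnicodeVersion`. Deriving `Ord` compares fields
/// in declaration order, which gives major, then minor, then micro.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Version {
    pub major: u16,
    pub minor: u16,
    pub micro: u16,
}

/// A code point counts as assigned if it has an age and that age is
/// not newer than the database version being queried.
pub fn is_assigned_in(age: Option<Version>, database: Version) -> bool {
    age.is_some_and(|age| age <= database)
}

/// Run `query` only when the code point is assigned; otherwise `None`.
/// This is the `contains(cp).then(|| f(cp)).flatten()` shape from the diff.
pub fn gated<T>(assigned: bool, query: impl FnOnce() -> Option<T>) -> Option<T> {
    assigned.then(query).flatten()
}
```

This is what lets `ucd_3_2_0` share the same tables as the default database: newer characters are filtered out rather than stored separately.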
pub fn category(cp: u32) -> &'static str {
classify::general_category(cp).short_name()
}
pub fn lookup(name: &str) -> Option<u32> {
unicode_names2::character(name).map(u32::from)
}
pub fn name(cp: u32) -> Option<String> {
char_from_codepoint(cp)
.and_then(unicode_names2::name)
.map(|name| name.collect())
}
pub fn bidirectional(cp: u32) -> &'static str {
char_from_codepoint(cp)
.map_or(BidiClass::LeftToRight, BidiClass::for_char)
.short_name()
}
pub fn east_asian_width(cp: u32) -> &'static str {
char_from_codepoint(cp)
.map_or(EastAsianWidth::Neutral, EastAsianWidth::for_char)
.short_name()
}
pub fn mirrored(cp: u32) -> bool {
char_from_codepoint(cp).is_some_and(|ch| CodePointSetData::new::<BidiMirrored>().contains(ch))
}
pub fn combining(cp: u32) -> u8 {
char_from_codepoint(cp).map_or(0, |ch| {
CanonicalCombiningClass::for_char(ch).to_icu4c_value()
})
}
pub fn decomposition(cp: u32) -> String {
let ch = match char_from_codepoint(cp) {
Some(ch) => ch,
None => return String::new(),
};
let chars: Vec<char> = ch.decomposition_map().collect();
if chars.len() == 1 && chars[0] == ch {
return String::new();
}
let hex_parts = chars.iter().map(|c| format!("{:04X}", *c as u32)).join(" ");
match ch.decomposition_type() {
Some(DecompositionType::Canonical) | None => hex_parts,
Some(dt) => format!("<{}> {hex_parts}", decomposition_type_tag(dt)),
}
}
pub fn digit(cp: u32) -> Option<u32> {
let ch = char_from_codepoint(cp)?;
if matches!(
ch.numeric_type(),
Some(NumericType::Decimal) | Some(NumericType::Digit)
) && let Some(Number::Integer(value)) = ch.numeric_value()
{
return u32::try_from(value).ok();
}
None
}
pub fn decimal(cp: u32) -> Option<u32> {
let ch = char_from_codepoint(cp)?;
if ch.numeric_type() == Some(NumericType::Decimal)
&& let Some(Number::Integer(value)) = ch.numeric_value()
{
return u32::try_from(value).ok();
}
None
}
pub fn numeric(cp: u32) -> Option<NumericValue> {
match char_from_codepoint(cp)?.numeric_value()? {
Number::Integer(value) => Some(NumericValue::Integer(value)),
Number::Rational(num, den) => Some(NumericValue::Rational(num.into(), den.into())),
}
}
fn decomposition_type_tag(dt: DecompositionType) -> &'static str {
match dt {
DecompositionType::Canonical => "canonical",
DecompositionType::Compat => "compat",
DecompositionType::Circle => "circle",
DecompositionType::Final => "final",
DecompositionType::Font => "font",
DecompositionType::Fraction => "fraction",
DecompositionType::Initial => "initial",
DecompositionType::Isolated => "isolated",
DecompositionType::Medial => "medial",
DecompositionType::Narrow => "narrow",
DecompositionType::Nobreak => "noBreak",
DecompositionType::Small => "small",
DecompositionType::Square => "square",
DecompositionType::Sub => "sub",
DecompositionType::Super => "super",
DecompositionType::Vertical => "vertical",
DecompositionType::Wide => "wide",
}
}
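
Note on the `decomposition` output format: the string is a space-separated list of four-digit (or longer) uppercase hex code points, prefixed with `<tag>` for compatibility decompositions and left bare for canonical ones. A minimal sketch of just the formatting step follows. `format_decomposition` is a hypothetical helper that takes the tag and mapping as plain arguments instead of reading them from the `ucd` crate.

```rust
/// Hypothetical helper: render a decomposition mapping the way
/// `unicodedata.decomposition()` reports it.
pub fn format_decomposition(tag: Option<&str>, mapping: &[u32]) -> String {
    let hex_parts = mapping
        .iter()
        // `{:04X}` pads to at least four digits; longer values are not truncated.
        .map(|cp| format!("{:04X}", cp))
        .collect::<Vec<_>>()
        .join(" ");
    match tag {
        // Compatibility decompositions carry a tag such as "compat" or "noBreak".
        Some(tag) => format!("<{}> {}", tag, hex_parts),
        // Canonical decompositions are reported without a tag.
        None => hex_parts,
    }
}
```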


@@ -0,0 +1,27 @@
use icu_properties::props::{BinaryProperty, XidContinue, XidStart};
use crate::char_from_codepoint;
pub fn is_xid_start(cp: u32) -> bool {
char_from_codepoint(cp).is_some_and(XidStart::for_char)
}
pub fn is_xid_continue(cp: u32) -> bool {
char_from_codepoint(cp).is_some_and(XidContinue::for_char)
}
pub fn is_python_identifier_start(cp: u32) -> bool {
cp == '_' as u32 || is_xid_start(cp)
}
pub fn is_python_identifier_continue(cp: u32) -> bool {
is_xid_continue(cp)
}
pub fn is_python_identifier(text: &str) -> bool {
let mut chars = text.chars();
let is_identifier_start = chars
.next()
.is_some_and(|ch| is_python_identifier_start(ch as u32));
is_identifier_start && chars.all(|ch| is_python_identifier_continue(ch as u32))
}
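
Note on `is_python_identifier`: the check is "first character passes the start predicate (or is `_`), every remaining character passes the continue predicate", and an empty string is rejected. A minimal sketch with the predicates passed in as parameters follows. The ASCII predicates are stand-ins chosen so the example needs no Unicode tables; the diff uses `XidStart` and `XidContinue` from `icu_properties` instead.

```rust
/// Generic form of the identifier check, parameterized over the
/// start and continue predicates.
pub fn is_identifier(
    text: &str,
    is_start: impl Fn(char) -> bool,
    is_continue: impl Fn(char) -> bool,
) -> bool {
    let mut chars = text.chars();
    // An empty string has no first character, so it is not an identifier.
    let starts_ok = chars.next().is_some_and(|ch| ch == '_' || is_start(ch));
    starts_ok && chars.all(is_continue)
}

/// ASCII-only stand-in for XID_Start (an assumption for this sketch).
pub fn ascii_start(ch: char) -> bool {
    ch.is_ascii_alphabetic()
}

/// ASCII-only stand-in for XID_Continue (an assumption for this sketch).
pub fn ascii_continue(ch: char) -> bool {
    ch == '_' || ch.is_ascii_alphanumeric()
}
```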

crates/unicode/src/lib.rs Normal file

@@ -0,0 +1,77 @@
#![cfg_attr(not(feature = "std"), no_std)]
extern crate alloc;
pub mod case;
pub mod classify;
pub mod data;
pub mod identifier;
pub mod normalize;
pub mod regex;
pub use normalize::NormalizeForm;
pub use unic_ucd_age::{UNICODE_VERSION, UnicodeVersion};
use core::char;
pub(crate) fn char_from_codepoint(cp: u32) -> Option<char> {
char::from_u32(cp)
}
pub(crate) const fn is_surrogate(cp: u32) -> bool {
matches!(cp, 0xD800..=0xDFFF)
}
#[cfg(test)]
mod tests {
use alloc::vec::Vec;
use rustpython_wtf8::Wtf8Buf;
use crate::{NormalizeForm, case, classify, data, identifier, normalize, regex};
#[test]
fn printable_and_repr_printable_follow_python_rules() {
assert!(classify::is_printable(' ' as u32));
assert!(!classify::is_repr_printable(' ' as u32));
assert!(!classify::is_printable('\n' as u32));
}
#[test]
fn identifier_and_regex_predicates_share_unicode_tables() {
assert!(identifier::is_python_identifier_start('_' as u32));
assert!(identifier::is_python_identifier("유니코드"));
assert!(regex::is_unicode_word('가' as u32));
assert!(regex::is_unicode_digit('५' as u32));
assert!(regex::is_unicode_space('\u{3000}' as u32));
}
#[test]
fn case_and_normalization_helpers_support_full_mappings() {
let upper: Vec<_> = case::to_uppercase('ß' as u32).iter().collect();
assert_eq!(upper, vec!['S' as u32, 'S' as u32]);
let text = Wtf8Buf::from("e\u{301}");
assert_eq!(
normalize::normalize(NormalizeForm::Nfc, &text),
Wtf8Buf::from("é")
);
assert!(normalize::is_normalized(
NormalizeForm::Nfd,
&normalize::normalize(NormalizeForm::Nfd, &Wtf8Buf::from("é"))
));
}
#[test]
fn unicode_data_queries_match_existing_unicodedata_behavior() {
assert_eq!(data::category('A' as u32), "Lu");
assert_eq!(data::category(0xD800), "Cs");
assert_eq!(data::lookup("SNOWMAN"), Some('☃' as u32));
assert_eq!(data::name('☃' as u32).as_deref(), Some("SNOWMAN"));
assert_eq!(data::decimal('५' as u32), Some(5));
assert_eq!(data::digit('²' as u32), Some(2));
assert_eq!(
data::numeric('⅓' as u32),
Some(data::NumericValue::Rational(1, 3))
);
}
}


@@ -0,0 +1,55 @@
use core::str::FromStr;
use icu_normalizer::{ComposingNormalizerBorrowed, DecomposingNormalizerBorrowed};
use rustpython_wtf8::{Wtf8, Wtf8Buf};
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NormalizeForm {
Nfc,
Nfkc,
Nfd,
Nfkd,
}
impl FromStr for NormalizeForm {
type Err = ();
fn from_str(s: &str) -> Result<Self, Self::Err> {
match s {
"NFC" => Ok(Self::Nfc),
"NFKC" => Ok(Self::Nfkc),
"NFD" => Ok(Self::Nfd),
"NFKD" => Ok(Self::Nfkd),
_ => Err(()),
}
}
}
pub fn normalize(form: NormalizeForm, text: &Wtf8) -> Wtf8Buf {
match form {
NormalizeForm::Nfc => {
let normalizer = ComposingNormalizerBorrowed::new_nfc();
text.map_utf8(|s| normalizer.normalize_iter(s.chars()))
.collect()
}
NormalizeForm::Nfkc => {
let normalizer = ComposingNormalizerBorrowed::new_nfkc();
text.map_utf8(|s| normalizer.normalize_iter(s.chars()))
.collect()
}
NormalizeForm::Nfd => {
let normalizer = DecomposingNormalizerBorrowed::new_nfd();
text.map_utf8(|s| normalizer.normalize_iter(s.chars()))
.collect()
}
NormalizeForm::Nfkd => {
let normalizer = DecomposingNormalizerBorrowed::new_nfkd();
text.map_utf8(|s| normalizer.normalize_iter(s.chars()))
.collect()
}
}
}
pub fn is_normalized(form: NormalizeForm, text: &Wtf8) -> bool {
let normalized = normalize(form, text);
text == &*normalized
}
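
Note on parsing the form name: `NormalizeForm` implements `FromStr` with a unit error, and the Python binding maps that error to a `ValueError`. A minimal standalone sketch of the same shape follows. The enum is redeclared locally, and `parse_form` is a hypothetical wrapper that returns the error message as a plain string instead of a VM exception.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NormalizeForm {
    Nfc,
    Nfkc,
    Nfd,
    Nfkd,
}

impl FromStr for NormalizeForm {
    type Err = ();

    // Matching is exact and case-sensitive, so "nfc" is rejected.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "NFC" => Ok(Self::Nfc),
            "NFKC" => Ok(Self::Nfkc),
            "NFD" => Ok(Self::Nfd),
            "NFKD" => Ok(Self::Nfkd),
            _ => Err(()),
        }
    }
}

/// Hypothetical wrapper: attach the message the binding reports.
pub fn parse_form(name: &str) -> Result<NormalizeForm, &'static str> {
    name.parse().map_err(|()| "invalid normalization form")
}
```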


@@ -0,0 +1,87 @@
use crate::{case, classify};
const UNDERSCORE: u32 = '_' as u32;
const fn is_py_ascii_whitespace(byte: u8) -> bool {
matches!(byte, b'\t' | b'\n' | b'\x0C' | b'\r' | b' ' | b'\x0B')
}
pub fn is_word(cp: u32) -> bool {
cp == UNDERSCORE
|| u8::try_from(cp)
.map(|byte| byte.is_ascii_alphanumeric())
.unwrap_or(false)
}
pub fn is_space(cp: u32) -> bool {
u8::try_from(cp)
.map(is_py_ascii_whitespace)
.unwrap_or(false)
}
pub fn is_digit(cp: u32) -> bool {
u8::try_from(cp)
.map(|byte| byte.is_ascii_digit())
.unwrap_or(false)
}
pub fn is_locale_alnum(cp: u32) -> bool {
u8::try_from(cp)
.map(|byte| byte.is_ascii_alphanumeric())
.unwrap_or(false)
}
pub fn is_locale_word(cp: u32) -> bool {
cp == UNDERSCORE || is_locale_alnum(cp)
}
pub const fn is_linebreak(cp: u32) -> bool {
cp == '\n' as u32
}
pub fn lower_ascii(cp: u32) -> u32 {
u8::try_from(cp)
.map(|byte| byte.to_ascii_lowercase() as u32)
.unwrap_or(cp)
}
pub fn lower_locale(cp: u32) -> u32 {
lower_ascii(cp)
}
pub fn upper_locale(cp: u32) -> u32 {
u8::try_from(cp)
.map(|byte| byte.to_ascii_uppercase() as u32)
.unwrap_or(cp)
}
pub fn is_unicode_digit(cp: u32) -> bool {
classify::is_decimal(cp)
}
pub fn is_unicode_space(cp: u32) -> bool {
classify::is_space(cp)
}
pub const fn is_unicode_linebreak(cp: u32) -> bool {
matches!(
cp,
0x000A | 0x000B | 0x000C | 0x000D | 0x001C | 0x001D | 0x001E | 0x0085 | 0x2028 | 0x2029
)
}
pub fn is_unicode_alnum(cp: u32) -> bool {
classify::is_alnum(cp)
}
pub fn is_unicode_word(cp: u32) -> bool {
cp == UNDERSCORE || is_unicode_alnum(cp)
}
pub fn lower_unicode(cp: u32) -> u32 {
case::to_lowercase(cp).first().unwrap_or(cp)
}
pub fn upper_unicode(cp: u32) -> u32 {
case::to_uppercase(cp).first().unwrap_or(cp)
}
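
Note on the ASCII predicates: they go through `u8::try_from(cp)` and then an `is_ascii_*` method. The conversion succeeds for every value up to 255, so it is the `is_ascii_*` call that rejects Latin-1 code points such as U+00E9. A minimal sketch of the pattern follows, with function names chosen for this example.

```rust
// `TryFrom` is only in the prelude from edition 2021 onward.
use std::convert::TryFrom;

const UNDERSCORE: u32 = '_' as u32;

/// ASCII-only word check: underscore, ASCII letters, ASCII digits.
pub fn is_ascii_word(cp: u32) -> bool {
    cp == UNDERSCORE
        || u8::try_from(cp)
            .map(|byte| byte.is_ascii_alphanumeric())
            .unwrap_or(false)
}

/// ASCII-only lowercasing: anything outside A-Z is returned unchanged.
pub fn lower_ascii(cp: u32) -> u32 {
    u8::try_from(cp)
        .map(|byte| u32::from(byte.to_ascii_lowercase()))
        .unwrap_or(cp)
}
```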


@@ -41,6 +41,7 @@ ruff_text_size = { workspace = true, optional = true }
rustpython-compiler-core = { workspace = true }
rustpython-literal = { workspace = true }
rustpython-sre_engine = { workspace = true }
rustpython-unicode = { workspace = true, features = ["casefold"] }
ascii = { workspace = true }
ahash = { workspace = true }
@@ -74,7 +75,6 @@ strum_macros = { workspace = true }
thiserror = { workspace = true }
memchr = { workspace = true }
caseless = "0.2.2"
flamer = { version = "0.5", optional = true }
half = "2"
psm = "0.1"
@@ -86,10 +86,6 @@ timsort = "0.1.2"
# TODO: use unic for this; needed for title case:
# https://github.com/RustPython/RustPython/pull/832#discussion_r275428939
unicode-casing = { workspace = true }
# update version all at the same time
unic-ucd-bidi = { workspace = true }
unic-ucd-category = { workspace = true }
unic-ucd-ident = { workspace = true }
[target.'cfg(unix)'.dependencies]
rustix = { workspace = true }
@@ -128,6 +124,7 @@ features = [
"Win32_System_Environment",
"Win32_System_IO",
"Win32_System_Ioctl",
"Win32_System_JobObjects",
"Win32_System_Kernel",
"Win32_System_LibraryLoader",
"Win32_System_Memory",


@@ -396,6 +396,10 @@ pub trait AnyStr {
fn py_zfill(&self, width: isize) -> Vec<u8> {
let width = width.to_usize().unwrap_or(0);
let char_len = self.elements().count();
let width = self
.bytes_len()
.saturating_add(width.saturating_sub(char_len));
rustpython_common::str::zfill(self.as_bytes(), width)
}
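
Note on the `py_zfill` fix: the requested width counts characters, but the underlying helper pads to a byte length, so the byte target is the string's byte length plus the number of missing characters. A minimal sketch of that arithmetic on a `&str` follows. `zfill_byte_width` and `zfill` are hypothetical names and are not `rustpython_common::str::zfill`.

```rust
/// Byte length the padded result must have so that it contains
/// `width` characters (or the original length, if already long enough).
pub fn zfill_byte_width(text: &str, width: usize) -> usize {
    let char_len = text.chars().count();
    text.len().saturating_add(width.saturating_sub(char_len))
}

/// Simplified zero-fill for `str`: pad with ASCII zeros after an
/// optional leading sign until the string has `width` characters.
pub fn zfill(text: &str, width: usize) -> String {
    let pad = width.saturating_sub(text.chars().count());
    // '+' and '-' are one byte each, so splitting at byte 1 is safe here.
    let (sign, rest) = if text.starts_with('+') || text.starts_with('-') {
        text.split_at(1)
    } else {
        ("", text)
    };
    format!("{}{}{}", sign, "0".repeat(pad), rest)
}
```

For `"é"` (one character, two bytes) and a width of 3, two zeros are needed, so the byte target is 4 rather than 3.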


@@ -215,6 +215,15 @@ impl PyByteArray {
size_of::<Self>() + self.borrow_buf().len() * size_of::<u8>()
}
#[pyslot]
fn slot_str(zelf: &PyObject, vm: &VirtualMachine) -> PyResult<PyStrRef> {
let zelf = zelf.downcast_ref::<Self>().expect("expected bytearray");
PyBytesInner::warn_on_str("str() on a bytearray instance", vm)?;
let class_name = zelf.class().name();
let repr = zelf.inner().repr_with_name(&class_name, vm)?;
Ok(vm.ctx.new_str(repr))
}
fn __add__(&self, other: ArgBytesLike) -> Self {
self.inner().add(&other.borrow_buf()).into()
}


@@ -224,6 +224,13 @@ impl PyBytes {
size_of::<Self>() + self.len() * size_of::<u8>()
}
#[pyslot]
fn slot_str(zelf: &PyObject, vm: &VirtualMachine) -> PyResult<PyStrRef> {
let zelf = zelf.downcast_ref::<Self>().expect("expected bytes");
PyBytesInner::warn_on_str("str() on a bytes instance", vm)?;
Ok(vm.ctx.new_str(zelf.inner.repr_bytes(vm)?))
}
fn __add__(&self, other: ArgBytesLike) -> Vec<u8> {
self.inner.add(&other.borrow_buf())
}


@@ -194,6 +194,12 @@ impl From<Literal> for PyObjectRef {
}
}
impl From<PyObjectRef> for Literal {
fn from(obj: PyObjectRef) -> Self {
Literal(obj)
}
}
fn borrow_obj_constant(obj: &PyObject) -> BorrowedConstant<'_, Literal> {
match_class!(match obj {
ref i @ super::int::PyInt => {
@@ -226,6 +232,23 @@ fn borrow_obj_constant(obj: &PyObject) -> BorrowedConstant<'_, Literal> {
}
super::singletons::PyNone => BorrowedConstant::None,
super::slice::PyEllipsis => BorrowedConstant::Ellipsis,
ref s @ super::slice::PySlice => {
// Constant pool slices always store Some() for start/step (even for None).
// Box::leak the array so it outlives the borrow. Leak is acceptable since
// constant pool objects live for the program's lifetime.
let start = s.start.clone().unwrap();
let stop = s.stop.clone();
let step = s.step.clone().unwrap();
let arr = Box::leak(Box::new([Literal(start), Literal(stop), Literal(step)]));
BorrowedConstant::Slice { elements: arr }
}
ref fs @ super::set::PyFrozenSet => {
// Box::leak the elements so they outlive the borrow. Leak is acceptable since
// constant pool objects live for the program's lifetime.
let elems: Vec<Literal> = fs.elements().into_iter().map(Literal).collect();
let elements = Box::leak(elems.into_boxed_slice());
BorrowedConstant::Frozenset { elements }
}
_ => panic!("unexpected payload for constant python value"),
})
}
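
Note on the `Box::leak` calls above: leaking a boxed slice is a way to obtain a reference that outlives the current borrow, at the cost of never freeing the allocation. A minimal sketch of the technique in isolation follows. `leak_elements` is a hypothetical helper; the diff leaks inline and justifies it by the constant pool living for the whole program.

```rust
/// Hypothetical helper: turn an owned vector into a `'static` slice.
/// The memory is intentionally never freed.
pub fn leak_elements<T: 'static>(elements: Vec<T>) -> &'static [T] {
    // `into_boxed_slice` drops spare capacity; `Box::leak` then hands
    // back a reference that is valid for the rest of the program.
    Box::leak(elements.into_boxed_slice())
}
```

Each call leaks a fresh allocation, so this only stays bounded if the function runs a bounded number of times per constant.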
@@ -277,6 +300,30 @@ impl ConstantBag for PyObjBag<'_> {
.collect();
ctx.new_tuple(elements).into()
}
BorrowedConstant::Slice { elements } => {
let [start, stop, step] = elements;
let start_obj = self.make_constant(start.borrow_constant()).0;
let stop_obj = self.make_constant(stop.borrow_constant()).0;
let step_obj = self.make_constant(step.borrow_constant()).0;
// Store as PySlice with Some() for all fields (even None values)
// so borrow_obj_constant can reference them.
use crate::builtins::PySlice;
PySlice {
start: Some(start_obj),
stop: stop_obj,
step: Some(step_obj),
}
.into_ref(ctx)
.into()
}
BorrowedConstant::Frozenset { elements: _ } => {
// Creating a frozenset requires VirtualMachine for element hashing.
// PyObjBag only has Context, so we cannot construct PyFrozenSet here.
// Frozenset constants from .pyc are handled by PyMarshalBag which has VM access.
unimplemented!(
"frozenset constant in PyObjBag::make_constant requires VirtualMachine"
)
}
BorrowedConstant::None => ctx.none(),
BorrowedConstant::Ellipsis => ctx.ellipsis.clone().into(),
};
@@ -633,6 +680,38 @@ impl Constructor for PyCode {
)],
> = vec![(loc, loc); instructions.len()].into_boxed_slice();
// Build localspluskinds with cell-local merging
let localspluskinds = {
use rustpython_compiler_core::bytecode::*;
let nlocals = varnames.len();
let ncells = cellvars.len();
let nfrees = freevars.len();
let numdropped = cellvars
.iter()
.filter(|cv| varnames.iter().any(|v| *v == **cv))
.count();
let nlocalsplus = nlocals + ncells - numdropped + nfrees;
let mut kinds = vec![0u8; nlocalsplus];
for kind in kinds.iter_mut().take(nlocals) {
*kind = CO_FAST_LOCAL;
}
let mut cell_numdropped = 0usize;
for (i, cv) in cellvars.iter().enumerate() {
let merged_idx = varnames.iter().position(|v| **v == **cv);
if let Some(local_idx) = merged_idx {
kinds[local_idx] |= CO_FAST_CELL;
cell_numdropped += 1;
} else {
kinds[nlocals + i - cell_numdropped] = CO_FAST_CELL;
}
}
let free_start = nlocals + ncells - numdropped;
for i in 0..nfrees {
kinds[free_start + i] = CO_FAST_FREE;
}
kinds.into_boxed_slice()
};
// Build the CodeObject
let code = CodeObject {
instructions,
@@ -650,12 +729,12 @@ impl Constructor for PyCode {
max_stackdepth: args.stacksize,
obj_name: vm.ctx.intern_str(args.name.as_wtf8()),
qualname: vm.ctx.intern_str(args.qualname.as_wtf8()),
cell2arg: None, // TODO: reuse `fn cell2arg`
constants,
names,
varnames,
cellvars,
freevars,
localspluskinds,
linetable: args.linetable.as_bytes().to_vec().into_boxed_slice(),
exceptiontable: args.exceptiontable.as_bytes().to_vec().into_boxed_slice(),
};
@@ -1237,7 +1316,7 @@ impl PyCode {
.collect(),
cellvars,
freevars,
cell2arg: self.code.cell2arg.clone(),
localspluskinds: self.code.localspluskinds.clone(),
linetable,
exceptiontable,
};
@@ -1252,22 +1331,34 @@ impl PyCode {
let idx = usize::try_from(opcode).map_err(|_| idx_err(vm))?;
let varnames_len = self.code.varnames.len();
let cellvars_len = self.code.cellvars.len();
// Non-parameter cells: cellvars that are NOT also in varnames
let nonparam_cellvars: Vec<_> = self
.code
.cellvars
.iter()
.filter(|s| {
let s_str: &str = s.as_ref();
!self.code.varnames.iter().any(|v| {
let v_str: &str = v.as_ref();
v_str == s_str
})
})
.collect();
let nonparam_len = nonparam_cellvars.len();
let name = if idx < varnames_len {
// Index in varnames
// Index in varnames (includes parameter cells)
self.code.varnames.get(idx).ok_or_else(|| idx_err(vm))?
} else if idx < varnames_len + cellvars_len {
// Index in cellvars
self.code
.cellvars
} else if idx < varnames_len + nonparam_len {
// Index in non-parameter cellvars
*nonparam_cellvars
.get(idx - varnames_len)
.ok_or_else(|| idx_err(vm))?
} else {
// Index in freevars
self.code
.freevars
.get(idx - varnames_len - cellvars_len)
.get(idx - varnames_len - nonparam_len)
.ok_or_else(|| idx_err(vm))?
};
Ok(name.to_object())
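
The index-to-name lookup in the hunk above can be sketched without the VM types. The following is a minimal illustration that assumes `&str` names; the function name `name_at` is hypothetical, and out-of-range indices are modelled as `None` rather than as a Python exception.

```rust
/// Hypothetical helper: resolve a localsplus index to a variable name.
/// Cells that also appear in `varnames` have no slot of their own, so only
/// the remaining ("non-parameter") cells sit between locals and free vars.
fn name_at<'a>(
    varnames: &[&'a str],
    cellvars: &[&'a str],
    freevars: &[&'a str],
    idx: usize,
) -> Option<&'a str> {
    let nonparam: Vec<&'a str> = cellvars
        .iter()
        .copied()
        .filter(|cv| !varnames.contains(cv))
        .collect();

    if idx < varnames.len() {
        // Plain locals and merged parameter cells.
        varnames.get(idx).copied()
    } else if idx < varnames.len() + nonparam.len() {
        nonparam.get(idx - varnames.len()).copied()
    } else {
        freevars
            .get(idx - varnames.len() - nonparam.len())
            .copied()
    }
}

fn main() {
    let (vars, cells, frees) = (["self", "x"], ["x", "y"], ["__class__"]);
    for idx in 0..5 {
        println!("{idx}: {:?}", name_at(&vars, &cells, &frees, idx));
    }
}
```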

@@ -64,7 +64,7 @@ pub struct PyFunction {
code: PyAtomicRef<PyCode>,
globals: PyDictRef,
builtins: PyObjectRef,
closure: Option<PyRef<PyTuple<PyCellRef>>>,
pub(crate) closure: Option<PyRef<PyTuple<PyCellRef>>>,
defaults_and_kwdefaults: PyMutex<(Option<PyTupleRef>, Option<PyDictRef>)>,
name: PyMutex<PyStrRef>,
qualname: PyMutex<PyStrRef>,
@@ -443,13 +443,6 @@ impl PyFunction {
}
}
if let Some(cell2arg) = code.cell2arg.as_deref() {
for (cell_idx, arg_idx) in cell2arg.iter().enumerate().filter(|(_, i)| **i != -1) {
let x = fastlocals[*arg_idx as usize].take();
frame.set_cell_contents(cell_idx, x);
}
}
Ok(())
}
@@ -725,14 +718,6 @@ impl Py<PyFunction> {
}
}
if let Some(cell2arg) = code.cell2arg.as_deref() {
let fastlocals = unsafe { frame.fastlocals_mut() };
for (cell_idx, arg_idx) in cell2arg.iter().enumerate().filter(|(_, i)| **i != -1) {
let x = fastlocals[*arg_idx as usize].take();
frame.set_cell_contents(cell_idx, x);
}
}
frame
}
@@ -780,11 +765,7 @@ pub(crate) fn datastack_frame_size_bytes_for_code(code: &Py<PyCode>) -> Option<u
{
return None;
}
let nlocalsplus = code
.varnames
.len()
.checked_add(code.cellvars.len())?
.checked_add(code.freevars.len())?;
let nlocalsplus = code.localspluskinds.len();
let capacity = nlocalsplus.checked_add(code.max_stackdepth as usize)?;
capacity.checked_mul(core::mem::size_of::<usize>())
}
@@ -1216,6 +1197,17 @@ impl GetAttr for PyBoundMethod {
}
}
impl GetDescriptor for PyBoundMethod {
fn descr_get(
zelf: PyObjectRef,
_obj: Option<PyObjectRef>,
_cls: Option<PyObjectRef>,
_vm: &VirtualMachine,
) -> PyResult {
Ok(zelf)
}
}
#[derive(FromArgs)]
pub struct PyBoundMethodNewArgs {
#[pyarg(positional)]
@@ -1230,8 +1222,14 @@ impl Constructor for PyBoundMethod {
fn py_new(
_cls: &Py<PyType>,
Self::Args { function, object }: Self::Args,
_vm: &VirtualMachine,
vm: &VirtualMachine,
) -> PyResult<Self> {
if !function.is_callable() {
return Err(vm.new_type_error("first argument must be callable".to_owned()));
}
if vm.is_none(&object) {
return Err(vm.new_type_error("instance must not be None".to_owned()));
}
Ok(Self::new(object, function))
}
}
@@ -1258,7 +1256,15 @@ impl PyBoundMethod {
}
#[pyclass(
with(Callable, Comparable, Hashable, GetAttr, Constructor, Representable),
with(
Callable,
Comparable,
Hashable,
GetAttr,
GetDescriptor,
Constructor,
Representable
),
flags(IMMUTABLETYPE, HAS_WEAKREF)
)]
impl PyBoundMethod {
@@ -1266,11 +1272,11 @@ impl PyBoundMethod {
fn __reduce__(
&self,
vm: &VirtualMachine,
) -> (Option<PyObjectRef>, (PyObjectRef, Option<PyObjectRef>)) {
let builtins_getattr = vm.builtins.get_attr("getattr", vm).ok();
) -> PyResult<(PyObjectRef, (PyObjectRef, PyObjectRef))> {
let builtins_getattr = vm.builtins.get_attr("getattr", vm)?;
let func_self = self.object.clone();
let func_name = self.function.get_attr("__name__", vm).ok();
(builtins_getattr, (func_self, func_name))
let func_name = self.function.get_attr("__name__", vm)?;
Ok((builtins_getattr, (func_self, func_name)))
}
#[pygetset]

@@ -41,11 +41,9 @@ use rustpython_common::{
hash,
lock::PyMutex,
str::DeduceStrKind,
wtf8::{CodePoint, Wtf8, Wtf8Buf, Wtf8Chunk, Wtf8Concat},
wtf8::{CodePoint, Wtf8, Wtf8Buf, Wtf8Concat},
};
use unic_ucd_bidi::BidiClass;
use unic_ucd_category::GeneralCategory;
use unic_ucd_ident::{is_xid_continue, is_xid_start};
use unicode_casing::CharExt;
impl<'a> TryFromBorrowedObject<'a> for String {
@@ -697,7 +695,7 @@ impl PyStr {
match self.as_str_kind() {
PyKindStr::Ascii(s) => s.to_ascii_lowercase().into(),
PyKindStr::Utf8(s) => s.to_lowercase().into(),
PyKindStr::Wtf8(w) => w.to_lowercase().into(),
PyKindStr::Wtf8(w) => rustpython_unicode::case::to_lowercase_wtf8(w).into(),
}
}
@@ -705,16 +703,9 @@ impl PyStr {
#[pymethod]
fn casefold(&self) -> Self {
match self.as_str_kind() {
PyKindStr::Ascii(s) => caseless::default_case_fold_str(s.as_str()).into(),
PyKindStr::Utf8(s) => caseless::default_case_fold_str(s).into(),
PyKindStr::Wtf8(w) => w
.chunks()
.map(|c| match c {
Wtf8Chunk::Utf8(s) => Wtf8Buf::from_string(caseless::default_case_fold_str(s)),
Wtf8Chunk::Surrogate(c) => Wtf8Buf::from(c),
})
.collect::<Wtf8Buf>()
.into(),
PyKindStr::Ascii(s) => rustpython_unicode::case::casefold_str(s.as_str()).into(),
PyKindStr::Utf8(s) => rustpython_unicode::case::casefold_str(s).into(),
PyKindStr::Wtf8(w) => rustpython_unicode::case::casefold_wtf8(w).into(),
}
}
@@ -723,7 +714,7 @@ impl PyStr {
match self.as_str_kind() {
PyKindStr::Ascii(s) => s.to_ascii_uppercase().into(),
PyKindStr::Utf8(s) => s.to_uppercase().into(),
PyKindStr::Wtf8(w) => w.to_uppercase().into(),
PyKindStr::Wtf8(w) => rustpython_unicode::case::to_uppercase_wtf8(w).into(),
}
}
@@ -966,7 +957,7 @@ impl PyStr {
#[pymethod]
fn isdecimal(&self) -> bool {
!self.data.is_empty()
&& self.char_all(|c| GeneralCategory::of(c) == GeneralCategory::DecimalNumber)
&& self.char_all(|c| rustpython_unicode::classify::is_decimal(c as u32))
}
fn __mod__(&self, values: PyObjectRef, vm: &VirtualMachine) -> PyResult<Wtf8Buf> {
@@ -1086,17 +1077,12 @@ impl PyStr {
#[pymethod]
fn isprintable(&self) -> bool {
self.char_all(|c| c == '\u{0020}' || rustpython_literal::char::is_printable(c))
self.char_all(|c| rustpython_unicode::classify::is_printable(c as u32))
}
#[pymethod]
fn isspace(&self) -> bool {
use unic_ucd_bidi::bidi_class::abbr_names::*;
!self.data.is_empty()
&& self.char_all(|c| {
GeneralCategory::of(c) == GeneralCategory::SpaceSeparator
|| matches!(BidiClass::of(c), WS | B | S)
})
!self.data.is_empty() && self.char_all(|c| rustpython_unicode::classify::is_space(c as u32))
}
// Return true if all cased characters in the string are lowercase and there is at least one cased character, false otherwise.
@@ -1353,11 +1339,8 @@ impl PyStr {
#[pymethod]
pub fn isidentifier(&self) -> bool {
let Some(s) = self.to_str() else { return false };
let mut chars = s.chars();
let is_identifier_start = chars.next().is_some_and(|c| c == '_' || is_xid_start(c));
// a string is not an identifier if it has whitespace or starts with a number
is_identifier_start && chars.all(is_xid_continue)
self.to_str()
.is_some_and(rustpython_unicode::identifier::is_python_identifier)
}
// https://docs.python.org/3/library/stdtypes.html#str.translate

@@ -7,6 +7,7 @@ See also [CPython source code.](https://github.com/python/cpython/blob/50b48572d
use super::{PyStr, PyType, PyTypeRef};
use crate::{
AsObject, Context, Py, PyObjectRef, PyPayload, PyRef, PyResult, VirtualMachine,
builtins::function::PyCell,
class::PyClassImpl,
common::lock::PyRwLock,
function::{FuncArgs, IntoFuncArgs, OptionalArg},
@@ -86,27 +87,33 @@ impl Initializer for PySuper {
return Err(vm.new_runtime_error("super(): no arguments"));
}
// SAFETY: Frame is current and not concurrently mutated.
use rustpython_compiler_core::bytecode::CO_FAST_CELL;
let obj = unsafe { frame.fastlocals() }[0]
.clone()
.or_else(|| {
if let Some(cell2arg) = frame.code.cell2arg.as_deref() {
cell2arg[..frame.code.cellvars.len()]
.iter()
.enumerate()
.find(|(_, arg_idx)| **arg_idx == 0)
.and_then(|(cell_idx, _)| frame.get_cell_contents(cell_idx))
.and_then(|val| {
// If slot 0 is a merged cell (LOCAL|CELL), extract value from cell
if frame
.code
.localspluskinds
.first()
.is_some_and(|&k| k & CO_FAST_CELL != 0)
{
val.downcast_ref::<PyCell>().and_then(|c| c.get())
} else {
None
Some(val)
}
})
.ok_or_else(|| vm.new_runtime_error("super(): arg[0] deleted"))?;
let mut typ = None;
// Search for __class__ in freevars using localspluskinds
let nlocalsplus = frame.code.localspluskinds.len();
let nfrees = frame.code.freevars.len();
let free_start = nlocalsplus - nfrees;
for (i, var) in frame.code.freevars.iter().enumerate() {
if var.as_bytes() == b"__class__" {
let i = frame.code.cellvars.len() + i;
let class = frame
.get_cell_contents(i)
.get_cell_contents(free_start + i)
.ok_or_else(|| vm.new_runtime_error("super(): empty __class__ cell"))?;
typ = Some(class.downcast().map_err(|o| {
vm.new_type_error(format!(

@@ -1868,14 +1868,16 @@ impl Constructor for PyType {
};
let qualname = dict
.pop_item(identifier!(vm, __qualname__).as_object(), vm)?
.get_item_opt(identifier!(vm, __qualname__), vm)?
.map(|obj| downcast_qualname(obj, vm))
.transpose()?
.unwrap_or_else(|| {
// If __qualname__ is not provided, we can use the name as default
name.clone().into_wtf8()
});
let mut attributes = dict.to_attributes(vm);
attributes.shift_remove(identifier!(vm, __qualname__));
// Check __doc__ for surrogates - raises UnicodeEncodeError during type creation
if let Some(doc) = attributes.get(identifier!(vm, __doc__))
@@ -2133,15 +2135,29 @@ impl Constructor for PyType {
}
}
if let Some(cell) = typ.attributes.write().get(identifier!(vm, __classcell__)) {
let cell = PyCellRef::try_from_object(vm, cell.clone()).map_err(|_| {
vm.new_type_error(format!(
"__classcell__ must be a nonlocal cell, not {}",
cell.class().name()
))
})?;
cell.set(Some(typ.clone().into()));
};
{
let mut attrs = typ.attributes.write();
if let Some(cell) = attrs.get(identifier!(vm, __classcell__)) {
let cell = PyCellRef::try_from_object(vm, cell.clone()).map_err(|_| {
vm.new_type_error(format!(
"__classcell__ must be a nonlocal cell, not {}",
cell.class().name()
))
})?;
cell.set(Some(typ.clone().into()));
attrs.shift_remove(identifier!(vm, __classcell__));
}
if let Some(cell) = attrs.get(identifier!(vm, __classdictcell__)) {
let cell = PyCellRef::try_from_object(vm, cell.clone()).map_err(|_| {
vm.new_type_error(format!(
"__classdictcell__ must be a nonlocal cell, not {}",
cell.class().name()
))
})?;
cell.set(Some(dict.clone().into()));
attrs.shift_remove(identifier!(vm, __classdictcell__));
}
}
// All *classes* should have a dict. Exceptions are *instances* of
// classes that define __slots__ and instances of built-in classes

@@ -237,6 +237,18 @@ impl PyBytesInner {
vm.new_overflow_error("bytes object is too large to make repr")
}
pub(crate) fn warn_on_str(message: &'static str, vm: &VirtualMachine) -> PyResult<()> {
if vm.state.config.settings.bytes_warning > 0 {
crate::stdlib::_warnings::warn(
vm.ctx.exceptions.bytes_warning,
message.to_owned(),
1,
vm,
)?;
}
Ok(())
}
pub fn repr_with_name(&self, class_name: &str, vm: &VirtualMachine) -> PyResult<String> {
const DECORATION_LEN: isize = 2 + 3; // 2 for (), 3 for b"" => bytearray(b"")
let escape = crate::literal::escape::AsciiEscape::new_repr(&self.elements);

@@ -2397,6 +2397,22 @@ pub(super) mod types {
.downcast::<crate::builtins::PyTuple>()
{
let location_tup_len = location_tuple.len();
match location_tup_len {
4 | 6 => {}
5 => {
return Err(vm.new_type_error(
"end_offset must be provided when end_lineno is provided".to_owned(),
));
}
_ => {
return Err(vm.new_type_error(format!(
"function takes exactly 4 or 6 arguments ({} given)",
location_tup_len
)));
}
}
for (i, &attr) in [
"filename",
"lineno",

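The length check added to the location-tuple handling above accepts 4 or 6 entries and distinguishes the 5-entry case with its own message. A minimal sketch of that rule, assuming a plain `Result<(), String>` in place of the VM's exception type (the function name `check_location_len` is hypothetical):

```rust
/// Hypothetical standalone version of the location tuple length check.
fn check_location_len(len: usize) -> Result<(), String> {
    match len {
        4 | 6 => Ok(()),
        // end_lineno without end_offset
        5 => Err("end_offset must be provided when end_lineno is provided".to_owned()),
        n => Err(format!(
            "function takes exactly 4 or 6 arguments ({n} given)"
        )),
    }
}

fn main() {
    for len in [3, 4, 5, 6] {
        println!("{len}: {:?}", check_location_len(len));
    }
}
```
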
@@ -10,7 +10,6 @@ use crate::{
PyBaseException, PyBaseExceptionRef, PyBaseObject, PyCode, PyCoroutine, PyDict, PyDictRef,
PyFloat, PyFrozenSet, PyGenerator, PyInt, PyInterpolation, PyList, PyModule, PyProperty,
PySet, PySlice, PyStr, PyStrInterned, PyTemplate, PyTraceback, PyType, PyUtf8Str,
asyncgenerator::PyAsyncGenWrappedValue,
builtin_func::PyNativeFunction,
descriptor::{MemberGetter, PyMemberDescriptor, PyMethodDescriptor},
frame::stack_analysis,
@@ -41,7 +40,6 @@ use crate::{
use alloc::fmt;
use bstr::ByteSlice;
use core::cell::UnsafeCell;
use core::iter::zip;
use core::sync::atomic;
use core::sync::atomic::AtomicPtr;
use core::sync::atomic::Ordering::{Acquire, Relaxed};
@@ -684,14 +682,7 @@ impl Frame {
use_datastack: bool,
vm: &VirtualMachine,
) -> Self {
let nlocals = code.varnames.len();
let num_cells = code.cellvars.len();
let nfrees = closure.len();
let nlocalsplus = nlocals
.checked_add(num_cells)
.and_then(|v| v.checked_add(nfrees))
.expect("Frame::new: nlocalsplus overflow");
let nlocalsplus = code.localspluskinds.len();
let max_stackdepth = code.max_stackdepth as usize;
let mut localsplus = if use_datastack {
LocalsPlus::new_on_datastack(nlocalsplus, max_stackdepth, vm)
@@ -699,15 +690,32 @@ impl Frame {
LocalsPlus::new(nlocalsplus, max_stackdepth)
};
// Store cell/free variable objects directly in localsplus
let fastlocals = localsplus.fastlocals_mut();
for i in 0..num_cells {
fastlocals[nlocals + i] = Some(PyCell::default().into_ref(&vm.ctx).into());
}
for (i, cell) in closure.iter().enumerate() {
fastlocals[nlocals + num_cells + i] = Some(cell.clone().into());
// Pre-copy closure cells into free var slots so that locals() works
// even before COPY_FREE_VARS runs (e.g. coroutine before first send).
// COPY_FREE_VARS will overwrite these on first execution.
{
let nfrees = code.freevars.len();
if nfrees > 0 {
let freevar_start = nlocalsplus - nfrees;
let fastlocals = localsplus.fastlocals_mut();
for (i, cell) in closure.iter().enumerate() {
fastlocals[freevar_start + i] = Some(cell.clone().into());
}
}
}
// For generators/coroutines, initialize prev_line to the def line
// so that preamble instructions (RETURN_GENERATOR, POP_TOP) don't
// fire spurious LINE events.
let prev_line = if code
.flags
.intersects(bytecode::CodeFlags::GENERATOR | bytecode::CodeFlags::COROUTINE)
{
code.first_line_number.map_or(0, |line| line.get() as u32)
} else {
0
};
let iframe = InterpreterFrame {
localsplus,
locals: match scope.locals {
@@ -722,7 +730,7 @@ impl Frame {
code,
func_obj,
lasti: Radium::new(0),
prev_line: 0,
prev_line,
trace: PyMutex::new(vm.ctx.none()),
trace_lines: PyMutex::new(true),
trace_opcodes: PyMutex::new(false),
@@ -791,30 +799,17 @@ impl Frame {
}
}
/// Get cell contents by cell index. Reads through fastlocals (no state lock needed).
pub(crate) fn get_cell_contents(&self, cell_idx: usize) -> Option<PyObjectRef> {
let nlocals = self.code.varnames.len();
/// Get cell contents by localsplus index.
pub(crate) fn get_cell_contents(&self, localsplus_idx: usize) -> Option<PyObjectRef> {
// SAFETY: Frame not executing; no concurrent mutation.
let fastlocals = unsafe { (*self.iframe.get()).localsplus.fastlocals() };
fastlocals
.get(nlocals + cell_idx)
.get(localsplus_idx)
.and_then(|slot| slot.as_ref())
.and_then(|obj| obj.downcast_ref::<PyCell>())
.and_then(|cell| cell.get())
}
/// Set cell contents by cell index. Only safe to call before frame execution starts.
pub(crate) fn set_cell_contents(&self, cell_idx: usize, value: Option<PyObjectRef>) {
let nlocals = self.code.varnames.len();
// SAFETY: Called before frame execution starts.
let fastlocals = unsafe { (*self.iframe.get()).localsplus.fastlocals() };
fastlocals[nlocals + cell_idx]
.as_ref()
.and_then(|obj| obj.downcast_ref::<PyCell>())
.expect("cell slot empty or not a PyCell")
.set(value);
}
/// Store a borrowed back-reference to the owning generator/coroutine.
/// The caller must ensure the generator outlives the frame.
pub fn set_generator(&self, generator: &PyObject) {
@@ -888,41 +883,102 @@ impl Frame {
}
pub fn locals(&self, vm: &VirtualMachine) -> PyResult<ArgMapping> {
use rustpython_compiler_core::bytecode::{
CO_FAST_CELL, CO_FAST_FREE, CO_FAST_HIDDEN, CO_FAST_LOCAL,
};
// SAFETY: Either the frame is not executing (caller checked owner),
// or we're in a trace callback on the same thread that's executing.
let locals = &self.locals;
let code = &**self.code;
let map = &code.varnames;
let j = core::cmp::min(map.len(), code.varnames.len());
let locals_map = locals.mapping(vm);
if !code.varnames.is_empty() {
let fastlocals = unsafe { (*self.iframe.get()).localsplus.fastlocals() };
for (&k, v) in zip(&map[..j], fastlocals) {
match locals_map.ass_subscript(k, v.clone(), vm) {
Ok(()) => {}
Err(e) if e.fast_isinstance(vm.ctx.exceptions.key_error) => {}
Err(e) => return Err(e),
let fastlocals = unsafe { (*self.iframe.get()).localsplus.fastlocals() };
// Iterate through all localsplus slots using localspluskinds
let nlocalsplus = code.localspluskinds.len();
let nfrees = code.freevars.len();
let free_start = nlocalsplus - nfrees;
let is_optimized = code.flags.contains(bytecode::CodeFlags::OPTIMIZED);
// Track which non-merged cellvar index we're at
let mut nonmerged_cell_idx = 0;
for (i, &kind) in code.localspluskinds.iter().enumerate() {
if kind & CO_FAST_HIDDEN != 0 {
// Hidden variables are only skipped when their slot is empty.
// After a comprehension restores values, they should appear in locals().
let slot_empty = match fastlocals[i].as_ref() {
None => true,
Some(obj) => {
if kind & (CO_FAST_CELL | CO_FAST_FREE) != 0 {
// If it's a PyCell, check if the cell is empty.
// If it's a raw value (merged cell during inlined comp), not empty.
obj.downcast_ref::<PyCell>()
.is_some_and(|cell| cell.get().is_none())
} else {
false
}
}
};
if slot_empty {
continue;
}
}
}
if !code.cellvars.is_empty() || !code.freevars.is_empty() {
for (i, &k) in code.cellvars.iter().enumerate() {
let cell_value = self.get_cell_contents(i);
match locals_map.ass_subscript(k, cell_value, vm) {
Ok(()) => {}
Err(e) if e.fast_isinstance(vm.ctx.exceptions.key_error) => {}
Err(e) => return Err(e),
}
// Free variables only included for optimized (function-like) scopes.
// Class/module scopes should not expose free vars in locals().
if kind == CO_FAST_FREE && !is_optimized {
continue;
}
if code.flags.contains(bytecode::CodeFlags::OPTIMIZED) {
for (i, &k) in code.freevars.iter().enumerate() {
let cell_value = self.get_cell_contents(code.cellvars.len() + i);
match locals_map.ass_subscript(k, cell_value, vm) {
Ok(()) => {}
Err(e) if e.fast_isinstance(vm.ctx.exceptions.key_error) => {}
Err(e) => return Err(e),
// Get the name for this slot
let name = if kind & CO_FAST_LOCAL != 0 {
code.varnames[i]
} else if kind & CO_FAST_FREE != 0 {
code.freevars[i - free_start]
} else if kind & CO_FAST_CELL != 0 {
// Non-merged cell: find the name by skipping merged cellvars
let mut found_name = None;
let mut skip = nonmerged_cell_idx;
for cv in code.cellvars.iter() {
let is_merged = code.varnames.contains(cv);
if !is_merged {
if skip == 0 {
found_name = Some(*cv);
break;
}
skip -= 1;
}
}
nonmerged_cell_idx += 1;
match found_name {
Some(n) => n,
None => continue,
}
} else {
continue;
};
// Get the value
let value = if kind & (CO_FAST_CELL | CO_FAST_FREE) != 0 {
// Cell or free var: extract value from PyCell.
// During inlined comprehensions, a merged cell slot may hold a raw
// value (not a PyCell) after LOAD_FAST_AND_CLEAR + STORE_FAST.
fastlocals[i].as_ref().and_then(|obj| {
if let Some(cell) = obj.downcast_ref::<PyCell>() {
cell.get()
} else {
Some(obj.clone())
}
})
} else {
// Regular local
fastlocals[i].clone()
};
match locals_map.ass_subscript(name, value, vm) {
Ok(()) => {}
Err(e) if e.fast_isinstance(vm.ctx.exceptions.key_error) => {}
Err(e) => return Err(e),
}
}
Ok(locals.clone_mapping(vm))
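
The hidden-slot rule in the rewritten `locals()` is the least obvious part of this hunk, so here is a small model of just that check. It is a sketch under assumptions: the `Slot` enum stands in for `Option<PyObjectRef>` plus the `PyCell` downcast, the bit values are assumed to match CPython's, and `passes_hidden_filter` is a hypothetical name. The later free-variable and naming steps are not modelled.

```rust
// Assumed to mirror CPython's CO_FAST_* bit values; not taken from the diff.
const CO_FAST_HIDDEN: u8 = 0x10;
const CO_FAST_CELL: u8 = 0x40;
const CO_FAST_FREE: u8 = 0x80;

/// Stand-in for a localsplus slot: empty, a raw value, or a cell.
#[allow(dead_code)]
enum Slot {
    Empty,
    Raw(i64),
    Cell(Option<i64>),
}

/// Returns false only for hidden slots that are effectively empty.
fn passes_hidden_filter(kind: u8, slot: &Slot) -> bool {
    if kind & CO_FAST_HIDDEN == 0 {
        return true; // non-hidden slots are never skipped by this check
    }
    match slot {
        Slot::Empty => false,
        // A hidden cell/free slot holding an empty cell counts as empty.
        Slot::Cell(inner) if kind & (CO_FAST_CELL | CO_FAST_FREE) != 0 => inner.is_some(),
        // Raw values (e.g. a merged cell during an inlined comprehension).
        _ => true,
    }
}

fn main() {
    let hidden_cell = CO_FAST_HIDDEN | CO_FAST_CELL;
    println!("{}", passes_hidden_filter(hidden_cell, &Slot::Cell(None)));
    println!("{}", passes_hidden_filter(hidden_cell, &Slot::Raw(1)));
}
```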
@@ -1325,13 +1381,12 @@ impl ExecutingFrame<'_> {
self.lasti.load(Relaxed)
}
/// Access the PyCellRef at the given cell/free variable index.
/// `cell_idx` is 0-based: 0..ncells for cellvars, ncells.. for freevars.
/// Access the PyCellRef at the given localsplus index.
#[inline(always)]
fn cell_ref(&self, cell_idx: usize) -> &PyCell {
let nlocals = self.code.varnames.len();
self.localsplus.fastlocals()[nlocals + cell_idx]
.as_ref()
fn cell_ref(&self, localsplus_idx: usize) -> &PyCell {
let fastlocals = self.localsplus.fastlocals();
let slot = &fastlocals[localsplus_idx];
slot.as_ref()
.expect("cell slot empty")
.downcast_ref::<PyCell>()
.expect("cell slot is not a PyCell")
@@ -1700,12 +1755,6 @@ impl ExecutingFrame<'_> {
exc_tb: PyObjectRef,
) -> PyResult<ExecutionResult> {
self.monitoring_mask = vm.state.monitoring_events.load();
// Reset prev_line so that LINE monitoring events fire even if
// the exception handler is on the same line as the yield point.
// In CPython, _Py_call_instrumentation_line has a special case
// for RESUME: it fires LINE even when prev_line == current_line.
// Since gen_throw bypasses RESUME, we reset prev_line instead.
*self.prev_line = 0;
if let Some(jen) = self.yield_from_target() {
// Check if the exception is GeneratorExit (type or instance).
// For GeneratorExit, close the sub-iterator instead of throwing.
@@ -1741,7 +1790,10 @@ impl ExecutingFrame<'_> {
self.push_value(vm.ctx.none());
vm.contextualize_exception(&err);
return match self.unwind_blocks(vm, UnwindReason::Raising { exception: err }) {
Ok(None) => self.run(vm),
Ok(None) => {
*self.prev_line = 0;
self.run(vm)
}
Ok(Some(result)) => Ok(result),
Err(exception) => Err(exception),
};
@@ -1783,7 +1835,10 @@ impl ExecutingFrame<'_> {
self.push_value(vm.ctx.none());
vm.contextualize_exception(&err);
match self.unwind_blocks(vm, UnwindReason::Raising { exception: err }) {
Ok(None) => self.run(vm),
Ok(None) => {
*self.prev_line = 0;
self.run(vm)
}
Ok(Some(result)) => Ok(result),
Err(exception) => Err(exception),
}
@@ -1851,7 +1906,13 @@ impl ExecutingFrame<'_> {
self.push_value(vm.ctx.none());
match self.unwind_blocks(vm, UnwindReason::Raising { exception }) {
Ok(None) => self.run(vm),
Ok(None) => {
// Reset prev_line so that the first instruction in the handler
// fires a LINE event. In CPython, gen_send_ex re-enters the
// eval loop which reinitializes its local prev_instr tracker.
*self.prev_line = 0;
self.run(vm)
}
Ok(Some(result)) => Ok(result),
Err(exception) => {
// Fire PY_UNWIND: exception escapes the generator frame.
@@ -1871,18 +1932,72 @@ impl ExecutingFrame<'_> {
}
}
fn unbound_cell_exception(&self, i: usize, vm: &VirtualMachine) -> PyBaseExceptionRef {
if let Some(&name) = self.code.cellvars.get(i) {
vm.new_exception_msg(
vm.ctx.exceptions.unbound_local_error.to_owned(),
format!("local variable '{name}' referenced before assignment").into(),
)
} else {
let name = self.code.freevars[i - self.code.cellvars.len()];
fn unbound_cell_exception(
&self,
localsplus_idx: usize,
vm: &VirtualMachine,
) -> PyBaseExceptionRef {
use rustpython_compiler_core::bytecode::CO_FAST_FREE;
let kind = self
.code
.localspluskinds
.get(localsplus_idx)
.copied()
.unwrap_or(0);
if kind & CO_FAST_FREE != 0 {
let name = self.localsplus_name(localsplus_idx);
vm.new_name_error(
format!("cannot access free variable '{name}' where it is not associated with a value in enclosing scope"),
name.to_owned(),
)
} else {
// Both merged cells (LOCAL|CELL) and non-merged cells get unbound local error
let name = self.localsplus_name(localsplus_idx);
vm.new_exception_msg(
vm.ctx.exceptions.unbound_local_error.to_owned(),
format!("local variable '{name}' referenced before assignment").into(),
)
}
}
/// Get the variable name for a localsplus index.
fn localsplus_name(&self, idx: usize) -> &'static PyStrInterned {
use rustpython_compiler_core::bytecode::{CO_FAST_CELL, CO_FAST_FREE, CO_FAST_LOCAL};
let nlocals = self.code.varnames.len();
let kind = self.code.localspluskinds.get(idx).copied().unwrap_or(0);
if kind & CO_FAST_LOCAL != 0 {
// Merged cell or regular local: name is in varnames
self.code.varnames[idx]
} else if kind & CO_FAST_FREE != 0 {
// Free var: slots are at the end of localsplus
let nlocalsplus = self.code.localspluskinds.len();
let nfrees = self.code.freevars.len();
let free_start = nlocalsplus - nfrees;
self.code.freevars[idx - free_start]
} else if kind & CO_FAST_CELL != 0 {
// Non-merged cell: count how many non-merged cell slots are before
// this index to find the corresponding cellvars entry.
// Non-merged cellvars appear in their original order (skipping merged ones).
let nonmerged_pos = self.code.localspluskinds[nlocals..idx]
.iter()
.filter(|&&k| k == CO_FAST_CELL)
.count();
// Skip merged cellvars to find the right one
let mut cv_idx = 0;
let mut nonmerged_count = 0;
for (i, name) in self.code.cellvars.iter().enumerate() {
let is_merged = self.code.varnames.contains(name);
if !is_merged {
if nonmerged_count == nonmerged_pos {
cv_idx = i;
break;
}
nonmerged_count += 1;
}
}
self.code.cellvars[cv_idx]
} else {
self.code.varnames[idx]
}
}
@@ -2153,13 +2268,29 @@ impl ExecutingFrame<'_> {
self.push_stackref_opt(value);
Ok(None)
}
Instruction::CopyFreeVars { .. } => {
// Free vars are already set up at frame creation time in RustPython
Instruction::CopyFreeVars { n } => {
let n = n.get(arg) as usize;
if n > 0 {
let closure = self
.object
.func_obj
.as_ref()
.and_then(|f| f.downcast_ref::<PyFunction>())
.and_then(|f| f.closure.as_ref());
let nlocalsplus = self.code.localspluskinds.len();
let freevar_start = nlocalsplus - n;
let fastlocals = self.localsplus.fastlocals_mut();
if let Some(closure) = closure {
for i in 0..n {
fastlocals[freevar_start + i] = Some(closure[i].clone().into());
}
}
}
Ok(None)
}
Instruction::DeleteAttr { namei: idx } => self.delete_attr(vm, idx.get(arg)),
Instruction::DeleteDeref { i } => {
self.cell_ref(i.get(arg) as usize).set(None);
self.cell_ref(i.get(arg).as_usize()).set(None);
Ok(None)
}
Instruction::DeleteFast { var_num } => {
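
The new `CopyFreeVars` arm copies the function's closure cells into the last `n` localsplus slots. A minimal sketch of that copy, assuming `Rc<T>` as a stand-in for a cell reference and a hypothetical free function `copy_free_vars`:

```rust
use std::rc::Rc;

/// Hypothetical model of COPY_FREE_VARS: the last `n` slots receive shared
/// references to the first `n` closure cells. Other slots are left untouched.
fn copy_free_vars<T>(slots: &mut [Option<Rc<T>>], closure: &[Rc<T>], n: usize) {
    assert!(n <= slots.len(), "more free vars than localsplus slots");
    let start = slots.len() - n;
    for (slot, cell) in slots[start..].iter_mut().zip(closure.iter().take(n)) {
        *slot = Some(Rc::clone(cell));
    }
}

fn main() {
    let closure = [Rc::new("a"), Rc::new("b")];
    let mut slots: Vec<Option<Rc<&str>>> = vec![None; 3];
    copy_free_vars(&mut slots, &closure, 2);
    println!("{slots:?}");
}
```

Sharing the cell (rather than copying its contents) is what lets writes in the enclosing scope stay visible to the inner function.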
@@ -2311,7 +2442,7 @@ impl ExecutingFrame<'_> {
}
Instruction::ForIter { .. } => {
// Relative forward jump: target = lasti + caches + delta
let target = bytecode::Label::new(self.lasti() + 1 + u32::from(arg));
let target = bytecode::Label::from_u32(self.lasti() + 1 + u32::from(arg));
self.adaptive(|s, ii, cb| s.specialize_for_iter(vm, u32::from(arg), ii, cb));
self.execute_for_iter(vm, target)?;
Ok(None)
@@ -2509,7 +2640,7 @@ impl ExecutingFrame<'_> {
}
Instruction::ListAppend { i } => {
let item = self.pop_value();
let obj = self.nth_value(i.get(arg));
let obj = self.nth_value(i.get(arg) - 1);
let list: &Py<PyList> = unsafe {
// SAFETY: trust compiler
obj.downcast_unchecked_ref()
@@ -2519,7 +2650,7 @@ impl ExecutingFrame<'_> {
}
Instruction::ListExtend { i } => {
let iterable = self.pop_value();
let obj = self.nth_value(i.get(arg));
let obj = self.nth_value(i.get(arg) - 1);
let list: &Py<PyList> = unsafe {
// SAFETY: compiler guarantees correct type
obj.downcast_unchecked_ref()
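
The `- 1` adjustment in `ListAppend` and `ListExtend` is easier to see on a toy stack. The sketch below assumes the CPython convention that the oparg addresses `STACK[-i]` after the item has been popped, and that `nth_value(0)` means the top of the stack; both are assumptions about the surrounding code, not something this hunk states. The `Value` enum and `list_append` function are hypothetical.

```rust
#[derive(Debug, PartialEq)]
enum Value {
    Int(i64),
    List(Vec<i64>),
}

/// Hypothetical model of LIST_APPEND(oparg): pop the item, then append it to
/// the list found at STACK[-oparg], i.e. 0-based depth `oparg - 1` from the top.
fn list_append(stack: &mut Vec<Value>, oparg: usize) {
    let item = match stack.pop() {
        Some(Value::Int(n)) => n,
        other => panic!("expected an int on top of the stack, got {other:?}"),
    };
    let len = stack.len();
    match &mut stack[len - oparg] {
        Value::List(list) => list.push(item),
        other => panic!("expected a list at depth {oparg}, got {other:?}"),
    }
}

fn main() {
    // [list, loop temporary, item]: the list is two entries below the item.
    let mut stack = vec![Value::List(vec![]), Value::Int(7), Value::Int(1)];
    list_append(&mut stack, 2);
    println!("{stack:?}");
}
```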
@@ -2585,12 +2716,8 @@ impl ExecutingFrame<'_> {
Instruction::LoadFromDictOrDeref { i } => {
// Pop dict from stack (locals or classdict depending on context)
let class_dict = self.pop_value();
let i = i.get(arg) as usize;
let name = if i < self.code.cellvars.len() {
self.code.cellvars[i]
} else {
self.code.freevars[i - self.code.cellvars.len()]
};
let idx = i.get(arg).as_usize();
let name = self.localsplus_name(idx);
// Only treat KeyError as "not found", propagate other exceptions
let value = if let Some(dict_obj) = class_dict.downcast_ref::<PyDict>() {
dict_obj.get_item_opt(name, vm)?
@@ -2604,9 +2731,9 @@ impl ExecutingFrame<'_> {
self.push_value(match value {
Some(v) => v,
None => self
.cell_ref(i)
.cell_ref(idx)
.get()
.ok_or_else(|| self.unbound_cell_exception(i, vm))?,
.ok_or_else(|| self.unbound_cell_exception(idx, vm))?,
});
Ok(None)
}
@@ -2659,8 +2786,18 @@ impl ExecutingFrame<'_> {
vm.ctx.exceptions.not_implemented_error.to_owned().into()
}
CommonConstant::BuiltinTuple => vm.ctx.types.tuple_type.to_owned().into(),
CommonConstant::BuiltinAll => vm.builtins.get_attr("all", vm)?,
CommonConstant::BuiltinAny => vm.builtins.get_attr("any", vm)?,
CommonConstant::BuiltinAll => vm
.callable_cache
.builtin_all
.clone()
.expect("builtin_all not initialized"),
CommonConstant::BuiltinAny => vm
.callable_cache
.builtin_any
.clone()
.expect("builtin_any not initialized"),
CommonConstant::BuiltinList => vm.ctx.types.list_type.to_owned().into(),
CommonConstant::BuiltinSet => vm.ctx.types.set_type.to_owned().into(),
};
self.push_value(value);
Ok(None)
@@ -2672,7 +2809,7 @@ impl ExecutingFrame<'_> {
Ok(None)
}
Instruction::LoadDeref { i } => {
let idx = i.get(arg) as usize;
let idx = i.get(arg).as_usize();
let x = self
.cell_ref(idx)
.get()
@@ -2699,13 +2836,12 @@ impl ExecutingFrame<'_> {
Ok(None)
}
Instruction::LoadFastAndClear { var_num } => {
// Load value and clear the slot (for inlined comprehensions)
// If slot is empty, push None (not an error - variable may not exist yet)
// Save current slot value and clear it (for inlined comprehensions).
// Pushes NULL (None at Option level) if slot was empty, so that
// StoreFast can restore the empty state after the comprehension.
let idx = var_num.get(arg);
let x = self.localsplus.fastlocals_mut()[idx]
.take()
.unwrap_or_else(|| vm.ctx.none());
self.push_value(x);
let x = self.localsplus.fastlocals_mut()[idx].take();
self.push_value_opt(x);
Ok(None)
}
Instruction::LoadFastCheck { var_num } => {
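
The point of pushing an `Option` from `LoadFastAndClear` is that a later `StoreFast` can put an empty slot back to empty. A minimal sketch of that save and restore cycle, assuming `Option<i64>` as a stand-in for both slots and stack entries (the function names are hypothetical):

```rust
/// Save the slot's current value (possibly empty) on the stack and clear it.
fn load_fast_and_clear(slots: &mut [Option<i64>], stack: &mut Vec<Option<i64>>, idx: usize) {
    stack.push(slots[idx].take());
}

/// Pop a possibly-empty entry and store it, restoring "unbound" if it was empty.
fn store_fast(slots: &mut [Option<i64>], stack: &mut Vec<Option<i64>>, idx: usize) {
    slots[idx] = stack.pop().expect("stack underflow");
}

fn main() {
    let mut slots = vec![None, Some(5)];
    let mut stack = Vec::new();

    load_fast_and_clear(&mut slots, &mut stack, 0); // slot 0 was unbound
    slots[0] = Some(99); // the inlined comprehension uses the slot
    store_fast(&mut slots, &mut stack, 0); // restore: unbound again

    println!("{slots:?}");
}
```

With the earlier behaviour of substituting `None` (the Python object) for an empty slot, the restore step would have left the variable bound after the comprehension instead of unbound.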
@@ -2825,23 +2961,23 @@ impl ExecutingFrame<'_> {
Ok(None)
}
Instruction::LoadSpecial { method } => {
// Stack effect: 0 (replaces TOS with bound method)
// Input: [..., obj]
// Output: [..., bound_method]
// Pops obj, pushes (callable, self_or_null) for CALL convention.
// Push order: callable first (deeper), self_or_null on top.
use crate::vm::PyMethod;
let obj = self.pop_value();
let oparg = method.get(arg);
let method_name = get_special_method_name(oparg, vm);
let bound = match vm.get_special_method(&obj, method_name)? {
match vm.get_special_method(&obj, method_name)? {
Some(PyMethod::Function { target, func }) => {
// Create bound method: PyBoundMethod(object=target, function=func)
crate::builtins::PyBoundMethod::new(target, func)
.into_ref(&vm.ctx)
.into()
self.push_value(func); // callable (deeper)
self.push_value(target); // self (TOS)
}
Some(PyMethod::Attribute(bound)) => {
self.push_value(bound); // callable (deeper)
self.push_null(); // NULL (TOS)
}
Some(PyMethod::Attribute(bound)) => bound,
None => {
return Err(vm.new_type_error(get_special_method_error_msg(
oparg,
@@ -2850,18 +2986,24 @@ impl ExecutingFrame<'_> {
)));
}
};
self.push_value(bound);
Ok(None)
}
Instruction::MakeFunction => self.execute_make_function(vm),
Instruction::MakeCell { .. } => {
// Cell creation is handled at frame creation time in RustPython
Instruction::MakeCell { i } => {
// Wrap the current slot value (if any) in a new PyCell.
// For merged cells (LOCAL|CELL), this wraps the argument value.
// For non-merged cells, this creates an empty cell.
let idx = i.get(arg).as_usize();
let fastlocals = self.localsplus.fastlocals_mut();
let initial = fastlocals[idx].take();
let cell = PyCell::new(initial).into_ref(&vm.ctx).into();
fastlocals[idx] = Some(cell);
Ok(None)
}
Instruction::MapAdd { i } => {
let value = self.pop_value();
let key = self.pop_value();
let obj = self.nth_value(i.get(arg));
let obj = self.nth_value(i.get(arg) - 1);
let dict: &Py<PyDict> = unsafe {
// SAFETY: trust compiler
obj.downcast_unchecked_ref()
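
The `MakeCell` arm in this hunk wraps whatever the slot currently holds in a fresh cell. A minimal sketch of that behaviour, assuming `Rc<RefCell<Option<Obj>>>` as a stand-in for `PyCell`; the `Obj` enum and both function names are hypothetical.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Stand-in object type: either a plain value or a cell holding an optional object.
enum Obj {
    Int(i64),
    Cell(Rc<RefCell<Option<Obj>>>),
}

/// Model of MAKE_CELL: take the slot's current contents (an argument value for
/// a merged cell, nothing for a non-merged cell) and store a new cell around it.
fn make_cell(slots: &mut [Option<Obj>], idx: usize) {
    let initial = slots[idx].take();
    slots[idx] = Some(Obj::Cell(Rc::new(RefCell::new(initial))));
}

/// Read an integer through a cell slot, if the slot holds a non-empty cell.
fn cell_int(slot: &Option<Obj>) -> Option<i64> {
    match slot {
        Some(Obj::Cell(cell)) => match &*cell.borrow() {
            Some(Obj::Int(n)) => Some(*n),
            _ => None,
        },
        _ => None,
    }
}

fn main() {
    let mut slots = vec![Some(Obj::Int(3)), None];
    make_cell(&mut slots, 0); // merged cell: wraps the argument value
    make_cell(&mut slots, 1); // non-merged cell: starts empty
    println!("{:?} {:?}", cell_int(&slots[0]), cell_int(&slots[1]));
}
```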
@@ -3192,7 +3334,7 @@ impl ExecutingFrame<'_> {
}
Instruction::SetAdd { i } => {
let item = self.pop_value();
let obj = self.nth_value(i.get(arg));
let obj = self.nth_value(i.get(arg) - 1);
let set: &Py<PySet> = unsafe {
// SAFETY: trust compiler
obj.downcast_unchecked_ref()
@@ -3202,7 +3344,7 @@ impl ExecutingFrame<'_> {
}
Instruction::SetUpdate { i } => {
let iterable = self.pop_value();
let obj = self.nth_value(i.get(arg));
let obj = self.nth_value(i.get(arg) - 1);
let set: &Py<PySet> = unsafe {
// SAFETY: compiler guarantees correct type
obj.downcast_unchecked_ref()
@@ -3294,13 +3436,14 @@ impl ExecutingFrame<'_> {
}
Instruction::StoreDeref { i } => {
let value = self.pop_value();
self.cell_ref(i.get(arg) as usize).set(Some(value));
self.cell_ref(i.get(arg).as_usize()).set(Some(value));
Ok(None)
}
Instruction::StoreFast { var_num } => {
let value = self.pop_value();
// pop_value_opt: allows NULL from LoadFastAndClear restore path
let value = self.pop_value_opt();
let fastlocals = self.localsplus.fastlocals_mut();
fastlocals[var_num.get(arg)] = Some(value);
fastlocals[var_num.get(arg)] = value;
Ok(None)
}
Instruction::StoreFastLoadFast { var_nums } => {
@@ -3318,11 +3461,12 @@ impl ExecutingFrame<'_> {
Instruction::StoreFastStoreFast { var_nums } => {
let oparg = var_nums.get(arg);
let (idx1, idx2) = oparg.indexes();
let value1 = self.pop_value();
let value2 = self.pop_value();
// pop_value_opt: allows NULL from LoadFastAndClear restore path
let value1 = self.pop_value_opt();
let value2 = self.pop_value_opt();
let fastlocals = self.localsplus.fastlocals_mut();
fastlocals[idx1] = Some(value1);
fastlocals[idx2] = Some(value2);
fastlocals[idx1] = value1;
fastlocals[idx2] = value2;
Ok(None)
}
Instruction::StoreGlobal { namei: idx } => {
@@ -3393,29 +3537,33 @@ impl ExecutingFrame<'_> {
self.unpack_sequence(expected, vm)
}
Instruction::WithExceptStart => {
// Stack: [..., __exit__, lasti, prev_exc, exc]
// Call __exit__(type, value, tb) and push result
// __exit__ is at TOS-3 (below lasti, prev_exc, and exc)
// Stack: [..., exit_func, self_or_null, lasti, prev_exc, exc]
// exit_func at TOS-4, self_or_null at TOS-3
let exc = vm.current_exception();
let stack_len = self.localsplus.stack_len();
let exit = expect_unchecked(
self.localsplus.stack_index(stack_len - 4).clone(),
"WithExceptStart: __exit__ is NULL",
let exit_func = expect_unchecked(
self.localsplus.stack_index(stack_len - 5).clone(),
"WithExceptStart: exit_func is NULL",
);
let self_or_null = self.localsplus.stack_index(stack_len - 4).clone();
let args = if let Some(ref exc) = exc {
let (tp, val, tb) = if let Some(ref exc) = exc {
vm.split_exception(exc.clone())
} else {
(vm.ctx.none(), vm.ctx.none(), vm.ctx.none())
};
let exit_res = exit.call(args, vm)?;
// Push result on top of stack
let exit_res = if let Some(self_exit) = self_or_null {
exit_func.call((self_exit.to_pyobj(), tp, val, tb), vm)?
} else {
exit_func.call((tp, val, tb), vm)?
};
self.push_value(exit_res);
Ok(None)
}
Instruction::YieldValue { arg: oparg } => {
Instruction::YieldValue { .. } => {
debug_assert!(
self.localsplus
.stack_as_slice()
@@ -3424,21 +3572,12 @@ impl ExecutingFrame<'_> {
.all(|sr| !sr.is_borrowed()),
"borrowed refs on stack at yield point"
);
let value = self.pop_value();
// arg=0: direct yield (wrapped for async generators)
// arg=1: yield from await/yield-from (NOT wrapped)
let wrap = oparg.get(arg) == 0;
let value = if wrap && self.code.flags.contains(bytecode::CodeFlags::COROUTINE) {
PyAsyncGenWrappedValue(value).into_pyobject(vm)
} else {
value
};
Ok(Some(ExecutionResult::Yield(value)))
Ok(Some(ExecutionResult::Yield(self.pop_value())))
}
Instruction::Send { .. } => {
// (receiver, v -- receiver, retval)
self.adaptive(|s, ii, cb| s.specialize_send(vm, ii, cb));
let exit_label = bytecode::Label::new(self.lasti() + 1 + u32::from(arg));
let exit_label = bytecode::Label::from_u32(self.lasti() + 1 + u32::from(arg));
let receiver = self.nth_value(1);
let can_fast_send = !self.specialization_eval_frame_active(vm)
&& (receiver.downcast_ref_if_exact::<PyGenerator>(vm).is_some()
@@ -3476,7 +3615,7 @@ impl ExecutingFrame<'_> {
}
}
Instruction::SendGen => {
let exit_label = bytecode::Label::new(self.lasti() + 1 + u32::from(arg));
let exit_label = bytecode::Label::from_u32(self.lasti() + 1 + u32::from(arg));
// Stack: [receiver, val] — peek receiver before popping
let receiver = self.nth_value(1);
let can_fast_send = !self.specialization_eval_frame_active(vm)
@@ -3607,7 +3746,7 @@ impl ExecutingFrame<'_> {
}
// Specialized LOAD_ATTR opcodes
Instruction::LoadAttrMethodNoDict => {
let oparg = LoadAttr::new(u32::from(arg));
let oparg = LoadAttr::from_u32(u32::from(arg));
let cache_base = self.lasti() as usize;
let owner = self.top_value();
@@ -3626,7 +3765,7 @@ impl ExecutingFrame<'_> {
}
}
Instruction::LoadAttrMethodLazyDict => {
let oparg = LoadAttr::new(u32::from(arg));
let oparg = LoadAttr::from_u32(u32::from(arg));
let cache_base = self.lasti() as usize;
let owner = self.top_value();
@@ -3646,7 +3785,7 @@ impl ExecutingFrame<'_> {
}
}
Instruction::LoadAttrMethodWithValues => {
let oparg = LoadAttr::new(u32::from(arg));
let oparg = LoadAttr::from_u32(u32::from(arg));
let cache_base = self.lasti() as usize;
let attr_name = self.code.names[oparg.name_idx() as usize];
@@ -3681,7 +3820,7 @@ impl ExecutingFrame<'_> {
self.load_attr_slow(vm, oparg)
}
Instruction::LoadAttrInstanceValue => {
let oparg = LoadAttr::new(u32::from(arg));
let oparg = LoadAttr::from_u32(u32::from(arg));
let cache_base = self.lasti() as usize;
let attr_name = self.code.names[oparg.name_idx() as usize];
@@ -3703,7 +3842,7 @@ impl ExecutingFrame<'_> {
self.load_attr_slow(vm, oparg)
}
Instruction::LoadAttrWithHint => {
let oparg = LoadAttr::new(u32::from(arg));
let oparg = LoadAttr::from_u32(u32::from(arg));
let cache_base = self.lasti() as usize;
let attr_name = self.code.names[oparg.name_idx() as usize];
@@ -3728,7 +3867,7 @@ impl ExecutingFrame<'_> {
self.load_attr_slow(vm, oparg)
}
Instruction::LoadAttrModule => {
let oparg = LoadAttr::new(u32::from(arg));
let oparg = LoadAttr::from_u32(u32::from(arg));
let cache_base = self.lasti() as usize;
let attr_name = self.code.names[oparg.name_idx() as usize];
@@ -3752,7 +3891,7 @@ impl ExecutingFrame<'_> {
self.load_attr_slow(vm, oparg)
}
Instruction::LoadAttrNondescriptorNoDict => {
let oparg = LoadAttr::new(u32::from(arg));
let oparg = LoadAttr::from_u32(u32::from(arg));
let cache_base = self.lasti() as usize;
let owner = self.top_value();
@@ -3774,7 +3913,7 @@ impl ExecutingFrame<'_> {
self.load_attr_slow(vm, oparg)
}
Instruction::LoadAttrNondescriptorWithValues => {
let oparg = LoadAttr::new(u32::from(arg));
let oparg = LoadAttr::from_u32(u32::from(arg));
let cache_base = self.lasti() as usize;
let attr_name = self.code.names[oparg.name_idx() as usize];
@@ -3812,7 +3951,7 @@ impl ExecutingFrame<'_> {
self.load_attr_slow(vm, oparg)
}
Instruction::LoadAttrClass => {
let oparg = LoadAttr::new(u32::from(arg));
let oparg = LoadAttr::from_u32(u32::from(arg));
let cache_base = self.lasti() as usize;
let owner = self.top_value();
@@ -3835,7 +3974,7 @@ impl ExecutingFrame<'_> {
self.load_attr_slow(vm, oparg)
}
Instruction::LoadAttrClassWithMetaclassCheck => {
let oparg = LoadAttr::new(u32::from(arg));
let oparg = LoadAttr::from_u32(u32::from(arg));
let cache_base = self.lasti() as usize;
let owner = self.top_value();
@@ -3861,7 +4000,7 @@ impl ExecutingFrame<'_> {
self.load_attr_slow(vm, oparg)
}
Instruction::LoadAttrGetattributeOverridden => {
let oparg = LoadAttr::new(u32::from(arg));
let oparg = LoadAttr::from_u32(u32::from(arg));
let cache_base = self.lasti() as usize;
let owner = self.top_value();
let type_version = self.code.instructions.read_cache_u32(cache_base + 1);
@@ -3888,7 +4027,7 @@ impl ExecutingFrame<'_> {
self.load_attr_slow(vm, oparg)
}
Instruction::LoadAttrSlot => {
let oparg = LoadAttr::new(u32::from(arg));
let oparg = LoadAttr::from_u32(u32::from(arg));
let cache_base = self.lasti() as usize;
let owner = self.top_value();
@@ -3912,7 +4051,7 @@ impl ExecutingFrame<'_> {
self.load_attr_slow(vm, oparg)
}
Instruction::LoadAttrProperty => {
let oparg = LoadAttr::new(u32::from(arg));
let oparg = LoadAttr::from_u32(u32::from(arg));
let cache_base = self.lasti() as usize;
let owner = self.top_value();
@@ -5114,7 +5253,7 @@ impl ExecutingFrame<'_> {
return Ok(None);
}
}
let oparg = LoadSuperAttr::new(oparg);
let oparg = LoadSuperAttr::from_u32(oparg);
self.load_super_attr(vm, oparg)
}
Instruction::LoadSuperAttrMethod => {
@@ -5181,7 +5320,7 @@ impl ExecutingFrame<'_> {
return Ok(None);
}
}
let oparg = LoadSuperAttr::new(oparg);
let oparg = LoadSuperAttr::from_u32(oparg);
self.load_super_attr(vm, oparg)
}
Instruction::CompareOpInt => {
@@ -5448,7 +5587,7 @@ impl ExecutingFrame<'_> {
self.unpack_sequence(size as u32, vm)
}
Instruction::ForIterRange => {
let target = bytecode::Label::new(self.lasti() + 1 + u32::from(arg));
let target = bytecode::Label::from_u32(self.lasti() + 1 + u32::from(arg));
let iter = self.top_value();
if let Some(range_iter) = iter.downcast_ref_if_exact::<PyRangeIterator>(vm) {
if let Some(value) = range_iter.fast_next() {
@@ -5463,7 +5602,7 @@ impl ExecutingFrame<'_> {
}
}
Instruction::ForIterList => {
let target = bytecode::Label::new(self.lasti() + 1 + u32::from(arg));
let target = bytecode::Label::from_u32(self.lasti() + 1 + u32::from(arg));
let iter = self.top_value();
if let Some(list_iter) = iter.downcast_ref_if_exact::<PyListIterator>(vm) {
if let Some(value) = list_iter.fast_next() {
@@ -5478,7 +5617,7 @@ impl ExecutingFrame<'_> {
}
}
Instruction::ForIterTuple => {
let target = bytecode::Label::new(self.lasti() + 1 + u32::from(arg));
let target = bytecode::Label::from_u32(self.lasti() + 1 + u32::from(arg));
let iter = self.top_value();
if let Some(tuple_iter) = iter.downcast_ref_if_exact::<PyTupleIterator>(vm) {
if let Some(value) = tuple_iter.fast_next() {
@@ -5493,7 +5632,7 @@ impl ExecutingFrame<'_> {
}
}
Instruction::ForIterGen => {
let target = bytecode::Label::new(self.lasti() + 1 + u32::from(arg));
let target = bytecode::Label::from_u32(self.lasti() + 1 + u32::from(arg));
let iter = self.top_value();
if self.specialization_eval_frame_active(vm) {
self.execute_for_iter(vm, target)?;
@@ -5667,13 +5806,6 @@ impl ExecutingFrame<'_> {
let offset = (self.lasti() - 1) * 2;
monitoring::fire_py_yield(vm, self.code, offset, &value)?;
}
let oparg = u32::from(arg);
let wrap = oparg == 0;
let value = if wrap && self.code.flags.contains(bytecode::CodeFlags::COROUTINE) {
PyAsyncGenWrappedValue(value).into_pyobject(vm)
} else {
value
};
Ok(Some(ExecutionResult::Yield(value)))
}
Instruction::InstrumentedCall => {
@@ -5735,7 +5867,7 @@ impl ExecutingFrame<'_> {
Instruction::InstrumentedJumpForward => {
let src_offset = (self.lasti() - 1) * 2;
let target_idx = self.lasti() + u32::from(arg);
let target = bytecode::Label::new(target_idx);
let target = bytecode::Label::from_u32(target_idx);
self.jump(target);
if self.monitoring_mask & monitoring::EVENT_JUMP != 0 {
monitoring::fire_jump(vm, self.code, src_offset, target.as_u32() * 2)?;
@@ -5745,7 +5877,7 @@ impl ExecutingFrame<'_> {
Instruction::InstrumentedJumpBackward => {
let src_offset = (self.lasti() - 1) * 2;
let target_idx = self.lasti() + 1 - u32::from(arg);
let target = bytecode::Label::new(target_idx);
let target = bytecode::Label::from_u32(target_idx);
self.jump(target);
if self.monitoring_mask & monitoring::EVENT_JUMP != 0 {
monitoring::fire_jump(vm, self.code, src_offset, target.as_u32() * 2)?;
@@ -5754,7 +5886,7 @@ impl ExecutingFrame<'_> {
}
Instruction::InstrumentedForIter => {
let src_offset = (self.lasti() - 1) * 2;
let target = bytecode::Label::new(self.lasti() + 1 + u32::from(arg));
let target = bytecode::Label::from_u32(self.lasti() + 1 + u32::from(arg));
let continued = self.execute_for_iter(vm, target)?;
if continued {
if self.monitoring_mask & monitoring::EVENT_BRANCH_LEFT != 0 {
@@ -5804,7 +5936,7 @@ impl ExecutingFrame<'_> {
let obj = self.pop_value();
let value = obj.try_to_bool(vm)?;
if value {
self.jump(bytecode::Label::new(target_idx));
self.jump(bytecode::Label::from_u32(target_idx));
if self.monitoring_mask & monitoring::EVENT_BRANCH_RIGHT != 0 {
monitoring::fire_branch_right(vm, self.code, src_offset, target_idx * 2)?;
}
@@ -5817,7 +5949,7 @@ impl ExecutingFrame<'_> {
let obj = self.pop_value();
let value = obj.try_to_bool(vm)?;
if !value {
self.jump(bytecode::Label::new(target_idx));
self.jump(bytecode::Label::from_u32(target_idx));
if self.monitoring_mask & monitoring::EVENT_BRANCH_RIGHT != 0 {
monitoring::fire_branch_right(vm, self.code, src_offset, target_idx * 2)?;
}
@@ -5829,7 +5961,7 @@ impl ExecutingFrame<'_> {
let target_idx = self.lasti() + 1 + u32::from(arg);
let value = self.pop_value();
if vm.is_none(&value) {
self.jump(bytecode::Label::new(target_idx));
self.jump(bytecode::Label::from_u32(target_idx));
if self.monitoring_mask & monitoring::EVENT_BRANCH_RIGHT != 0 {
monitoring::fire_branch_right(vm, self.code, src_offset, target_idx * 2)?;
}
@@ -5841,7 +5973,7 @@ impl ExecutingFrame<'_> {
let target_idx = self.lasti() + 1 + u32::from(arg);
let value = self.pop_value();
if !vm.is_none(&value) {
self.jump(bytecode::Label::new(target_idx));
self.jump(bytecode::Label::from_u32(target_idx));
if self.monitoring_mask & monitoring::EVENT_BRANCH_RIGHT != 0 {
monitoring::fire_branch_right(vm, self.code, src_offset, target_idx * 2)?;
}
@@ -6233,7 +6365,7 @@ impl ExecutingFrame<'_> {
self.push_value(exception.into());
// 4. Jump to handler
self.jump(bytecode::Label::new(entry.target));
self.jump(bytecode::Label::from_u32(entry.target));
Ok(None)
} else {
@@ -6838,7 +6970,7 @@ impl ExecutingFrame<'_> {
bytecode::Instruction::EndFor | bytecode::Instruction::InstrumentedEndFor
)
{
return bytecode::Label::new(target.as_u32() + 1);
return bytecode::Label::from_u32(target.as_u32() + 1);
}
target
}
@@ -8819,7 +8951,7 @@ impl ExecutingFrame<'_> {
unit.op,
bytecode::Instruction::EndFor | bytecode::Instruction::InstrumentedEndFor
) {
bytecode::Label::new(target.as_u32() + 1)
bytecode::Label::from_u32(target.as_u32() + 1)
} else {
target
}
@@ -9324,20 +9456,25 @@ impl ExecutingFrame<'_> {
Ok(vm.ctx.new_tuple(list.borrow_vec().to_vec()).into())
}
bytecode::IntrinsicFunction1::StopIterationError => {
// Convert StopIteration to RuntimeError
// Used to ensure async generators don't raise StopIteration directly
// _PyGen_FetchStopIterationValue
// Use fast_isinstance to handle subclasses of StopIteration
// Convert StopIteration to RuntimeError (PEP 479)
// Returns the exception object; RERAISE will re-raise it
if arg.fast_isinstance(vm.ctx.exceptions.stop_iteration) {
Err(vm.new_runtime_error("coroutine raised StopIteration"))
let flags = &self.code.flags;
let msg = if flags
.contains(bytecode::CodeFlags::COROUTINE | bytecode::CodeFlags::GENERATOR)
{
"async generator raised StopIteration"
} else if flags.contains(bytecode::CodeFlags::COROUTINE) {
"coroutine raised StopIteration"
} else {
"generator raised StopIteration"
};
let err = vm.new_runtime_error(msg);
err.set___cause__(arg.downcast().ok());
Ok(err.into())
} else {
// If not StopIteration, just re-raise the original exception
Err(arg.downcast().unwrap_or_else(|obj| {
vm.new_runtime_error(format!(
"unexpected exception type: {:?}",
obj.class()
))
}))
// Not StopIteration, pass through for RERAISE
Ok(arg)
}
}
bytecode::IntrinsicFunction1::AsyncGenWrap => {
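
The `StopIterationError` hunk above picks the `RuntimeError` message from the code object's flags. The sketch below isolates that selection so the ordering of the checks is easy to see. It is an illustration only: the constants `GENERATOR` and `COROUTINE`, the helper `contains`, and the function `stop_iteration_message` are hypothetical stand-ins, and the bit values are assumptions rather than the real `bytecode::CodeFlags` values.

```rust
// Hypothetical flag bits; the actual `bytecode::CodeFlags` values are not
// shown in the diff, so these are placeholders for illustration.
const GENERATOR: u32 = 0b01;
const COROUTINE: u32 = 0b10;

/// True when every bit of `mask` is set in `flags`, which is the usual
/// meaning of a bitflags-style `contains`.
fn contains(flags: u32, mask: u32) -> bool {
    flags & mask == mask
}

/// Chooses the PEP 479 error message for a frame with the given flags.
/// The combined check must come first: flags with both bits set also
/// satisfy the COROUTINE-only check.
fn stop_iteration_message(flags: u32) -> &'static str {
    if contains(flags, COROUTINE | GENERATOR) {
        "async generator raised StopIteration"
    } else if contains(flags, COROUTINE) {
        "coroutine raised StopIteration"
    } else {
        "generator raised StopIteration"
    }
}
```

Swapping the first two branches would report an async generator as a plain coroutine, which is presumably why the hunk tests the combined mask before the single flag.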

@@ -457,12 +457,20 @@ impl GcState {
}
// Step 3: Subtract internal references
// Pre-compute referent pointers once per object so that both step 3
// (subtract refs) and step 4 (BFS reachability) see the same snapshot
// of each object's children. Without this, a dict whose write lock is
// held during one traversal but not the other can yield inconsistent
// results, causing live objects to be incorrectly collected.
let mut referents_map: std::collections::HashMap<GcPtr, Vec<NonNull<PyObject>>> =
std::collections::HashMap::new();
for &ptr in &collecting {
let obj = unsafe { ptr.0.as_ref() };
if obj.strong_count() == 0 {
continue;
}
let referent_ptrs = unsafe { obj.gc_get_referent_ptrs() };
referents_map.insert(ptr, referent_ptrs.clone());
for child_ptr in referent_ptrs {
let gc_ptr = GcPtr(child_ptr);
if collecting.contains(&gc_ptr)
@@ -487,7 +495,13 @@ impl GcState {
while let Some(ptr) = worklist.pop() {
let obj = unsafe { ptr.0.as_ref() };
if obj.is_gc_tracked() {
let referent_ptrs = unsafe { obj.gc_get_referent_ptrs() };
// Reuse the pre-computed referent pointers from step 3.
// For objects that were skipped in step 3 (strong_count was 0),
// compute them now as a fallback.
let referent_ptrs = referents_map
.get(&ptr)
.cloned()
.unwrap_or_else(|| unsafe { obj.gc_get_referent_ptrs() });
for child_ptr in referent_ptrs {
let gc_ptr = GcPtr(child_ptr);
if collecting.contains(&gc_ptr) && reachable.insert(gc_ptr) {
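
The comments in the `GcState` hunks describe taking one snapshot of each object's referents and reusing it for both the reference-subtraction pass and the reachability pass. A minimal, self-contained sketch of that idea follows, assuming a toy object graph keyed by integer ids. The names `Id`, `find_garbage`, and `children_of` are hypothetical and not part of RustPython. Unlike the diff, the sketch snapshots every object up front instead of skipping objects whose strong count is zero and falling back later.

```rust
use std::collections::{HashMap, HashSet};

type Id = usize;

/// Returns the ids in `collecting` that are not reachable from outside the set.
/// `strong` must hold a reference count for every id in `collecting`.
/// `children_of` is called exactly once per object, so both passes below
/// work from the same edges even if the callback is not stable over time.
fn find_garbage(
    collecting: &HashSet<Id>,
    strong: &HashMap<Id, usize>,
    mut children_of: impl FnMut(Id) -> Vec<Id>,
) -> HashSet<Id> {
    // Snapshot: record each object's children once.
    let snapshot: HashMap<Id, Vec<Id>> = collecting
        .iter()
        .map(|&id| (id, children_of(id)))
        .collect();

    // Pass 1: subtract references that originate inside the set.
    let mut external: HashMap<Id, usize> = collecting
        .iter()
        .map(|&id| (id, strong[&id]))
        .collect();
    for kids in snapshot.values() {
        for kid in kids {
            if let Some(count) = external.get_mut(kid) {
                *count = count.saturating_sub(1);
            }
        }
    }

    // Pass 2: everything reachable from an externally referenced object
    // stays alive. The traversal reuses the snapshot from above.
    let mut reachable: HashSet<Id> = external
        .iter()
        .filter(|&(_, &count)| count > 0)
        .map(|(&id, _)| id)
        .collect();
    let mut worklist: Vec<Id> = reachable.iter().copied().collect();
    while let Some(id) = worklist.pop() {
        for &kid in &snapshot[&id] {
            if collecting.contains(&kid) && reachable.insert(kid) {
                worklist.push(kid);
            }
        }
    }

    collecting.difference(&reachable).copied().collect()
}
```

With a two-object cycle that has no outside references, plus a separately rooted object that points at a fourth, only the cycle is reported as garbage. Calling `children_of` twice per object instead would reopen the window the hunk comment warns about, where the two passes can disagree about an object's children.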

Some files were not shown because too many files have changed in this diff.