59 Commits

Author SHA1 Message Date
Shahar Naveh
9701c46d86 Clippy rules (test related) (#7968) 2026-05-25 22:03:18 +09:00
Shahar Naveh
ddfcb25957 Clippy nursery lints (#7875) 2026-05-15 10:38:36 +00:00
Shahar Naveh
460b1d39ed Clippy warn uninlined_format_args & redundant_else (#7873) 2026-05-15 16:52:06 +09:00
Shahar Naveh
e8711edd2d Clippy warn on unnecessary wraps (#7869) 2026-05-13 23:14:33 +09:00
Shahar Naveh
4a46e84eb9 Add map_unwrap_or clippy rule (#7829) 2026-05-12 00:51:23 +09:00
fanninpm
e4d35b08ea Move dependencies to workspace for common crate (#7771)
* Add dependencies from `common` crate to workspace

* Declare dependencies as workspace = true
2026-05-05 14:02:57 +09:00
Noa
02932384d6 Use cfg_select in a bunch more places (#7740) 2026-05-04 20:26:16 +00:00
Changjoon
83002d7369 Tighten CPython parity for str format spec, %-format, and str() constructor (#7769)
Five related CPython parity gaps in `str` formatting and construction:

1. **`str(bytes, errors=...)` triggers decode mode.** Previously, only
   `encoding=` triggered decode; passing only `errors=` fell back to
   `repr()`. CPython's behavior: presence of `encoding` OR `errors`
   triggers decode mode (default UTF-8 when only `errors` is given).

2. **`'{...}'.format()` IndexError wording.** Generic Rust "tuple index
   out of range" replaced with CPython's "Replacement index N out of
   range for positional args tuple".

3. **`'{0:3.2s}'.format('abc')` → `'ab '`.** String format spec applied
   precision after width padding; CPython truncates BEFORE padding.
   Reorder the operations.

4. **`%x` / `%o` / `%X` / `%c` accept `__index__` objects.** Previously
   only `PyInt` downcast was attempted. Mirror CPython's
   PyNumber_Index dispatch via `try_index_opt`.

5. **`%d` / `%u` / `%i` error wording.** "a number is required" →
   "a real number is required" (matches CPython).

Also adds `not <type>` suffix to `%c` error messages so the type is
visible in TypeError text (matches CPython structure even without
fully-qualified names).
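
The behaviors listed above can be checked against a CPython reference interpreter. This is a minimal sketch, assuming a recent CPython 3; the `Indexable` class is hypothetical and exists only for illustration. Exact error wording varies between versions, so the sketch relies on exception types only.

```python
class Indexable:
    """Hypothetical type that is not an int but defines __index__."""

    def __index__(self):
        return 255


# Passing only `errors=` switches str() into decode mode (UTF-8 by default).
decoded = str(b"abc", errors="strict")

# Precision truncates first, then width pads: 'abc' -> 'ab' -> 'ab '.
truncated = "{0:3.2s}".format("abc")

# %x accepts any object that implements __index__.
hexed = "%x" % Indexable()

# An out-of-range positional index raises IndexError.
try:
    "{1}".format("only one arg")
    index_error = None
except IndexError as exc:
    index_error = exc

# %d with a non-number raises TypeError.
try:
    "%d" % "not a number"
    type_error = None
except TypeError as exc:
    type_error = exc

print(repr(decoded), repr(truncated), repr(hexed))
print(index_error)
print(type_error)
```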

Verified byte-identical with CPython 3.14.4 across 25+ probes covering
the format/spec/constructor combinations. Unmasks
`test_str.test_constructor_keyword_args` and
`test_str.test_constructor_defaults`. test_str/test_bytes/test_format/
test_codecs/test_io/test_unicode_identifiers — 1,429 tests pass, 0
regressions. All 188 `extra_tests/snippets/*.py` pass under the CI
feature set.

`test_str.test_format` and `test_str.test_formatting` markers retained:
`test_format` still trips on `'{0:08s}'.format('result')` (numeric
zero-pad treated as fill+left-align by CPython for str type — separate
format-spec parser concern). `test_formatting` still trips on
`%c` error message expecting fully qualified `module.qualname`
(RustPython returns the bare class name, a separate broader concern).
2026-05-04 10:28:43 +09:00
Shahar Naveh
ee5e9d0001 Enable some pedantic clippy lints (#7764) 2026-05-04 10:25:37 +09:00
Changjoon
c2910c06f3 Round float at the decimal level to match CPython's _Py_dg_dtoa (#7761)
* Round float at the decimal level to match CPython's _Py_dg_dtoa

CPython's `float.__round__` (Objects/floatobject.c) routes through
`_Py_dg_dtoa` and rounds at the decimal level. The previous
`round_float_digits` multiplied by 10**ndigits and rounded at the
IEEE 754 binary level, which diverges for values that aren't exactly
representable. For example, 2.675 stores as 2.67499...; dtoa correctly
rounds it down to 2.67, but `(2.675 * 100.0).round() / 100.0` lands on
2.68 because the multiplication produces a phantom 267.5 tie that
round-half-to-even snaps up.

Rust's `{:.*}` float formatting uses dtoa-style algorithms (Grisu3 +
Dragon4 fallback) and matches CPython's `_Py_dg_dtoa` byte-for-byte.
Replace the multiply-then-round step with `format!` + `parse` for
ndigits >= 0. The ndigits < 0 path is unchanged because dividing
typical inputs by 10**|ndigits| produces genuine ties rather than
synthesizing them.

Verified byte-identical with CPython 3.14.4 over a 108-case random
fuzz plus targeted half-tie probes. Unmasks
`test_float.RoundTestCase.test_matches_float_format` and
`test_previous_round_bugs`.
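
The difference between the two rounding strategies can be sketched in Python. The helper names `round_binary` and `round_decimal` are hypothetical; they imitate the multiply-then-round step and the format-then-parse replacement described above and are not the RustPython code.

```python
def round_binary(x: float, ndigits: int) -> float:
    """Multiply, round at the binary level, divide (the replaced approach)."""
    scale = 10.0 ** ndigits
    return round(x * scale) / scale


def round_decimal(x: float, ndigits: int) -> float:
    """Format to `ndigits` decimal places, then parse the text back."""
    return float(f"{x:.{ndigits}f}")


value = 2.675  # stored slightly below 2.675

print(round_binary(value, 2))   # 2.68
print(round_decimal(value, 2))  # 2.67
print(round(value, 2))          # 2.67
```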

* Use #[expect] with reason for float_cmp suppression

Co-authored-by: ShaharNaveh <50263213+ShaharNaveh@users.noreply.github.com>

---------

Co-authored-by: ShaharNaveh <50263213+ShaharNaveh@users.noreply.github.com>
2026-05-03 00:16:27 +09:00
Shahar Naveh
926d69b50a Add some clippy lints (#7755) 2026-05-02 11:14:23 +09:00
Changjoon
dd1cbac692 Match CPython's _float_div_mod, fixing divmod and % zero-handling (#7745)
float_ops::divmod, mod_, and floordiv each carried their own conversion
from Rust's dividend-sign `%` to CPython's divisor-sign convention. Both
divmod and mod_ mishandled the zero-remainder case where the dividend
is a non-zero exact multiple of the divisor (e.g. divmod(6.0, -3.0),
6.0 % -3.0): the sign-correction branch fired on a zero remainder and
produced (-3.0, -3.0) and -3.0 respectively, violating the magnitude
invariant 0 <= abs(r) < abs(b). divmod also leaked the wrong signed-
zero quotient when the true quotient was zero (divmod(-1.0, -2.0)
returned (-0.0, -1.0) instead of (+0.0, -1.0)).

These are independent bugs in two functions, but both come from the
same root cause: zero-remainder needs a separate path from the sign-
correction branch.

Mirror CPython's `_float_div_mod` (Objects/floatobject.c) by making
divmod the canonical implementation and turning mod_ and floordiv into
thin wrappers. divmod(a, b) == (a // b, a % b) now holds by
construction.
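
A Python sketch of the single canonical divmod, written after the structure of CPython's `_float_div_mod`. It assumes finite inputs (Python's `math.fmod` raises on infinities where C's `fmod` returns NaN); the function name `float_div_mod` is chosen here for illustration.

```python
import math


def float_div_mod(a: float, b: float) -> tuple:
    """Return (a // b, a % b) using the divisor-sign convention."""
    if b == 0.0:
        raise ZeroDivisionError("float divmod()")
    mod = math.fmod(a, b)  # remainder carries the sign of the dividend
    div = (a - mod) / b
    if mod:
        # Non-zero remainder: fix the sign so it matches the divisor.
        if (b < 0) != (mod < 0):
            mod += b
            div -= 1.0
    else:
        # Zero remainder takes its own path: only the sign of zero is set.
        mod = math.copysign(0.0, b)
    if div:
        floordiv = float(math.floor(div))
        if div - floordiv > 0.5:
            floordiv += 1.0
    else:
        # Zero quotient: take the sign of the true quotient.
        floordiv = math.copysign(0.0, a / b)
    return floordiv, mod


print(float_div_mod(6.0, -3.0))   # (-2.0, -0.0)
print(float_div_mod(-1.0, -2.0))  # (0.0, -1.0)
```

Because quotient and remainder come from one computation, `a // b` and `a % b` cannot disagree with `divmod(a, b)`.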

Closes #7722
2026-05-01 19:45:28 +09:00
Changjoon
68aece59c9 Reject format-string field index above Py_ssize_t::MAX (#7708)
CPython rejects digit-only format-string field names that overflow
Py_ssize_t at parse time with ValueError: Too many decimal digits in
format string (Python/string_parser.c::get_integer). RustPython's
FieldName::parse accepted any digit string usize::from_str could parse,
producing IndexError or KeyError at lookup instead.

Cap the parsed index at isize::MAX (Py_ssize_t::MAX on every platform)
inside FieldName::parse. Also reject digits-only strings whose value
overflows usize itself (caught when parse_usize returns None on an
all-digit input). A new FormatParseError::TooManyDecimalDigits maps to
the byte-identical CPython wording.
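
The reference behavior can be probed from Python. A minimal sketch, assuming CPython; `try_format` is a hypothetical helper that only reports which exception type surfaces.

```python
import sys


def try_format(template: str, *args) -> str:
    """Format and report either the result or the exception raised."""
    try:
        return template.format(*args)
    except (ValueError, IndexError, KeyError) as exc:
        return f"{type(exc).__name__}: {exc}"


# Field index one above Py_ssize_t::MAX: rejected while parsing.
huge_index_result = try_format("{" + str(sys.maxsize + 1) + "}", "x")

# Small but out-of-range index: parsed fine, fails at lookup.
small_index_result = try_format("{5}", "x")

print(huge_index_result)
print(small_index_result)
```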

Unmasks test_str.StrTest.test_format_huge_item_number.
2026-04-28 21:12:09 +09:00
Changjoon
b3d6d2f247 Reject format spec with width above i32::MAX (#7707)
CPython rejects format-spec widths that exceed Py_ssize_t::MAX with
ValueError: Too many decimal digits in format string. RustPython's
FormatSpec::_parse only capped precision (via parse_precision); width
was accepted up to usize::MAX, so values like sys.maxsize + 1 silently
produced an effectively-ignored width.

Reject any width above i32::MAX with FormatSpecError::DecimalDigitsTooMany,
matching the existing precision cap and producing the byte-identical
ValueError wording.
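
A short Python probe of the width check, assuming CPython as the reference. The variable names are illustrative only.

```python
import sys

# Width of sys.maxsize + 1 in the format spec: expected to be rejected.
template = "{:" + str(sys.maxsize + 1) + "f}"
try:
    template.format(2.34)
    outcome = "no error"
except ValueError as exc:
    outcome = f"ValueError: {exc}"
print(outcome)

# An ordinary width still pads as usual.
padded = "{:8.2f}".format(2.34)
print(repr(padded))  # '    2.34'
```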

Unmasks test_str.StrTest.test_format_huge_width.
2026-04-28 21:11:33 +09:00
Changjoon
9794ab7fdf Enforce int_max_str_digits on int-to-str conversions (#7688)
* Enforce int_max_str_digits on int-to-str conversions

The str-to-int direction already enforced sys.get_int_max_str_digits()
via bytes_to_int; the int-to-str direction did not. CPython 3.14 enforces
both under its integer string conversion length limitation (CVE-2020-10735).

Adds check_int_to_str_digits helper in builtins::int (bit-count fast path
+ digit upper-bound from log10(2)), wired into the four Python-level
entry points: repr, the str fast path in protocol::object, int.__format__
(decimal/n/empty spec only — binary bases x/o/b are exempt per CPython),
and the DecimalD/I/U branches of vm::cformat for both str % and bytes %.

Unmasks 8 expectedFailure tests across test_int (max_str_digits, DoS
prevention, int_from_other_bases — each mirrored in IntSubclass),
test_ast (test_repr_large_input_crash) and test_reprlib (test_numbers).
Boundary cases (4299/4300/4301 digits at limit=4300) match CPython 3.14.4.

* Skip int-to-str DoS test on platforms without time.get_clock_info

The test_denial_of_service_prevented_int_to_str regression test uses
support.Stopwatch, which calls time.get_clock_info('monotonic'). In
RustPython that function is gated to unix/windows targets only, so on
wasm32-wasip1 it surfaces as AttributeError and breaks the wasm-wasi CI.
Guard the test with skipUnless(hasattr(time, 'get_clock_info'), ...) so
it runs everywhere it can and is skipped on wasm.

Also narrow is_decimal_int_format to Number(Case::Lower): 'N' is rejected
by format_int as UnknownFormatCode, so excluding it preserves that error
path instead of intercepting it with the digit-limit check.

* Add TODO: RUSTPYTHON marker to skipUnless reason

scripts/update_lib uses TODO: RUSTPYTHON markers inside unittest
decorator reason strings to identify and migrate custom RustPython
patches across CPython library updates.

* Use expectedFailureIf for wasm get_clock_info gap

skipUnless silently hides the test forever; expectedFailureIf surfaces
unexpected success once RustPython implements time.get_clock_info on
wasm, prompting marker removal.
2026-04-27 21:41:40 +09:00
Changjoon
71380bead9 Fix process abort on large float format precision (#7633)
* Fix process abort on large float format precision

Formatting a float with large precision (>= ~65535) aborted the
interpreter instead of raising a Python exception. CPython handles
the same input by returning a clean string.

  # Before
  ./rustpython -c "print(f'{1.5:.1000000}')"
  thread 'main' panicked at crates/literal/src/float.rs:135:
  Formatting argument out of range   (exit 101, abort)

  # After
  ./rustpython -c "print(f'{1.5:.1000000}')"
  1.5

Root cause: Rust's `format!("{:.*}", n, x)` panics when `n`
exceeds the fmt runtime's internal precision limit. `format_fixed`
already caps `n` at u16::MAX, but `format_general` and
`format_exponent` (and the `%` branch in `crates/common/src/format.rs`)
passed user-supplied precision straight through to `format!`.

Fix:

  * Introduce `FMT_MAX_PRECISION` + `clamp_fmt_precision()` in
    crates/literal/src/float.rs. Cap is `u16::MAX - 1` because
    `{:.*e}` hits a second panic (`ndigits > 0` in core flt2dec)
    at exactly u16::MAX; the smaller value covers both paths.
  * Apply the helper to `format_fixed` (replacing the existing
    ad-hoc cap), `format_exponent` (entry), and `format_general`
    (three separate format! calls with saturating arithmetic on
    derived precision values).
  * Apply the helper in the `FormatType::Percentage` branch in
    crates/common/src/format.rs.

This is harmless for all normal inputs — f64 carries only ~17
significant digits, so precision beyond 65K is padding zeros at
best. Complex-number and old-style `%`-formatting paths transitively
benefit because they dispatch to the same library functions.

Verified:

  * cargo run -- -m test test_float test_fstring test_format:
    144 passed, 0 regressed.
  * extra_tests/snippets/builtin_format.py: all assertions pass,
    including 7 new regression cases covering e / E / g / G / f /
    % at precision 1_000_000.
  * Probed with 10 magnitude values (0, ±1.5, ±inf, nan, 1e-300,
    1e300, f64::MAX, 5e-324) x 4 format types = 40 combinations,
    plus precision 0/1/2 boundary, complex formatting, old-style
    `%` formatting, and combined specs (fill/align/sign/grouping/
    zero-pad). All return clean strings; no process abort.

* Address CodeRabbit review: split cap + drop redundant clamp

Two refinements after CodeRabbit review:

1. Drop the redundant `format!("{:.*}", precision + 1, base)` in
   `format_general`'s scientific branch. It was a no-op pre-fix
   (magnitude is `.abs()`-ed at the caller, so `base` has no sign
   and its length was exactly `precision + 1`), but after I added
   the cap it turned into an active truncate — dropping 1 char of
   precision at the cap boundary. Reuse `base` directly and extract
   `exp_precision` for reuse by `decimal_point_or_empty`.

2. Split the cap into two helpers.

   `FMT_MAX_PRECISION = u16::MAX`           — for plain `{:.*}` (format_fixed,
                                                 %-branch, format_general's
                                                 non-scientific branch).
   `FMT_MAX_EXP_PRECISION = u16::MAX - 1`   — for `{:.*e}` (format_exponent,
                                                 format_general's scientific
                                                 entry).

   The second value is one lower because `{:.*e}` trips an additional
   `ndigits > 0` assertion in `core::num::flt2dec` at exactly
   `u16::MAX`. The first commit used the tighter cap uniformly,
   which silently regressed `format_fixed` by 1 char at
   `precision == u16::MAX` (it previously capped at exactly that
   value). Two helpers restore byte-identical CPython parity for
   fixed / percent / general-non-scientific paths up through
   `precision == u16::MAX`.

Verification:
  * precision 5 .. 65534:  360 outputs byte-identical to CPython
                           across 8 magnitudes x 9 precisions x 5 types.
  * precision == 65535:    f / g / G / % now match CPython (0 diff).
                           e / E remain 1 char shorter — unavoidable
                           within the `u16::MAX - 1` exp cap.
  * precision > 65535:     output stops at cap; CPython emits full
                           padding — same design divergence as before.
  * No panic regression:   f-string default, e/E, g/G, %, f at
                           precision 1_000_000 all return cleanly.
  * Test suite:            test_float + test_fstring + test_format,
                           162 passed, 0 regressed.

* Fix ruff format: single-line precision clamp

* Address @youknowone review: byte-identical CPython parity at boundary

Per review comment on `extra_tests/snippets/builtin_format.py:209`:
the patch declares `FMT_MAX_PRECISION = u16::MAX`, so the tests must
cover 65535 and 65536 and demonstrate CPython parity at the boundary.

The previous version only avoided panic — at the cap it silently
truncated 1 char short of CPython for e / E, and thousands of chars
short for f / % at precision beyond the cap. This commit restores
byte-identical CPython output at every precision up to the format-
spec parser's own `i32::MAX` ceiling.

Fix: pad the Rust-format result with '0's up to the user-requested
precision.

Why this is correct, not a workaround: IEEE 754 double has at most
~767 significant decimal digits; past that, every digit is
deterministically '0' in both CPython and the native Rust output.
Our cap (65534 for exp, 65535 for plain) sits far above 767, so
appending zeros reconstructs precisely what CPython would have
produced. Verified on hard inputs: `1e-100`, `5e-324` (subnormal
boundary), `f64::MAX`, mixed magnitudes — the last 100 chars of
Rust-format output at precision 65534 are all '0' for every case.
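
The cap-then-pad idea can be imitated in Python and compared with CPython's own uncapped output. This is a sketch under stated assumptions: `format_fixed_capped` is a hypothetical stand-in for the fixed-point path, and the cap constant reuses the `u16::MAX` value named in this message.

```python
import math

FMT_MAX_PRECISION = 65535  # u16::MAX, the cap named above


def format_fixed_capped(x: float, precision: int) -> str:
    """Format at a capped precision, then append '0' up to the request."""
    if not math.isfinite(x):
        return f"{x:f}"  # inf and nan carry no fractional digits to pad
    capped = min(precision, FMT_MAX_PRECISION)
    text = f"{x:.{capped}f}"
    return text + "0" * (precision - capped)


comparisons = {}
for value in (1.5, 1e-100, 5e-324):
    padded = format_fixed_capped(value, 70_000)
    direct = f"{value:.70000f}"  # CPython formats the full precision itself
    comparisons[value] = padded == direct
    print(value, len(padded), comparisons[value])
```

The padding is safe only because every digit past the cap is already zero for a finite double, which is why the non-finite case returns early.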

Changes:

  * `format_fixed`: after format!(), extend with (precision - capped)
    '0' chars before appending the optional decimal point.
  * `format_exponent`: same, applied to the parsed mantissa before
    reassembling with the exponent marker.
  * `FormatType::Percentage` branch: same. Also fixed a bug the
    boundary audit surfaced: the finite-input overflow guard used
    `return Ok("inf%")`, which bypasses the outer sign handler.
    Changed to a match-arm value so `format_sign_and_align` still
    runs and produces "-inf%" for `-f64::MAX`, matching CPython.

Verification:

  * 7 magnitudes × 5 precisions × 6 format types = 210 comparisons
    against CPython at precisions {65534, 65535, 65536, 100000,
    200000}. All 210 byte-identical.
  * Gap audit (complex formatting, old-style % formatting, negative
    magnitudes, -0.0, combined specs with fill / sign / alternate /
    grouping) at boundary precisions. All but 20 byte-identical.
    The 20 remaining diffs all stem from a pre-existing
    complex-imaginary-part repr bug (`1e100j` expands to 100 '0's
    in RustPython vs CPython's `1e+100j`) which reproduces on
    upstream main without any part of this patch and is out of
    scope here.
  * `cargo run -- -m test test_float test_fstring test_format`:
    162 passed, 0 regressed.
  * `extra_tests/snippets/builtin_format.py` now pins exact
    expected strings at 65534 / 65535 / 65536 / 1_000_000 for
    every format type, plus the `f64::MAX × 100 → 'inf%'`
    overflow case.
  * `cargo fmt --check`: pass.

* Clarify boundary test labels + add past-cap depth assertions

Rename the boundary-test section so the three precision points per
format type are labeled below / at / past the cap inline, making the
"past MAX_PRECISION" unhappy-case coverage explicit. Add len-based
assertions at precision 1_000_000 for f, e, and % to exercise the
cap-then-pad path at a depth far beyond the boundary.
2026-04-23 15:25:52 +09:00
Shahar Naveh
dc65255fd2 Use cfg_select! (#7636) 2026-04-21 17:35:26 +00:00
Copilot
d5921d16af Create rustpython-host-env (#7582)
* Create rustpython-host-env crate; move host abstractions out of common

Move os, crt_fd, fileutils, windows, macros modules from
rustpython-common into the new rustpython-host-env crate.
This isolates host OS API wrappers behind a crate boundary
with zero Python runtime dependency.

- Add crates/host_env to workspace
- Drop nix, windows-sys, widestring deps from common
- Wire vm and stdlib to depend on rustpython-host-env
- Migrate all imports from common::{os,crt_fd,fileutils,windows}
  to rustpython_host_env::

* refactor: extract host helpers

Agent-Logs-Url: https://github.com/RustPython/RustPython/sessions/48d1e64d-37ce-409f-b511-8e61a349665c

Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>

* lint: enforce direct host API boundaries

Agent-Logs-Url: https://github.com/RustPython/RustPython/sessions/97225fb7-7b3d-4197-a77c-eb44aead5b13

Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>

* refactor: extract remaining host env helpers

Agent-Logs-Url: https://github.com/RustPython/RustPython/sessions/d96f57e1-b196-4460-9983-97d5ff118835

Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>

* fix: clean extracted host env follow-up

Agent-Logs-Url: https://github.com/RustPython/RustPython/sessions/d96f57e1-b196-4460-9983-97d5ff118835

Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>

* fix: document env mutation safety

Agent-Logs-Url: https://github.com/RustPython/RustPython/sessions/d96f57e1-b196-4460-9983-97d5ff118835

Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>

* refactor: split host fs helpers from fileutils

Agent-Logs-Url: https://github.com/RustPython/RustPython/sessions/c57424c5-0e1d-490a-82b3-2d2f6c8cf2cd

Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>

* fix: resolve latest host env ci regressions

Agent-Logs-Url: https://github.com/RustPython/RustPython/sessions/899eb717-ebc6-4a4a-870c-2a15c5f33e02

Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>

* fix: resolve remaining windows clippy host fs calls

Agent-Logs-Url: https://github.com/RustPython/RustPython/sessions/12f32740-8173-4b10-a1d6-00b29e90a8ec

Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>

* host_env

---------

Co-authored-by: Jeong, YunWon <jeong@youknowone.org>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>
2026-04-21 00:13:25 +09:00
Jeong, YunWon
45d81296e4 More fork safety (#7380)
* apply more allow_threads

* Simplify STW thread state transitions

- Fix park_detached_threads: successful CAS no longer sets
  all_suspended=false, avoiding unnecessary polling rounds
- Replace park_timeout(50µs) with park() in wait_while_suspended
- Remove redundant self-suspension in attach_thread and detach_thread;
  the STW controller handles DETACHED→SUSPENDED via park_detached_threads
- Add double-check under mutex before condvar wait to prevent lost wakes
- Remove dead stats_detach_wait_yields field and add_detach_wait_yields

* Representable for ThreadHandle

* Set ThreadHandle state to Running in parent thread after spawn

Like CPython's ThreadHandle_start, set RUNNING state in the parent
thread immediately after spawn() succeeds, rather than in the child.
This eliminates a race where join() could see Starting state if called
before the child thread executes.

Also reverts the macOS skip for test_start_new_thread_failed since the
root cause is fixed.

* Add debug_assert for thread state in start_the_world

* Unskip now-passing test_get_event_loop_thread and test_start_new_thread_at_finalization

* Wrap IO locks and file ops in allow_threads

Add lock_wrapped to ThreadMutex for detaching thread state
while waiting on contended locks. Use it for buffered and
text IO locks. Wrap FileIO read/write in allow_threads via
crt_fd to prevent STW hangs on blocking file operations.

* Use std::sync for thread start/ready events

Replace parking_lot Mutex/Condvar with std::sync (pthread-based)
for started_event and handle_ready_event. This prevents hangs
in forked children where parking_lot's global HASHTABLE may be
corrupted.
2026-03-08 18:06:23 +09:00
Jeong, YunWon
2bb9173caf Suspend Python threads before fork() (#7364)
* Suspend Python threads before fork()

Add stop-the-world thread suspension around fork() to prevent
deadlocks from locks held by dead parent threads in the child.

- Thread states: DETACHED / ATTACHED / SUSPENDED with atomic CAS
  transitions matching _PyThreadState_{Attach,Detach,Suspend}
- stop_the_world / start_the_world: park all non-requester threads
  before fork, resume after (parent) or reset (child)
- allow_threads (Py_BEGIN/END_ALLOW_THREADS): detach around blocking
  syscalls (os.read/write, waitpid, Lock.acquire, time.sleep) so
  stop_the_world can force-park via CAS
- Acquire/release import lock around fork lifecycle
- zero_reinit_after_fork: generic lock reset for parking_lot types
- gc_clear_raw: detach dict instead of clearing entries
- Lock-free double-check for descriptor cache reads (no read-side
  seqlock); write-side seqlock retained for writer serialization
- fork() returns PyResult, checks PythonFinalizationError, calls
  sys.audit
2026-03-07 20:20:16 +09:00
Jeong, YunWon
3b91466f62 Implement locale-aware 'n' format specifier for int, float, complex (#7350)
* Implement locale-aware 'n' format specifier for int, float, complex

Add LocaleInfo struct and locale-aware formatting methods to FormatSpec.
The 'n' format type now reads thousands_sep, decimal_point, and grouping
from C localeconv() and applies proper locale-based number grouping.
Remove @unittest.skip from test_format.test_locale.
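
How `grouping` and `thousands_sep` combine can be sketched in Python. `group_digits` is a hypothetical helper written for this example, following the `localeconv()` convention that `0` repeats the previous group size and `CHAR_MAX` stops grouping; it is not the `LocaleInfo` code.

```python
import locale


def group_digits(digits: str, thousands_sep: str, grouping: list) -> str:
    """Insert `thousands_sep` into a digit string per localeconv() grouping."""
    if not thousands_sep or not grouping:
        return digits
    groups = []
    remaining = digits
    size = 0
    index = 0
    while remaining:
        if index < len(grouping):
            entry = grouping[index]
            index += 1
            if entry == locale.CHAR_MAX:
                break  # no further grouping
            if entry != 0:
                size = entry  # 0 means "repeat the previous size"
        if size <= 0 or len(remaining) <= size:
            break
        groups.append(remaining[-size:])
        remaining = remaining[:-size]
    groups.append(remaining)
    return thousands_sep.join(reversed(groups))


print(group_digits("1234567", ",", [3, 0]))     # 1,234,567
print(group_digits("1234567", ",", [3, 2, 0]))  # 12,34,567

# With the active locale, the 'n' format type applies the same data.
conv = locale.localeconv()
print(repr(conv["thousands_sep"]), conv["grouping"], format(1234567, "n"))
```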

* Fix complex 'n' format and remove locale expectedFailure markers

Rewrite format_complex_locale to reuse format_complex_re_im, matching
formatter_unicode.c: add_parens=0 and skip_re=0 for 'n' type.
Remove @expectedFailure from test_float__format__locale and
test_int__format__locale in test_types.py.

* Auto-format: cargo fmt --all

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-03-06 01:09:43 +09:00
Jeong, YunWon
375b5472ed Fix thread-safety in GC, type cache, and instruction cache (#7355)
* Fix thread-safety in GC, type cache, and instruction cache

GC / refcount:
- Add safe_inc() check for strong()==0 in RefCount
- Add try_to_owned() to PyObject for atomic refcount acquire
- Replace strong_count()+to_owned() with try_to_owned() in GC
  collection and weakref callback paths to prevent TOCTOU races

Type cache:
- Add proper SeqLock (sequence counter) to TypeCacheEntry
- Readers spin-wait on odd sequence, validate after read
- Writers bracket updates with begin_write/end_write
- Use try_to_owned + pointer revalidation on read path
- Call modified() BEFORE attribute modification in set_attr

Instruction cache:
- Add pointer_cache (AtomicUsize array) to CodeUnits for
  single atomic pointer load/store (prevents torn reads)
- Add try_read_cached_descriptor with try_to_owned + pointer
  and version revalidation after increment
- Add write_cached_descriptor with version-bracketed writes

RLock:
- Fix release() to check is_owned_by_current_thread
- Add _release_save/_acquire_restore methods

* Fix RLock _acquire_restore tuple handling and unxfail threading test

* Align type cache seqlock writer protocol with CPython

* RLock: use single parking_lot level, track recursion manually

Instead of calling lock()/unlock() N times for recursion depth N,
keep parking_lot at 1 level and manage the count ourselves.
This makes acquire/release O(1) and matches CPython's
_PyRecursiveMutex approach (lock once + set level directly).
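
The "lock once, count manually" scheme can be sketched in Python on top of a plain non-recursive lock. `RecursiveLock` is a hypothetical class for illustration and is neither RustPython's RLock nor CPython's.

```python
import threading


class RecursiveLock:
    """Recursive lock: the inner lock is held once, depth is a counter."""

    def __init__(self):
        self._inner = threading.Lock()
        self._owner = None
        self._count = 0

    def acquire(self):
        me = threading.get_ident()
        if self._owner == me:
            self._count += 1  # re-entry: O(1), inner lock untouched
            return True
        self._inner.acquire()
        self._owner = me
        self._count = 1
        return True

    def release(self):
        if self._owner != threading.get_ident():
            raise RuntimeError("cannot release un-acquired lock")
        self._count -= 1
        if self._count == 0:
            self._owner = None
            self._inner.release()

    def _release_save(self):
        """Fully release regardless of depth and return the saved state."""
        state = (self._count, self._owner)
        self._count = 0
        self._owner = None
        self._inner.release()
        return state

    def _acquire_restore(self, state):
        """Re-acquire once and set the recursion level directly."""
        self._inner.acquire()
        self._count, self._owner = state

    def depth(self):
        return self._count

    def locked(self):
        return self._inner.locked()


lock = RecursiveLock()
lock.acquire()
lock.acquire()
saved = lock._release_save()
print(saved[0], lock.locked())  # 2 False
lock._acquire_restore(saved)
print(lock.depth(), lock.locked())  # 2 True
lock.release()
lock.release()
```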

* Add try_to_owned_from_ptr to avoid &PyObject on stale ptrs

Use addr_of! to access ref_count directly from a raw pointer
without forming &PyObject first. Applied in type cache and
instruction cache hit paths where the pointer may be stale.

* Fix CI: spelling typo and xfail flaky test_thread_safety

- Fix "minimising" -> "minimizing" for cspell
- xfail test_thread_safety: dict iteration races with
  concurrent GC mutations in _finalizer_registry
2026-03-05 20:33:14 +09:00
Jeong, YunWon
5c29074596 Replace GC tracking HashSet with intrusive linked list (#7328)
* Replace GC tracking HashSet with intrusive linked list

Replace per-generation HashSet<GcObjectPtr> with intrusive doubly-linked
lists for GC object tracking. Each PyInner now carries gc_pointers
(prev/next) and gc_generation fields, enabling O(1) track/untrack
without hashing.
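
The O(1) track/untrack property of an intrusive list can be illustrated in Python. The classes below are hypothetical: each node carries its own prev/next links and generation tag, loosely mirroring the `gc_pointers` and `gc_generation` fields, with no locking or raw pointers.

```python
class GcNode:
    """Tracked object that stores its own list links."""

    def __init__(self, name):
        self.name = name
        self.prev = None
        self.next = None
        self.generation = None  # which list currently owns the node


class IntrusiveList:
    def __init__(self, generation):
        self.generation = generation
        self.head = None
        self.tail = None

    def push_back(self, node):
        """Track a node: O(1), no hashing."""
        node.prev, node.next = self.tail, None
        if self.tail is None:
            self.head = node
        else:
            self.tail.next = node
        self.tail = node
        node.generation = self.generation

    def remove(self, node):
        """Untrack a node: O(1), only its neighbours are touched."""
        if node.prev is None:
            self.head = node.next
        else:
            node.prev.next = node.next
        if node.next is None:
            self.tail = node.prev
        else:
            node.next.prev = node.prev
        node.prev = node.next = None
        node.generation = None

    def names(self):
        out = []
        current = self.head
        while current is not None:
            out.append(current.name)
            current = current.next
        return out


gen0, gen1 = IntrusiveList(0), IntrusiveList(1)
a, b, c = GcNode("a"), GcNode("b"), GcNode("c")
for node in (a, b, c):
    gen0.push_back(node)

gen0.remove(b)       # untrack from the middle
gen0.remove(a)       # promote a survivor:
gen1.push_back(a)    # unlink from gen 0, link into gen 1

print(gen0.names(), gen1.names(), a.generation)  # ['c'] ['a'] 1
```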

- Add gc_pointers (Pointers<PyObject>) and gc_generation (u8) to PyInner
- Implement GcLink trait for intrusive list integration
- Replace generation_objects/permanent_objects/tracked_objects/finalized_objects
  HashSets with generation_lists/permanent_list LinkedLists
- Use GcBits::FINALIZED flag instead of finalized_objects HashSet
- Change default_dealloc to untrack directly before memory free
- Hold both src/dst list locks in promote_survivors to prevent race
  conditions with concurrent untrack_object calls
- Add pop_front to LinkedList for freeze/unfreeze operations

Move unreachable_refs creation before drop(gen_locks) so that raw
pointer dereferences and refcount increments happen while generation
list read locks are held. Previously, after dropping read locks, other
threads could untrack and free objects, causing use-after-free when
creating strong references from the raw GcPtr pointers.
2026-03-05 00:42:18 +09:00
Jeong, YunWon
68ad332833 Remove Frame mutex and use DataStack bump allocator for LocalsPlus (#7333)
* Remove PyMutex<FrameState> from Frame, use UnsafeCell fields directly

Move stack, cells_frees, prev_line out of the mutex-protected FrameState
into Frame as FrameUnsafeCell fields. This eliminates mutex lock/unlock
overhead on every frame execution (with_exec).

Safety relies on the same single-threaded execution guarantee that
FastLocals already uses.

* Add thread-local DataStack for bump-allocating frame data

Introduce DataStack with linked chunks (16KB initial, doubling) and
push/pop bump allocation. Add datastack field to VirtualMachine.
Not yet wired to frame creation.
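
A chunked bump allocator with LIFO push/pop can be sketched in Python. This `DataStack` class is a hypothetical model using `bytearray` chunks and index handles instead of raw pointers; only the 16KB initial size and the doubling policy come from the description above.

```python
class DataStack:
    """Chunked bump allocator; a model, not the RustPython implementation."""

    def __init__(self, initial_size=16 * 1024):
        self._chunks = [bytearray(initial_size)]
        self._tops = [0]  # next free offset within each chunk

    def push(self, size):
        """Reserve `size` bytes; return a (chunk_index, offset) handle."""
        if self._tops[-1] + size > len(self._chunks[-1]):
            # Current chunk is full: link a new one, at least double in size.
            new_size = max(2 * len(self._chunks[-1]), size)
            self._chunks.append(bytearray(new_size))
            self._tops.append(0)
        offset = self._tops[-1]
        self._tops[-1] = offset + size
        return len(self._chunks) - 1, offset

    def pop(self, handle):
        """Release everything allocated at or after `handle` (LIFO order)."""
        chunk_index, offset = handle
        del self._chunks[chunk_index + 1:]
        del self._tops[chunk_index + 1:]
        self._tops[chunk_index] = offset

    def chunk_sizes(self):
        return [len(chunk) for chunk in self._chunks]


stack = DataStack()
frame_a = stack.push(10_000)
frame_b = stack.push(10_000)  # does not fit in the first chunk
print(frame_a, frame_b, stack.chunk_sizes())  # (0, 0) (1, 0) [16384, 32768]
sizes_while_nested = stack.chunk_sizes()
stack.pop(frame_b)
stack.pop(frame_a)
print(stack.chunk_sizes())  # [16384]
```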

* Unify FastLocals and BoxVec stack into LocalsPlus

Replace separate FastLocals (Box<[Option<PyObjectRef>]>) and
BoxVec<Option<PyStackRef>> with a single LocalsPlus struct that
stores both in a contiguous Box<[usize]> array. The first
nlocalsplus slots are fastlocals and the rest is the evaluation
stack. Typed access is provided through transmute-based methods.

Remove BoxVec import from frame.rs.

* Use DataStack for LocalsPlus in non-generator function calls

Normal function calls now bump-allocate LocalsPlus data from the
per-thread DataStack instead of a separate heap allocation.
Generator/coroutine frames continue using heap allocation since
they outlive the call.

On frame exit, data is copied to the heap (materialize_to_heap)
to preserve locals for tracebacks, then the DataStack is popped.

VirtualMachine.datastack is wrapped in UnsafeCell for interior
mutability (safe because frame allocation is single-threaded LIFO).

* Fix clippy: import Layout from core::alloc instead of alloc::alloc

* Fix vectorcall compatibility with LocalsPlus API

Update vectorcall dispatch functions to use localsplus stack
accessors instead of direct stack field access. Add
stack_truncate method to LocalsPlus. Update vectorcall fast
path in function.rs to use datastack and fastlocals_mut().

* Add datastack, nlocalsplus, ncells, tstate to cspell dictionary

* Fix DataStack pop() for non-monotonic allocation addresses

Check both bounds of the current chunk when determining if a
pop base is in the current chunk. The previous check (base >=
chunk_start) fails on Windows where newer chunks may be
allocated at lower addresses than older ones.

* Fix stale comments: release_datastack -> materialize_localsplus

* Fix non-threading mode for parallel test execution

Two fixes for Cell-based types used in static items under non-threading
mode, which cause data races when Rust test runner uses parallel threads:

1. LazyLock: use std::sync::LazyLock when std is available instead of
   wrapping core::cell::LazyCell with a false `unsafe impl Sync`.
   The LazyCell wrapper is kept only for no-std (truly single-threaded).

2. gc_state: use static_cell! (thread-local in non-threading mode)
   instead of OnceLock, so each thread gets its own GcState with
   Cell-based PyRwLock/PyMutex that are not accessed concurrently.

* Fix CallAllocAndEnterInit to use LocalsPlus stack API

* Use checked arithmetic in LocalsPlus and DataStack allocators

* Address code review: checked arithmetic, threading feature deps, Send gate

- Use checked arithmetic for nlocalsplus in Frame::new
- Add "std" to threading feature dependencies in rustpython-common
- Gate GcState Send impl with #[cfg(feature = "threading")]

* Clean up comments: remove redundant/stale remarks, fix CPython references
2026-03-04 23:27:36 +09:00
Jeong, YunWon
c55a9ff728 Reinit IO buffer locks after fork to prevent deadlocks (#7339)
* Reinit IO buffer locks after fork to prevent deadlocks

BufferedReader/Writer/TextIOWrapper use PyThreadMutex internally.
If a parent thread held one of these locks during fork(), the child
would deadlock on any IO operation.

Add reinit_after_fork() to RawThreadMutex and call it on sys.stdin/
stdout/stderr in the child process fork handler, analogous to
CPython's _PyIO_Reinit().

* Address review: unsafe fn + decoder lock reinit

- Mark reinit_std_streams_after_fork as unsafe fn to encode
  fork-only precondition, update call site in posix.rs
- Reinit IncrementalNewlineDecoder's PyThreadMutex via
  TextIOWrapper's decoder field to prevent child deadlocks

* Auto-format: cargo fmt --all

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-03-04 19:54:10 +09:00
Jeong, YunWon
7eb18210ca reinit-after-fork (#7321)
* Use Mutex::raw() accessor in reinit_mutex_after_fork

Use lock_api's Mutex::raw() to access the underlying RawMutex
instead of casting &PyMutex<T> directly. This avoids layout
assumptions about lock_api::Mutex<R, T> field ordering.

* Replace force_unlock with reinit_*_after_fork

Replace all force_unlock() + try_lock() patterns with zero-based
reinit that bypasses parking_lot internals entirely. After fork(),
the child is single-threaded so reinited locks won't contend.

Add reinit_rwlock_after_fork to common::lock alongside the existing
reinit_mutex_after_fork. Replace force_unlock_after_fork methods in
codecs, intern, and gc_state with reinit_after_fork equivalents.

This fixes after_fork_child silently dropping thread handles when
try_lock() failed on per-handle Arc<Mutex> locks.
2026-03-03 11:25:47 +09:00
Jeong, YunWon
9a12a8d532 Fix _at_fork_reinit to write INIT directly instead of calling unlock() (#7312)
* Fix _at_fork_reinit to write INIT directly instead of calling unlock()

unlock() goes through unlock_slow() which accesses parking_lot's
global hash table to unpark waiters. After fork(), this hash table
contains stale entries from dead parent threads, making unlock_slow()
unsafe. Writing INIT directly bypasses parking_lot internals entirely.
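
A minimal sketch of the idea above, assuming a toy one-byte lock rather than parking_lot's real layout. `ToyRawMutex`, `UNLOCKED`, and `LOCKED` are hypothetical names for illustration; only the "store the initial state instead of calling unlock()" step mirrors the commit.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

// Hypothetical state values; the real lock's INIT constant is not modelled here.
const UNLOCKED: u8 = 0;
const LOCKED: u8 = 1;

/// Toy stand-in for a raw mutex whose whole state is one atomic byte.
pub struct ToyRawMutex {
    state: AtomicU8,
}

impl ToyRawMutex {
    pub const fn new() -> Self {
        Self { state: AtomicU8::new(UNLOCKED) }
    }

    /// Try to take the lock without blocking.
    pub fn try_lock(&self) -> bool {
        self.state
            .compare_exchange(UNLOCKED, LOCKED, Ordering::Acquire, Ordering::Relaxed)
            .is_ok()
    }

    pub fn is_locked(&self) -> bool {
        self.state.load(Ordering::Relaxed) != UNLOCKED
    }

    /// Overwrite the state with its initial value, touching no waiter queue.
    ///
    /// # Safety
    /// The caller must be the only thread that can observe this lock,
    /// as is the case in a child process immediately after fork().
    pub unsafe fn reinit_after_fork(&self) {
        self.state.store(UNLOCKED, Ordering::Relaxed);
    }
}
```

The reinit path never consults any shared bookkeeping, so stale entries left by threads that did not survive the fork cannot be reached from it.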

* Add import lock (IMP_LOCK) reinit after fork

The import lock is a ReentrantMutex that was never reinit'd after
fork(). If a parent thread held it during fork, the child would
deadlock on any import. Only reset if the owner is a dead thread;
if the surviving thread held it, normal unlock still works.
2026-03-03 09:08:12 +09:00
Jeong, YunWon
2b084457ef Optimize fast_locals and atomic ordering (#7289)
* Relax RefCount atomic ordering from SeqCst to Arc pattern

- inc/inc_by/get: SeqCst → Relaxed
- safe_inc CAS: SeqCst → Relaxed + compare_exchange_weak
- dec: SeqCst → Release + Acquire fence when count drops to 0
- leak CAS: SeqCst → AcqRel/Relaxed + compare_exchange_weak
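
The orderings listed above can be sketched as follows. This is an illustrative counter, not the project's actual `RefCount`: the struct layout and method bodies are assumptions, and only the choice of orderings follows the list.

```rust
use std::sync::atomic::{fence, AtomicUsize, Ordering};

/// Hypothetical reference counter using the orderings named in the commit.
pub struct RefCount {
    strong: AtomicUsize,
}

impl RefCount {
    pub const fn new() -> Self {
        Self { strong: AtomicUsize::new(1) }
    }

    /// Reading the count needs no synchronization with other memory.
    pub fn get(&self) -> usize {
        self.strong.load(Ordering::Relaxed)
    }

    /// A new reference is always created from an existing one, so Relaxed suffices.
    pub fn inc(&self) {
        self.strong.fetch_add(1, Ordering::Relaxed);
    }

    /// Increment only if the count has not already reached zero.
    #[must_use]
    pub fn safe_inc(&self) -> bool {
        let mut cur = self.strong.load(Ordering::Relaxed);
        loop {
            if cur == 0 {
                return false;
            }
            // compare_exchange_weak may fail spuriously, hence the retry loop.
            match self.strong.compare_exchange_weak(
                cur,
                cur + 1,
                Ordering::Relaxed,
                Ordering::Relaxed,
            ) {
                Ok(_) => return true,
                Err(actual) => cur = actual,
            }
        }
    }

    /// Returns true when this call dropped the last reference.
    #[must_use]
    pub fn dec(&self) -> bool {
        // Release publishes this thread's writes to whoever frees the object.
        if self.strong.fetch_sub(1, Ordering::Release) != 1 {
            return false;
        }
        // Acquire fence: see every other thread's writes before destruction.
        fence(Ordering::Acquire);
        true
    }
}
```

The Release decrement paired with an Acquire fence on the final drop is the same arrangement `std::sync::Arc` uses for its strong count.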

* Reuse existing Vec via prepend_arg in execute_call

Replace vec![self_val] + extend(args.args) with
FuncArgs::prepend_arg() to avoid a second heap allocation
on every method call.

* Skip downcast_ref checks in invoke when tracing is disabled

Early return in PyCallable::invoke() when use_tracing is false,
avoiding two downcast_ref type checks on every function call.

* Replace fastlocals PyMutex with UnsafeCell-based FastLocals

Eliminate per-instruction mutex lock/unlock overhead for local
variable access. FastLocals uses UnsafeCell with safety guaranteed
by the frame's state mutex and sequential same-thread execution.

Affects 14+ lock() call sites in hot instruction paths (LoadFast,
StoreFast, DeleteFast, and their paired variants).

* Auto-format: cargo fmt --all

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-03-01 20:53:12 +09:00
Jeong, YunWon
93099e35e7 Remove PyStr::as_str, use as_wtf8/PyUtf8Str instead (#7218)
- Remove as_str() from PyStr/Py<PyStr> (was panicking on surrogates)
- Add Wtf8Concat trait and concat! macro for WTF-8 formatting
- Add impl From<&str> for &Wtf8 conversion
- Add AsPyStr/DictKey impls for PyUtf8Str types
- Migrate all call sites to as_wtf8(), to_str(), or PyUtf8Str
- Fix exception message APIs to accept Wtf8Buf
- Deduplicate inner-scope imports across modules
2026-02-27 04:20:11 +09:00
Jeong, YunWon
c98215ab3a Clear frame locals and stack on generator close + Add dir_fd support for rmdir, remove/unlink, scandir (#7222)
* Clear frame locals and stack on generator close

Add Frame::clear_locals_and_stack() to release references held by
closed generators/coroutines, matching _PyFrame_ClearLocals behavior.
Call it from Coro::close() after marking the coroutine as closed.

Update test_generators.py expectedFailure markers accordingly.

* Add dir_fd support for rmdir, remove/unlink, scandir

- rmdir: use unlinkat(fd, path, AT_REMOVEDIR) when dir_fd given
- remove/unlink: use unlinkat(fd, path, 0) when dir_fd given
- scandir: accept fd via fdopendir, add ScandirIteratorFd
- listdir: rewrite fd path to use raw readdir instead of nix::dir::Dir
- DirEntry: add d_type and dir_fd fields for fd-based scandir
- Update supports_fd/supports_dir_fd entries accordingly

* cells_free
2026-02-27 01:58:33 +09:00
Jeong, YunWon
6950baf687 more algorithm-independent GC infra (#7194)
* mark polluting tests

* Make GC infra independent of EBR

* trashcan

* add overflow guard to inc(), #[must_use] on dec()/safe_inc(), trashcan debug_assert, weakref generic re-check
2026-02-22 21:31:42 +09:00
Jeong, YunWon
b87386f4fc Update test_fstring from v3.14.3 and impl more (#7164)
* Update test_fstring from v3.14.3

* Fix 6 test_fstring expectedFailure tests

- Add Unknown(char) variant to FormatType for proper error messages
  on unrecognized format codes (test_errors)
- Strip comments from f-string debug text in compile.rs
  (test_debug_conversion)
- Map ruff SyntaxError messages to match CPython in vm_new.rs:
  InvalidDeleteTarget, LineContinuationError, UnclosedStringError,
  OtherError(bytes mixing), OtherError(keyword identifier),
  FStringError(UnterminatedString/UnterminatedTripleQuotedString),
  and backtick-to-quote replacement for FStringError messages

* Fix clippy::sliced_string_as_bytes warning

---------

Co-authored-by: CPython Developers <>
2026-02-17 16:49:59 +09:00
Jeong, YunWon
ccd3d4f964 Replace std::sync::LazyLock with common::lock::LazyLock (#7079) 2026-02-11 16:09:42 +09:00
Jeong, YunWon
c06cf56c60 Replace once_cell with std::sync::OnceLock/core::cell::OnceCell (#7077)
* Replace `once_cell` with `std::sync::OnceLock`/`core::cell::OnceCell`

- Replace `once_cell::sync::{Lazy, OnceCell}` with
  `std::sync::{LazyLock, OnceLock}`
- Replace `once_cell::unsync::{Lazy, OnceCell}` with
  `core::cell::{LazyCell, OnceCell}`
- Inline `get_or_try_init` at call sites (unstable in std as of 1.93)
- Replace `OnceCell::with_value()` with `OnceCell::from()` in codecs.rs
- Remove `once_cell` direct dependency from common and vm crates
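
A small sketch of what the replacements above look like at a call site, using only standard-library types. `GREETING`, `PORT`, `port_from`, and `prefilled` are hypothetical names; the inlined fallible initialization is one possible shape, not necessarily the one used in the repository.

```rust
use std::cell::OnceCell;
use std::num::ParseIntError;
use std::sync::{LazyLock, OnceLock};

// once_cell::sync::Lazy -> std::sync::LazyLock
static GREETING: LazyLock<String> = LazyLock::new(|| "hello".to_owned());

// once_cell::sync::OnceCell -> std::sync::OnceLock
static PORT: OnceLock<u16> = OnceLock::new();

/// Inlined stand-in for get_or_try_init: do the fallible work first,
/// then store the value only on success.
pub fn port_from(text: &str) -> Result<u16, ParseIntError> {
    if let Some(port) = PORT.get() {
        return Ok(*port);
    }
    let parsed = text.parse::<u16>()?;
    // If another thread won the race, its value is returned instead.
    Ok(*PORT.get_or_init(|| parsed))
}

/// OnceCell::with_value(v) -> OnceCell::from(v)
pub fn prefilled() -> OnceCell<u32> {
    OnceCell::from(7)
}
```

One consequence of inlining this way is that two racing threads may both run the fallible computation; only one result is kept.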

* Auto-format: cargo fmt --all

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-02-11 11:08:16 +09:00
Jeong, YunWon
570d50c67f no_std for common, pylib, codegen (#7056)
* `std` feature for common

- Gate OS-dependent modules behind `#[cfg(feature = "std")]`
- Replace `std::f64` with `core::f64` in float_ops
- Replace `std::process::abort` with panic in refcount
- Remove `thread_local` from levenshtein (stack buffer)
- Split static_cell into threading/non_threading/no_std

* `std` for codegen

* `no_std` for pylib
2026-02-09 23:28:20 +09:00
Jeong, YunWon
5dabad6702 reason inside #[allow] (#7049) 2026-02-08 13:11:50 +00:00
Jeong, YunWon
07fc6ee3c7 no_std clippy (#7043) 2026-02-08 16:49:18 +09:00
Jeong, YunWon
afea16569b Fix test_io expectedFailures 2026-02-06 00:15:18 +09:00
Noa
5bf13e8642 Switch to Cell::update, slice::{split_first_chunk,split_off}, where appropriate (#6974)
* Use Cell::update, slice::{split_first_chunk,split_off}

* Use more array -> slice methods
2026-02-03 13:45:03 +09:00
Jeong, YunWon
5e732c5e2a Fix wasip2 build (#6935) 2026-02-02 01:26:17 +09:00
Jeong, YunWon
60bec8a561 rework weakref (#6916)
* Replace WeakListInner with inline atomic weakref list and stripe locks

Remove heap-allocated WeakListInner (OncePtr<PyMutex<WeakListInner>>).
WeakRefList now holds two inline atomic pointers (head, generic).
PyWeak.parent replaced with wr_object pointing directly to referent.
Add weakref_lock module with AtomicU8 spinlock array for thread safety.
Rewrite upgrade/clear/drop_inner/count/get_weak_references with stripe lock.
Make Pointers methods public in linked_list.rs.
Remove expectedFailure from test_subclass_refs_dont_replace_standard_refs.
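
The stripe-lock idea mentioned above can be sketched as a fixed array of one-byte spinlocks indexed by object address. Everything here is an assumption for illustration: the stripe count, the address hash, and the names `LOCKS`, `stripe_index`, `lock_for`, and `StripeGuard` are not taken from the weakref_lock module.

```rust
use std::hint::spin_loop;
use std::sync::atomic::{AtomicU8, Ordering};

// Hypothetical stripe count; a power of two keeps the modulo cheap.
const STRIPES: usize = 64;

// One spinlock byte per stripe: 0 = free, 1 = held.
static LOCKS: [AtomicU8; STRIPES] = [const { AtomicU8::new(0) }; STRIPES];

/// Map an object address to a stripe, dropping the low alignment bits.
pub fn stripe_index(addr: usize) -> usize {
    (addr >> 4) % STRIPES
}

/// Releases its stripe when dropped.
pub struct StripeGuard {
    index: usize,
}

/// Spin until the stripe covering `addr` is acquired.
pub fn lock_for(addr: usize) -> StripeGuard {
    let index = stripe_index(addr);
    while LOCKS[index]
        .compare_exchange_weak(0, 1, Ordering::Acquire, Ordering::Relaxed)
        .is_err()
    {
        spin_loop();
    }
    StripeGuard { index }
}

impl Drop for StripeGuard {
    fn drop(&mut self) {
        LOCKS[self.index].store(0, Ordering::Release);
    }
}

/// Report whether the stripe covering `addr` is currently held.
pub fn is_locked(addr: usize) -> bool {
    LOCKS[stripe_index(addr)].load(Ordering::Relaxed) != 0
}
```

Unrelated objects can share a stripe, which costs some contention but lets the lock live outside the object, so no per-object heap allocation is needed.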

* Auto-format: cargo fmt --all

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-02-01 12:12:33 +09:00
Jeong, YunWon
ba8749b792 Align del behavior (#6772)
* slot_del

* refcount inc_by for atomicity

* temp patch multithreading

* apply review
2026-01-31 23:09:10 +09:00
Jeong, YunWon
fcca0feb70 Add missing windows APIs 2026-01-29 16:48:33 +09:00
ShaharNaveh
a9364cbc52 Fix warnings 2026-01-23 16:41:19 +02:00
Jeong YunWon
20376451eb Implement Py_mod_create slot support in multi-phase init 2026-01-22 11:21:42 +09:00
Noa
bf80d2715d Update to windows-2025 runner on ci (#5571)
* Update to windows-2025 on ci

* Unmark unexpected successes

* Try adding .dll
2026-01-07 12:53:44 +09:00
Jeong, YunWon
dac236dac0 more no_std clippy (#6587) 2025-12-30 15:15:02 +09:00
Terry Tianlin Luan
1464d5ca43 Adding + Fixing Clippy rules to better align with #[no_std] (#6570)
* Added alloc_instead_of_core, std_instead_of_alloc, and std_instead_of_core clippy rules
* Manually changed part of the code to use core/alloc

* Used clippy --fix to fix issues in stdlib

* Used clippy --fix to fix issues in vm
* Imported Range in vm/src/anystr.rs

* Used clippy --fix to fix issues in common
2025-12-30 13:10:14 +09:00
Jeong, YunWon
3600b6652d update _pyio, test_fileio from 3.13.11 and impl more io features (#6560)
* Update _pyio, test_fileio from 3.13.11

* impl more io

* unmark successful tests

* fix windows fileio
2025-12-28 18:06:47 +09:00
Jeong, YunWon
4bec0ad1c6 Fix test_runpy (#6409) 2025-12-12 15:26:00 +09:00