Five related CPython parity gaps in `str` formatting and construction:
1. **`str(bytes, errors=...)` triggers decode mode.** Previously, only
`encoding=` triggered decode; passing only `errors=` fell back to
`repr()`. CPython's behavior: presence of `encoding` OR `errors`
triggers decode mode (default UTF-8 when only `errors` is given).
2. **`'{...}'.format()` IndexError wording.** Generic Rust "tuple index
out of range" replaced with CPython's "Replacement index N out of
range for positional args tuple".
3. **`'{0:3.2s}'.format('abc')` → `'ab '`.** String format spec applied
precision after width padding; CPython truncates BEFORE padding.
Reorder the operations.
4. **`%x` / `%o` / `%X` / `%c` accept `__index__` objects.** Previously
only `PyInt` downcast was attempted. Mirror CPython's
PyNumber_Index dispatch via `try_index_opt`.
5. **`%d` / `%u` / `%i` error wording.** "a number is required" →
"a real number is required" (matches CPython).
Also adds `not <type>` suffix to `%c` error messages so the type is
visible in TypeError text (matches CPython structure even without
fully-qualified names).
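For illustration, the reference behavior behind items 1, 3 and 4 can be
sketched as plain Python run against CPython. This is not test code from
this change, and the `Idx` class is a hypothetical stand-in for any
object that implements `__index__` without being an int.

```python
class Idx:
    """Hypothetical non-int object that implements __index__."""

    def __index__(self):
        return 255


# Item 1: errors= alone switches str() into decode mode (UTF-8 by default).
decoded = str(b"caf\xc3\xa9", errors="strict")

# Item 3: precision truncates first, then width pads.
padded = "{0:3.2s}".format("abc")

# Item 4: %x / %o / %X dispatch through __index__.
hexed = "%x %o %X" % (Idx(), Idx(), Idx())

print(repr(decoded), repr(padded), repr(hexed))
```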
Verified byte-identical with CPython 3.14.4 across 25+ probes covering
the format/spec/constructor combinations. Unmasks
`test_str.test_constructor_keyword_args` and
`test_str.test_constructor_defaults`. test_str/test_bytes/test_format/
test_codecs/test_io/test_unicode_identifiers — 1,429 tests pass, 0
regressions. All 188 `extra_tests/snippets/*.py` pass under the CI
feature set.
`test_str.test_format` and `test_str.test_formatting` markers retained:
`test_format` still trips on `'{0:08s}'.format('result')` (numeric
zero-pad treated as fill+left-align by CPython for str type — separate
format-spec parser concern). `test_formatting` still trips on
`%c` error message expecting fully qualified `module.qualname` (RP
returns bare class name — separate broader concern).
* Round float at the decimal level to match CPython's _Py_dg_dtoa
CPython's `float.__round__` (Objects/floatobject.c) routes through
`_Py_dg_dtoa` and rounds at the decimal level. The previous
`round_float_digits` multiplied by 10**ndigits and rounded at the
IEEE 754 binary level, which diverges for values that aren't exactly
representable. For example, 2.675 stores as 2.67499...; dtoa correctly
rounds it down to 2.67, but `(2.675 * 100.0).round() / 100.0` lands on
2.68 because the multiplication produces a phantom 267.5 tie that
`round()` then snaps up to 268.
Rust's `{:.*}` float formatting uses dtoa-style algorithms (Grisu3 +
Dragon4 fallback) and matches CPython's `_Py_dg_dtoa` byte-for-byte.
Replace the multiply-then-round step with `format!` + `parse` for
ndigits >= 0. The ndigits < 0 path is unchanged because dividing
typical inputs by 10**|ndigits| produces genuine ties rather than
synthesizing them.
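The difference between the two strategies can be reproduced in Python.
This is a minimal sketch, assuming ndigits >= 0; the names
`round_binary` and `round_decimal` are made up for this example and do
not exist in the codebase.

```python
def round_binary(x, ndigits):
    """Multiply-then-round at the binary level (the approach being replaced)."""
    scale = 10.0 ** ndigits
    return round(x * scale) / scale


def round_decimal(x, ndigits):
    """Round through correctly rounded decimal formatting, then parse back."""
    return float(format(x, f".{ndigits}f"))


# 2.675 is stored slightly below 2.675, so the decimal route rounds down,
# while the multiplication lands exactly on the 267.5 tie.
print(round_binary(2.675, 2), round_decimal(2.675, 2), round(2.675, 2))
```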
Verified byte-identical with CPython 3.14.4 over a 108-case random
fuzz plus targeted half-tie probes. Unmasks
`test_float.RoundTestCase.test_matches_float_format` and
`test_previous_round_bugs`.
* Use #[expect] with reason for float_cmp suppression
Co-authored-by: ShaharNaveh <50263213+ShaharNaveh@users.noreply.github.com>
---------
Co-authored-by: ShaharNaveh <50263213+ShaharNaveh@users.noreply.github.com>
float_ops::divmod, mod_, and floordiv each carried their own conversion
from Rust's dividend-sign `%` to CPython's divisor-sign convention. Both
divmod and mod_ mishandled the zero-remainder case where the dividend
is a non-zero exact multiple of the divisor (e.g. divmod(6.0, -3.0),
6.0 % -3.0): the sign-correction branch fired on a zero remainder and
produced (-3.0, -3.0) and -3.0 respectively, violating the magnitude
invariant 0 <= abs(r) < abs(b). divmod also leaked the wrong signed-
zero quotient when the true quotient was zero (divmod(-1.0, -2.0)
returned (-0.0, -1.0) instead of (+0.0, -1.0)).
These are independent bugs in two functions, but both come from the
same root cause: zero-remainder needs a separate path from the sign-
correction branch.
Mirror CPython's `_float_div_mod` (Objects/floatobject.c) by making
divmod the canonical implementation and turning mod_ and floordiv into
thin wrappers. divmod(a, b) == (a // b, a % b) now holds by
construction.
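A Python transcription of the `_float_div_mod` logic shows why the
zero-remainder and zero-quotient cases need their own branches. It is a
sketch for finite, non-zero divisors only; `float_divmod` is an
illustrative name, not the Rust function.

```python
import math


def float_divmod(a, b):
    """Sketch of CPython's _float_div_mod for finite inputs and b != 0."""
    mod = math.fmod(a, b)
    div = (a - mod) / b
    if mod:
        # Non-zero remainder: give it the sign of the divisor.
        if (b < 0) != (mod < 0):
            mod += b
            div -= 1.0
    else:
        # Zero remainder: only the sign of the zero follows the divisor.
        mod = math.copysign(0.0, b)
    if div:
        # Snap the quotient to the nearest integral value.
        floordiv = float(math.floor(div))
        if div - floordiv > 0.5:
            floordiv += 1.0
    else:
        # Zero quotient: take the sign of the true quotient.
        floordiv = math.copysign(0.0, a / b)
    return floordiv, mod


def floordiv(a, b):
    return float_divmod(a, b)[0]


def mod_(a, b):
    return float_divmod(a, b)[1]
```

With the wrappers defined this way, `float_divmod(a, b) == (floordiv(a, b), mod_(a, b))` holds trivially.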
Closes #7722
CPython rejects digit-only format-string field names that overflow
Py_ssize_t at parse time with ValueError: Too many decimal digits in
format string (Python/string_parser.c::get_integer). RustPython's
FieldName::parse accepted any digit string usize::from_str could parse,
producing IndexError or KeyError at lookup instead.
Cap the parsed index at isize::MAX (Py_ssize_t::MAX on every platform)
inside FieldName::parse. Also reject digits-only strings whose value
overflows usize itself (caught when parse_usize returns None on an
all-digit input). A new FormatParseError::TooManyDecimalDigits maps to
the byte-identical CPython wording.
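The parse-time check can be sketched in Python as follows. The function
name `parse_field_index` is hypothetical and only mirrors the idea, not
the actual `FieldName::parse` signature.

```python
import sys


def parse_field_index(name):
    """Return the positional index for a digits-only field name, else None."""
    if not name or not all("0" <= ch <= "9" for ch in name):
        return None  # not digits-only: handled as a keyword-style name
    value = int(name)
    if value > sys.maxsize:  # sys.maxsize is Py_ssize_t's maximum
        raise ValueError("Too many decimal digits in format string")
    return value
```

The point of rejecting at parse time is that the error type no longer depends on what the later argument lookup happens to do.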
Unmasks test_str.StrTest.test_format_huge_item_number.
CPython rejects format-spec widths that exceed Py_ssize_t::MAX with
ValueError: Too many decimal digits in format string. RustPython's
FormatSpec::_parse only capped precision (via parse_precision); width
was accepted up to usize::MAX, so values like sys.maxsize + 1 silently
produced an effectively-ignored width.
Reject any width above i32::MAX with FormatSpecError::DecimalDigitsTooMany,
matching the existing precision cap and producing the byte-identical
ValueError wording.
Unmasks test_str.StrTest.test_format_huge_width.
* Enforce int_max_str_digits on int-to-str conversions
The str-to-int direction already enforced sys.get_int_max_str_digits()
via bytes_to_int; the int-to-str direction did not. CPython 3.14 enforces
both as part of its integer string conversion length limitation
(CVE-2020-10735).
Adds check_int_to_str_digits helper in builtins::int (bit-count fast path
+ digit upper-bound from log10(2)), wired into the four Python-level
entry points: repr, the str fast path in protocol::object, int.__format__
(decimal/n/empty spec only; power-of-two bases x/o/b are exempt per CPython),
and the DecimalD/I/U branches of vm::cformat for both str % and bytes %.
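The shape of such a check can be sketched in Python. This is an
assumption-laden illustration, not the helper itself:
`exceeds_digit_limit` is a made-up name, and the integer constants
30103/100000 and 30102/100000 are chosen here as safe upper and lower
approximations of log10(2).

```python
def exceeds_digit_limit(n, limit):
    """True if abs(n) has more than `limit` decimal digits (limit <= 0 disables)."""
    if limit <= 0:
        return False
    bits = abs(n).bit_length()
    # Fast path: abs(n) < 2**bits, so the digit count is at most this bound.
    upper = bits * 30103 // 100000 + 1
    if upper <= limit:
        return False
    # abs(n) >= 2**(bits - 1), so the digit count is at least this bound.
    lower = (bits - 1) * 30102 // 100000 + 1
    if lower > limit:
        return True
    # Only values near the boundary need the exact comparison.
    return abs(n) >= 10 ** limit
```

Most values are decided by the bit count alone, so the exact comparison runs only in the narrow band around the limit.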
Unmasks 8 expectedFailure tests across test_int (max_str_digits, DoS
prevention, int_from_other_bases — each mirrored in IntSubclass),
test_ast (test_repr_large_input_crash) and test_reprlib (test_numbers).
Boundary cases (4299/4300/4301 digits at limit=4300) match CPython 3.14.4.
* Skip int-to-str DoS test on platforms without time.get_clock_info
The test_denial_of_service_prevented_int_to_str regression test uses
support.Stopwatch, which calls time.get_clock_info('monotonic'). In
RustPython that function is gated to unix/windows targets only, so on
wasm32-wasip1 it surfaces as AttributeError and breaks the wasm-wasi CI.
Guard the test with skipUnless(hasattr(time, 'get_clock_info'), ...) so
it runs everywhere it can and is skipped on wasm.
Also narrow is_decimal_int_format to Number(Case::Lower): 'N' is rejected
by format_int as UnknownFormatCode, so excluding it preserves that error
path instead of intercepting it with the digit-limit check.
* Add TODO: RUSTPYTHON marker to skipUnless reason
scripts/update_lib uses TODO: RUSTPYTHON markers inside unittest
decorator reason strings to identify and migrate custom RustPython
patches across CPython library updates.
* Use expectedFailureIf for wasm get_clock_info gap
skipUnless silently hides the test forever; expectedFailureIf surfaces
unexpected success once RustPython implements time.get_clock_info on
wasm, prompting marker removal.
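The behavioral difference can be shown with a small stand-in. The
`expected_failure_if` helper below is a hypothetical re-creation written
for this example; the real expectedFailureIf helper may differ in
location and signature.

```python
import unittest


def expected_failure_if(condition, reason):
    """Hypothetical helper: expect failure only while `condition` holds."""
    if condition:
        return unittest.expectedFailure
    return lambda test: test


def run_demo():
    # Defined locally so test collectors do not pick the class up.
    class Demo(unittest.TestCase):
        @expected_failure_if(True, "TODO: RUSTPYTHON; platform gap")
        def test_feature(self):
            pass  # starts passing once the gap is closed

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(Demo)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

A test that starts passing is reported under `unexpectedSuccesses` and fails the run, whereas a skipped test would stay silent.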
* Fix process abort on large float format precision
Formatting a float with large precision (>= ~65535) aborted the
interpreter instead of raising a Python exception. CPython handles
the same input by returning a clean string.
# Before
    ./rustpython -c "print(f'{1.5:.1000000}')"
    thread 'main' panicked at crates/literal/src/float.rs:135:
    Formatting argument out of range (exit 101, abort)
# After
    ./rustpython -c "print(f'{1.5:.1000000}')"
    1.5
Root cause: Rust's `format!("{:.*}", n, x)` panics when `n`
exceeds the fmt runtime's internal precision limit. `format_fixed`
already caps `n` at u16::MAX, but `format_general` and
`format_exponent` (and the `%` branch in `crates/common/src/format.rs`)
passed user-supplied precision straight through to `format!`.
Fix:
* Introduce `FMT_MAX_PRECISION` + `clamp_fmt_precision()` in
crates/literal/src/float.rs. Cap is `u16::MAX - 1` because
`{:.*e}` hits a second panic (`ndigits > 0` in core flt2dec)
at exactly u16::MAX; the smaller value covers both paths.
* Apply the helper to `format_fixed` (replacing the existing
ad-hoc cap), `format_exponent` (entry), and `format_general`
(three separate format! calls with saturating arithmetic on
derived precision values).
* Apply the helper in the `FormatType::Percentage` branch in
crates/common/src/format.rs.
This is harmless for all normal inputs — f64 carries only ~17
significant digits, so precision beyond 65K is padding zeros at
best. Complex-number and old-style `%`-formatting paths transitively
benefit because they dispatch to the same library functions.
Verified:
* cargo run -- -m test test_float test_fstring test_format:
144 passed, 0 regressed.
* extra_tests/snippets/builtin_format.py: all assertions pass,
including 7 new regression cases covering e / E / g / G / f /
% at precision 1_000_000.
* Probed with 10 magnitude values (0, ±1.5, ±inf, nan, 1e-300,
1e300, f64::MAX, 5e-324) x 4 format types = 40 combinations,
plus precision 0/1/2 boundary, complex formatting, old-style
`%` formatting, and combined specs (fill/align/sign/grouping/
zero-pad). All return clean strings; no process abort.
* Address CodeRabbit review: split cap + drop redundant clamp
Two refinements after CodeRabbit review:
1. Drop the redundant `format!("{:.*}", precision + 1, base)` in
`format_general`'s scientific branch. It was a no-op pre-fix
(magnitude is `.abs()`-ed at the caller, so `base` has no sign
and its length was exactly `precision + 1`), but after I added
the cap it turned into an active truncate — dropping 1 char of
precision at the cap boundary. Reuse `base` directly and extract
`exp_precision` for reuse by `decimal_point_or_empty`.
2. Split the cap into two helpers.
- `FMT_MAX_PRECISION = u16::MAX` for plain `{:.*}` (format_fixed,
  the %-branch, format_general's non-scientific branch).
- `FMT_MAX_EXP_PRECISION = u16::MAX - 1` for `{:.*e}`
  (format_exponent, format_general's scientific entry).
The second value is one lower because `{:.*e}` trips an additional
`ndigits > 0` assertion in `core::num::flt2dec` at exactly
`u16::MAX`. The first commit used the tighter cap uniformly,
which silently regressed `format_fixed` by 1 char at
`precision == u16::MAX` (it previously capped at exactly that
value). Two helpers restore byte-identical CPython parity for
fixed / percent / general-non-scientific paths up through
`precision == u16::MAX`.
Verification:
* precision 5 .. 65534: 360 outputs byte-identical to CPython
across 8 magnitudes x 9 precisions x 5 types.
* precision == 65535: f / g / G / % now match CPython (0 diff).
e / E remain 1 char shorter — unavoidable
within the `u16::MAX - 1` exp cap.
* precision > 65535: output stops at cap; CPython emits full
padding — same design divergence as before.
* No panic regression: f-string default, e/E, g/G, %, f at
precision 1_000_000 all return cleanly.
* Test suite: test_float + test_fstring + test_format,
162 passed, 0 regressed.
* Fix ruff format: single-line precision clamp
* Address @youknowone review: byte-identical CPython parity at boundary
Per review comment on `extra_tests/snippets/builtin_format.py:209`:
the patch declares `FMT_MAX_PRECISION = u16::MAX`, so the tests must
cover 65535 and 65536 and demonstrate CPython parity at the boundary.
The previous version only avoided panic — at the cap it silently
truncated 1 char short of CPython for e / E, and thousands of chars
short for f / % at precision beyond the cap. This commit restores
byte-identical CPython output at every precision up to the format-
spec parser's own `i32::MAX` ceiling.
Fix: pad the Rust-format result with '0's up to the user-requested
precision.
Why this is correct, not a workaround: IEEE 754 double has at most
~767 significant decimal digits; past that, every digit is
deterministically '0' in both CPython and the native Rust output.
Our cap (65534 for exp, 65535 for plain) sits far above 767, so
appending zeros reconstructs precisely what CPython would have
produced. Verified on hard inputs: `1e-100`, `5e-324` (subnormal
boundary), `f64::MAX`, mixed magnitudes — the last 100 chars of
Rust-format output at precision 65534 are all '0' for every case.
Changes:
* `format_fixed`: after format!(), extend with (precision - capped)
'0' chars before appending the optional decimal point.
* `format_exponent`: same, applied to the parsed mantissa before
reassembling with the exponent marker.
* `FormatType::Percentage` branch: same. Also fixed a bug the
boundary audit surfaced: the finite-input overflow guard used
`return Ok("inf%")`, which bypasses the outer sign handler.
Changed to a match-arm value so `format_sign_and_align` still
runs and produces "-inf%" for `-f64::MAX`, matching CPython.
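The cap-then-pad idea for the fixed-point path can be sketched in
Python, using CPython's own formatter as the stand-in for a backend with
a precision ceiling. `CAP` and `format_fixed_capped` are illustrative
names, and the sketch covers the `f` type only.

```python
import math

CAP = 65535  # stand-in for the backend's precision ceiling


def format_fixed_capped(x, precision, cap=CAP):
    """Format with at most `cap` digits, then pad with zeros to `precision`."""
    if not math.isfinite(x):
        return format(x, "f")  # inf / nan carry no fractional digits
    capped = min(precision, cap)
    body = format(x, f".{capped}f")
    # Past the cap every digit of a double is already zero, so padding
    # reconstructs the uncapped output.
    return body + "0" * (precision - capped)
```

Comparing `format_fixed_capped(x, 70000)` with `format(x, ".70000f")` for values such as `5e-324` is a quick way to check the zero-padding argument.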
Verification:
* 7 magnitudes × 5 precisions × 6 format types = 210 comparisons
against CPython at precisions {65534, 65535, 65536, 100000,
200000}. All 210 byte-identical.
* Gap audit (complex formatting, old-style % formatting, negative
magnitudes, -0.0, combined specs with fill / sign / alternate /
grouping) at boundary precisions. All but 20 byte-identical.
The 20 remaining diffs all stem from a pre-existing
complex-imaginary-part repr bug (`1e100j` expands to 100 '0's
in RustPython vs CPython's `1e+100j`) which reproduces on
upstream main without any part of this patch and is out of
scope here.
* `cargo run -- -m test test_float test_fstring test_format`:
162 passed, 0 regressed.
* `extra_tests/snippets/builtin_format.py` now pins exact
expected strings at 65534 / 65535 / 65536 / 1_000_000 for
every format type, plus the `f64::MAX × 100 → 'inf%'`
overflow case.
* `cargo fmt --check`: pass.
* Clarify boundary test labels + add past-cap depth assertions
Rename the boundary-test section so the three precision points per
format type are labeled below / at / past the cap inline, making the
"past MAX_PRECISION" unhappy-case coverage explicit. Add len-based
assertions at precision 1_000_000 for f, e, and % to exercise the
cap-then-pad path at a depth far beyond the boundary.
* apply more allow_threads
* Simplify STW thread state transitions
- Fix park_detached_threads: successful CAS no longer sets
all_suspended=false, avoiding unnecessary polling rounds
- Replace park_timeout(50µs) with park() in wait_while_suspended
- Remove redundant self-suspension in attach_thread and detach_thread;
the STW controller handles DETACHED→SUSPENDED via park_detached_threads
- Add double-check under mutex before condvar wait to prevent lost wakes
- Remove dead stats_detach_wait_yields field and add_detach_wait_yields
* Representable for ThreadHandle
* Set ThreadHandle state to Running in parent thread after spawn
Like CPython's ThreadHandle_start, set RUNNING state in the parent
thread immediately after spawn() succeeds, rather than in the child.
This eliminates a race where join() could see Starting state if called
before the child thread executes.
Also reverts the macOS skip for test_start_new_thread_failed since the
root cause is fixed.
* Add debug_assert for thread state in start_the_world
* Unskip now-passing test_get_event_loop_thread and test_start_new_thread_at_finalization
* Wrap IO locks and file ops in allow_threads
Add lock_wrapped to ThreadMutex for detaching thread state
while waiting on contended locks. Use it for buffered and
text IO locks. Wrap FileIO read/write in allow_threads via
crt_fd to prevent STW hangs on blocking file operations.
* Use std::sync for thread start/ready events
Replace parking_lot Mutex/Condvar with std::sync (pthread-based)
for started_event and handle_ready_event. This prevents hangs
in forked children where parking_lot's global HASHTABLE may be
corrupted.
* Suspend Python threads before fork()
Add stop-the-world thread suspension around fork() to prevent
deadlocks from locks held by dead parent threads in the child.
- Thread states: DETACHED / ATTACHED / SUSPENDED with atomic CAS
transitions matching _PyThreadState_{Attach,Detach,Suspend}
- stop_the_world / start_the_world: park all non-requester threads
before fork, resume after (parent) or reset (child)
- allow_threads (Py_BEGIN/END_ALLOW_THREADS): detach around blocking
syscalls (os.read/write, waitpid, Lock.acquire, time.sleep) so
stop_the_world can force-park via CAS
- Acquire/release import lock around fork lifecycle
- zero_reinit_after_fork: generic lock reset for parking_lot types
- gc_clear_raw: detach dict instead of clearing entries
- Lock-free double-check for descriptor cache reads (no read-side
seqlock); write-side seqlock retained for writer serialization
- fork() returns PyResult, checks PythonFinalizationError, calls
sys.audit
* Implement locale-aware 'n' format specifier for int, float, complex
Add LocaleInfo struct and locale-aware formatting methods to FormatSpec.
The 'n' format type now reads thousands_sep, decimal_point, and grouping
from C localeconv() and applies proper locale-based number grouping.
Remove @unittest.skip from test_format.test_locale.
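How the localeconv() grouping list drives separator placement can be
sketched in Python. `group_digits` is a made-up helper for illustration
and is not the Rust method; it follows the C convention that 0 repeats
the previous group size and CHAR_MAX stops grouping.

```python
import locale


def group_digits(digits, thousands_sep, grouping):
    """Insert thousands_sep into a digit string per localeconv()-style grouping."""
    if not thousands_sep or not grouping:
        return digits
    groups = []  # collected right to left
    rest = digits
    size = 0
    for g in grouping:
        if g == locale.CHAR_MAX:
            break  # no further grouping
        if g == 0:
            # Repeat the previous group size for all remaining digits.
            while size and len(rest) > size:
                groups.append(rest[-size:])
                rest = rest[:-size]
            break
        size = g
        if len(rest) <= size:
            break
        groups.append(rest[-size:])
        rest = rest[:-size]
    groups.append(rest)
    return thousands_sep.join(reversed(groups))
```

In Python the inputs would come from `locale.localeconv()["thousands_sep"]` and `locale.localeconv()["grouping"]`.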
* Fix complex 'n' format and remove locale expectedFailure markers
Rewrite format_complex_locale to reuse format_complex_re_im, matching
formatter_unicode.c: add_parens=0 and skip_re=0 for 'n' type.
Remove @expectedFailure from test_float__format__locale and
test_int__format__locale in test_types.py.
* Auto-format: cargo fmt --all
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* Fix thread-safety in GC, type cache, and instruction cache
GC / refcount:
- Add safe_inc() check for strong()==0 in RefCount
- Add try_to_owned() to PyObject for atomic refcount acquire
- Replace strong_count()+to_owned() with try_to_owned() in GC
collection and weakref callback paths to prevent TOCTOU races
Type cache:
- Add proper SeqLock (sequence counter) to TypeCacheEntry
- Readers spin-wait on odd sequence, validate after read
- Writers bracket updates with begin_write/end_write
- Use try_to_owned + pointer revalidation on read path
- Call modified() BEFORE attribute modification in set_attr
Instruction cache:
- Add pointer_cache (AtomicUsize array) to CodeUnits for
single atomic pointer load/store (prevents torn reads)
- Add try_read_cached_descriptor with try_to_owned + pointer
and version revalidation after increment
- Add write_cached_descriptor with version-bracketed writes
RLock:
- Fix release() to check is_owned_by_current_thread
- Add _release_save/_acquire_restore methods
* Fix RLock _acquire_restore tuple handling and unxfail threading test
* Align type cache seqlock writer protocol with CPython
* RLock: use single parking_lot level, track recursion manually
Instead of calling lock()/unlock() N times for recursion depth N,
keep parking_lot at 1 level and manage the count ourselves.
This makes acquire/release O(1) and matches CPython's
_PyRecursiveMutex approach (lock once + set level directly).
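The lock-once-and-count approach looks roughly like this in Python. It
is a sketch built on `threading.Lock`; `CountingRLock` and its attribute
names are invented for the example and do not mirror the Rust types.

```python
import threading


class CountingRLock:
    """Recursive lock: one underlying lock level plus a manual count."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owner = None
        self._count = 0

    def acquire(self):
        me = threading.get_ident()
        if self._owner == me:
            self._count += 1  # re-entry: no extra lock() call
            return True
        self._lock.acquire()
        self._owner = me
        self._count = 1
        return True

    def release(self):
        if self._owner != threading.get_ident():
            raise RuntimeError("cannot release un-acquired lock")
        self._count -= 1
        if self._count == 0:
            self._owner = None
            self._lock.release()

    def _release_save(self):
        if self._owner != threading.get_ident():
            raise RuntimeError("cannot release un-acquired lock")
        state = (self._count, self._owner)
        self._count, self._owner = 0, None
        self._lock.release()  # a single unlock regardless of depth
        return state

    def _acquire_restore(self, state):
        self._lock.acquire()
        self._count, self._owner = state  # set the level directly
```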
* Add try_to_owned_from_ptr to avoid &PyObject on stale ptrs
Use addr_of! to access ref_count directly from a raw pointer
without forming &PyObject first. Applied in type cache and
instruction cache hit paths where the pointer may be stale.
* Fix CI: spelling typo and xfail flaky test_thread_safety
- Fix "minimising" -> "minimizing" for cspell
- xfail test_thread_safety: dict iteration races with
concurrent GC mutations in _finalizer_registry
* Replace GC tracking HashSet with intrusive linked list
Replace per-generation HashSet<GcObjectPtr> with intrusive doubly-linked
lists for GC object tracking. Each PyInner now carries gc_pointers
(prev/next) and gc_generation fields, enabling O(1) track/untrack
without hashing.
- Add gc_pointers (Pointers<PyObject>) and gc_generation (u8) to PyInner
- Implement GcLink trait for intrusive list integration
- Replace generation_objects/permanent_objects/tracked_objects/finalized_objects
HashSets with generation_lists/permanent_list LinkedLists
- Use GcBits::FINALIZED flag instead of finalized_objects HashSet
- Change default_dealloc to untrack directly before memory free
- Hold both src/dst list locks in promote_survivors to prevent race
conditions with concurrent untrack_object calls
- Add pop_front to LinkedList for freeze/unfreeze operations
Move unreachable_refs creation before drop(gen_locks) so that raw
pointer dereferences and refcount increments happen while generation
list read locks are held. Previously, after dropping read locks, other
threads could untrack and free objects, causing use-after-free when
creating strong references from the raw GcPtr pointers.
* Remove PyMutex<FrameState> from Frame, use UnsafeCell fields directly
Move stack, cells_frees, prev_line out of the mutex-protected FrameState
into Frame as FrameUnsafeCell fields. This eliminates mutex lock/unlock
overhead on every frame execution (with_exec).
Safety relies on the same single-threaded execution guarantee that
FastLocals already uses.
* Add thread-local DataStack for bump-allocating frame data
Introduce DataStack with linked chunks (16KB initial, doubling) and
push/pop bump allocation. Add datastack field to VirtualMachine.
Not yet wired to frame creation.
* Unify FastLocals and BoxVec stack into LocalsPlus
Replace separate FastLocals (Box<[Option<PyObjectRef>]>) and
BoxVec<Option<PyStackRef>> with a single LocalsPlus struct that
stores both in a contiguous Box<[usize]> array. The first
nlocalsplus slots are fastlocals and the rest is the evaluation
stack. Typed access is provided through transmute-based methods.
Remove BoxVec import from frame.rs.
* Use DataStack for LocalsPlus in non-generator function calls
Normal function calls now bump-allocate LocalsPlus data from the
per-thread DataStack instead of a separate heap allocation.
Generator/coroutine frames continue using heap allocation since
they outlive the call.
On frame exit, data is copied to the heap (materialize_to_heap)
to preserve locals for tracebacks, then the DataStack is popped.
VirtualMachine.datastack is wrapped in UnsafeCell for interior
mutability (safe because frame allocation is single-threaded LIFO).
* Fix clippy: import Layout from core::alloc instead of alloc::alloc
* Fix vectorcall compatibility with LocalsPlus API
Update vectorcall dispatch functions to use localsplus stack
accessors instead of direct stack field access. Add
stack_truncate method to LocalsPlus. Update vectorcall fast
path in function.rs to use datastack and fastlocals_mut().
* Add datastack, nlocalsplus, ncells, tstate to cspell dictionary
* Fix DataStack pop() for non-monotonic allocation addresses
Check both bounds of the current chunk when determining if a
pop base is in the current chunk. The previous check (base >=
chunk_start) fails on Windows where newer chunks may be
allocated at lower addresses than older ones.
* Fix stale comments: release_datastack -> materialize_localsplus
* Fix non-threading mode for parallel test execution
Two fixes for Cell-based types used in static items under non-threading
mode, which cause data races when Rust test runner uses parallel threads:
1. LazyLock: use std::sync::LazyLock when std is available instead of
wrapping core::cell::LazyCell with a false `unsafe impl Sync`.
The LazyCell wrapper is kept only for no-std (truly single-threaded).
2. gc_state: use static_cell! (thread-local in non-threading mode)
instead of OnceLock, so each thread gets its own GcState with
Cell-based PyRwLock/PyMutex that are not accessed concurrently.
* Fix CallAllocAndEnterInit to use LocalsPlus stack API
* Use checked arithmetic in LocalsPlus and DataStack allocators
* Address code review: checked arithmetic, threading feature deps, Send gate
- Use checked arithmetic for nlocalsplus in Frame::new
- Add "std" to threading feature dependencies in rustpython-common
- Gate GcState Send impl with #[cfg(feature = "threading")]
* Clean up comments: remove redundant/stale remarks, fix CPython references
* Reinit IO buffer locks after fork to prevent deadlocks
BufferedReader/Writer/TextIOWrapper use PyThreadMutex internally.
If a parent thread held one of these locks during fork(), the child
would deadlock on any IO operation.
Add reinit_after_fork() to RawThreadMutex and call it on sys.stdin/
stdout/stderr in the child process fork handler, analogous to
CPython's _PyIO_Reinit().
* Address review: unsafe fn + decoder lock reinit
- Mark reinit_std_streams_after_fork as unsafe fn to encode
fork-only precondition, update call site in posix.rs
- Reinit IncrementalNewlineDecoder's PyThreadMutex via
TextIOWrapper's decoder field to prevent child deadlocks
* Auto-format: cargo fmt --all
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* Use Mutex::raw() accessor in reinit_mutex_after_fork
Use lock_api's Mutex::raw() to access the underlying RawMutex
instead of casting &PyMutex<T> directly. This avoids layout
assumptions about lock_api::Mutex<R, T> field ordering.
* Replace force_unlock with reinit_*_after_fork
Replace all force_unlock() + try_lock() patterns with zero-based
reinit that bypasses parking_lot internals entirely. After fork(),
the child is single-threaded so reinited locks won't contend.
Add reinit_rwlock_after_fork to common::lock alongside the existing
reinit_mutex_after_fork. Replace force_unlock_after_fork methods in
codecs, intern, and gc_state with reinit_after_fork equivalents.
This fixes after_fork_child silently dropping thread handles when
try_lock() failed on per-handle Arc<Mutex> locks.
* Fix _at_fork_reinit to write INIT directly instead of calling unlock()
unlock() goes through unlock_slow() which accesses parking_lot's
global hash table to unpark waiters. After fork(), this hash table
contains stale entries from dead parent threads, making unlock_slow()
unsafe. Writing INIT directly bypasses parking_lot internals entirely.
* Add import lock (IMP_LOCK) reinit after fork
The import lock is a ReentrantMutex that was never reinit'd after
fork(). If a parent thread held it during fork, the child would
deadlock on any import. Only reset if the owner is a dead thread;
if the surviving thread held it, normal unlock still works.
* Relax RefCount atomic ordering from SeqCst to Arc pattern
- inc/inc_by/get: SeqCst → Relaxed
- safe_inc CAS: SeqCst → Relaxed + compare_exchange_weak
- dec: SeqCst → Release + Acquire fence when count drops to 0
- leak CAS: SeqCst → AcqRel/Relaxed + compare_exchange_weak
* Reuse existing Vec via prepend_arg in execute_call
Replace vec![self_val] + extend(args.args) with
FuncArgs::prepend_arg() to avoid a second heap allocation
on every method call.
* Skip downcast_ref checks in invoke when tracing is disabled
Early return in PyCallable::invoke() when use_tracing is false,
avoiding two downcast_ref type checks on every function call.
* Replace fastlocals PyMutex with UnsafeCell-based FastLocals
Eliminate per-instruction mutex lock/unlock overhead for local
variable access. FastLocals uses UnsafeCell with safety guaranteed
by the frame's state mutex and sequential same-thread execution.
Affects 14+ lock() call sites in hot instruction paths (LoadFast,
StoreFast, DeleteFast, and their paired variants).
* Auto-format: cargo fmt --all
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* Clear frame locals and stack on generator close
Add Frame::clear_locals_and_stack() to release references held by
closed generators/coroutines, matching _PyFrame_ClearLocals behavior.
Call it from Coro::close() after marking the coroutine as closed.
Update test_generators.py expectedFailure markers accordingly.
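The observable effect can be checked from Python with a weak reference.
This is a sketch of the reference behavior on CPython; `Payload`,
`holder` and `close_releases_locals` are names invented for the example.

```python
import gc
import weakref


class Payload:
    pass


def holder(refs):
    obj = Payload()  # local kept alive by the suspended frame
    refs.append(weakref.ref(obj))
    yield


def close_releases_locals():
    refs = []
    gen = holder(refs)
    next(gen)
    alive_before = refs[0]() is not None
    gen.close()  # the generator object itself is still referenced
    gc.collect()
    alive_after = refs[0]() is not None
    return alive_before, alive_after
```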
* Add dir_fd support for rmdir, remove/unlink, scandir
- rmdir: use unlinkat(fd, path, AT_REMOVEDIR) when dir_fd given
- remove/unlink: use unlinkat(fd, path, 0) when dir_fd given
- scandir: accept fd via fdopendir, add ScandirIteratorFd
- listdir: rewrite fd path to use raw readdir instead of nix::dir::Dir
- DirEntry: add d_type and dir_fd fields for fd-based scandir
- Update supports_fd/supports_dir_fd entries accordingly
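From the Python side, the dir_fd variants are used as below. This is a
usage sketch that bails out on platforms where `os.supports_dir_fd` does
not list the functions; `remove_via_dir_fd` is an illustrative name.

```python
import os
import tempfile


def remove_via_dir_fd():
    """Remove a file and a subdirectory relative to an open directory fd."""
    if os.unlink not in os.supports_dir_fd or os.rmdir not in os.supports_dir_fd:
        return None  # platform without unlinkat-style support
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "file.txt"), "w") as f:
            f.write("x")
        os.mkdir(os.path.join(root, "sub"))
        fd = os.open(root, os.O_RDONLY)
        try:
            os.unlink("file.txt", dir_fd=fd)  # unlinkat(fd, path, 0)
            os.rmdir("sub", dir_fd=fd)        # unlinkat(fd, path, AT_REMOVEDIR)
        finally:
            os.close(fd)
        return sorted(os.listdir(root))
```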
* cells_free
* Replace `once_cell` with `std::sync::OnceLock`/`core::cell::OnceCell`
- Replace `once_cell::sync::{Lazy, OnceCell}` with
`std::sync::{LazyLock, OnceLock}`
- Replace `once_cell::unsync::{Lazy, OnceCell}` with
`core::cell::{LazyCell, OnceCell}`
- Inline `get_or_try_init` at call sites (unstable in std as of 1.93)
- Replace `OnceCell::with_value()` with `OnceCell::from()` in codecs.rs
- Remove `once_cell` direct dependency from common and vm crates
* Auto-format: cargo fmt --all
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* `std` feature for common
- Gate OS-dependent modules behind `#[cfg(feature = "std")]`
- Replace `std::f64` with `core::f64` in float_ops
- Replace `std::process::abort` with panic in refcount
- Remove `thread_local` from levenshtein (stack buffer)
- Split static_cell into threading/non_threading/no_std
* `std` for codegen
* `no_std` for pylib
* Replace WeakListInner with inline atomic weakref list and stripe locks
Remove heap-allocated WeakListInner (OncePtr<PyMutex<WeakListInner>>).
WeakRefList now holds two inline atomic pointers (head, generic).
PyWeak.parent replaced with wr_object pointing directly to referent.
Add weakref_lock module with AtomicU8 spinlock array for thread safety.
Rewrite upgrade/clear/drop_inner/count/get_weak_references with stripe lock.
Make Pointers methods public in linked_list.rs.
Remove expectedFailure from test_subclass_refs_dont_replace_standard_refs.
* Auto-format: cargo fmt --all
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* * Added alloc_instead_of_core, std_instead_of_alloc, and std_instead_of_core clippy rules
* Manually changed part of the code to use core/alloc
* use clippy --fix to fix issues in stdlib
* * Used clippy --fix to fix issues in vm
* Imported Range in vm/src/anystr.rs
* * Used clippy --fix to fix issues in common