RustPython

mirror of https://github.com/RustPython/RustPython.git synced 2026-06-02 19:39:49 +09:00

Author	SHA1	Message	Date
Chanho Lee	1cb24c5ebb	Reject non-ASCII digits in JSON numbers (#7982)	2026-05-27 16:40:52 +09:00
Changjoon	fb1218d6ba	Accept surrogates in _json.JsonScanner decode path (#7675) The _json decoder had two failure modes when a Python str value would contain a lone surrogate (legal per the Python 3 str model): 1. Boundary UnicodeEncodeError: JsonScanner::Callable::call rejected any input str with surrogates via try_into_utf8 before scanning began. 2. Silent U+FFFD corruption: call_scan_once and parse_object's key path called .to_string() on scanstring's Wtf8Buf output, which routes through Wtf8::Display (lossy). Array values and dict keys decoded from JSON \uXXXX escapes silently became U+FFFD. Switch JsonScanner's five PyUtf8StrRef signatures to PyStrRef, drop the entry-point try_into_utf8 call, and feed Wtf8Buf directly to new_str instead of going through .to_string(). Key memoization now uses HashMap<Wtf8Buf, PyStrRef> so surrogate-bearing keys survive interning. parse_number takes &[u8] since JSON numbers are ASCII. Extends the WTF-8 refactor pattern established in #7673 to the decoder. machinery::scanstring already returns Wtf8Buf and is unchanged. Unmasks test_single_surrogate_decode. 214 tests in test.test_json pass with no regressions. Decoder output verified byte-identical to CPython 3.13.4 over 10,000 random fuzz cases (JSON docs containing random surrogate escapes at root/list/dict positions, compared via json.dumps(..., ensure_ascii=True, sort_keys=True)).	2026-04-25 05:16:12 +09:00
Changjoon	2e5c2be7fa	Accept surrogates in _json.encode_basestring{,_ascii} (#7673) encode_basestring/encode_basestring_ascii took PyUtf8StrRef, so json.dumps(str_with_lone_surrogate) raised UnicodeEncodeError at the Python/Rust boundary before write_json_string ran. CPython's encoder emits \uXXXX under ensure_ascii=True and passes raw WTF-8 otherwise. Switch to PyStrRef + s.as_wtf8(), matching scanstring in the same file. Rewrite write_json_string to accept &Wtf8 and iterate code_point_indices, emitting \uXXXX for surrogates in ascii mode and passing their bytes through otherwise. Stop escaping 0x7F in the ensure_ascii=False path (matches py_encode_basestring). Return Wtf8Buf via the checked from_bytes so invariant breaks panic instead of UB. Fuzzing also exposed two pre-existing ESCAPE_CHARS typos: 0x0B was "\u000" and 0x1B was "\u001" (both missing trailing 'b'). Fixed here. Verified byte-identical with CPython 3.13.4 over 16 manual + 10,000 random fuzz cases. Full test.test_json: 214 tests, 0 failures, 0 unexpected successes. Unmasks test_ascii_non_printable_encode and test_single_surrogate_encode. Decoder path is a follow-up.	2026-04-25 00:08:14 +09:00
Changjoon	175f12b664	Fix stack overflow on deeply-nested JSON in json.loads() (#7632) * Fix stack overflow on deeply-nested JSON in json.loads() json.loads() on a deeply-nested array or object payload (e.g. '[' * 50000 + ']' * 50000) overflowed the native Rust stack and crashed the interpreter process with SIGSEGV. CPython raises RecursionError on the same input via _Py_EnterRecursiveCall in Modules/_json.c. The recursion lives in the mutual call chain: JsonScanner::parse_object / parse_array -> JsonScanner::call_scan_once -> JsonScanner::parse_object / parse_array Every descent funnels through call_scan_once, so wrapping its body with vm.with_recursion covers both '{' and '[' paths (and their mixed nesting) with a single guard. Before: ./rustpython -c "import json; json.loads('[' * 50000 + ']' * 50000)" -> SIGSEGV (exit 139) After: -> RecursionError: maximum recursion depth exceeded while decoding a JSON object from a string Verified: - extra_tests/snippets/stdlib_json.py: all assertions pass (includes 3 new regression cases: array, object, alternating nesting at depth 100000) - cargo run -- -m test test_json: 214 passed, 0 regressed (9 skipped, 13 expected failures, all pre-existing) - depth 500000 no longer crashes (RecursionError) - shallow parsing unchanged * Enable test_highly_nested_objects_decoding Per @ShaharNaveh's review on #7632: this test was previously marked `@unittest.skip("TODO: RUSTPYTHON; crashes")` because json.loads would SIGSEGV on the 500_000-deep input. The recursion-guard added in this PR makes it raise RecursionError like CPython, so the skip decorator can be removed. $ cargo run -- -m unittest \ test.test_json.test_recursion.TestCRecursion.test_highly_nested_objects_decoding \ test.test_json.test_recursion.TestPyRecursion.test_highly_nested_objects_decoding ... Ran 2 tests in 0.825s OK $ cargo run -- -m test test_json Ran 214 tests (7 skipped, 13 expected failures), all pass.	2026-04-20 21:52:17 +09:00
ShaharNaveh	f73df6a102	Update `test_json` from 3.14.3	2026-02-10 21:00:40 +09:00
Jeong, YunWon	100b870175	Implement UTF-32 encode/decode and fix UTF-16 empty encode - Add UTF-32, UTF-32-LE, UTF-32-BE encode/decode in _pycodecs.py - Register utf_32 codec functions in codecs.rs via delegate_pycodecs - Fix PyUnicode_EncodeUTF16 returning "" instead of [] for empty input - Remove resolved expectedFailure decorators in test_codecs.py - Add failure reasons to remaining expectedFailure comments	2026-02-02 12:50:34 +09:00
Jeong, YunWon	8f7b1343bc	mark and unmark successful/failing tests	2026-01-18 20:00:15 +09:00
Lee Dogeon	5242ff5243	Bump json to 3.14.2 (#6774)	2026-01-18 19:16:48 +09:00
Lee Dogeon	ef871d227e	Update json module to 3.13.11 (#6743)	2026-01-16 21:38:15 +09:00
Lee Dogeon	3a702ac772	Improve `json.loads` performance (#6704) * Parse JSON in Rust * Reuse key when decoding JSON * Unmark resolved test * Parse null/true/false directly in call_scan_once Parse JSON constants (null, true, false) directly in Rust within call_scan_once() instead of falling back to Python scan_once. This reduces Python-Rust boundary crossings for array/object values. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * Parse numbers directly in call_scan_once Parse JSON numbers starting with digits (0-9) directly in Rust within call_scan_once() by reusing the existing parse_number() method. This reduces Python-Rust boundary crossings for array/object values. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * Parse NaN/Infinity/-Infinity in call_scan_once Parse special JSON constants (NaN, Infinity, -Infinity) and negative numbers directly in Rust within call_scan_once(). This handles: - 'N' -> NaN via parse_constant callback - 'I' -> Infinity via parse_constant callback - '-' -> -Infinity or negative numbers via parse_constant/parse_number This reduces Python-Rust boundary crossings for array/object values. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * Correct wrong index access * Leave more flame span * Refactor json scanstring with byte index --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-15 19:53:46 +09:00
Shahar Naveh	ceb7046bc4	Fix int respect `sys.set_int_max_str_digits` value (#6094)	2025-08-21 13:14:10 +09:00
ShaharNaveh	a5b240aab8	skip crashing test	2025-07-25 16:13:07 +02:00
Shahar Naveh	c497061290	Update `json` from 3.13.5 (#6007) * Update `json` from 3.13.5 * Update `test_json` from 3.13.5	2025-07-20 18:44:46 +09:00
Noa	0a07cd931f	Fix more surrogate crashes	2025-03-26 23:12:21 -05:00
Noa	a86126419c	Fix remaining tests	2025-03-25 19:05:12 -05:00
Jeong Yunwon	2f4000b239	mark failing tests from test_json	2022-07-19 01:33:15 +09:00
CPython developers	413e8250f0	Update {test_}json from CPython 3.10.5	2022-07-19 01:33:15 +09:00
Dean Li	6f98288e84	test: use import_helper	2021-11-29 21:03:02 +08:00
Dean Li	5ee4fb899b	test: use os_helper	2021-11-28 20:51:32 +08:00
Jeong YunWon	913b78ca44	Revert "Merge pull request #3433 from deantvv/test-update" This reverts commit `9fa5c5ac66`, reversing changes made to `e7fa32c687`.	2021-11-17 17:06:51 +09:00
Dean Li	49a5805d11	test: use os_helper	2021-11-13 02:18:33 +00:00
Padraic Fanning	e5acfc3a67	Clean up skip in test_json.test_speedups	2021-10-18 22:09:36 -04:00
Padraic Fanning	05f3ef557b	Clean up skip in test_json.test_decode	2021-10-18 22:09:17 -04:00
Jeong YunWon	22322fafe7	Merge pull request #2506 from fanninpm/more-expected-failures Unskip more tests (follow-up from #2443)	2021-02-28 18:34:43 +09:00
Padraic Fanning	f1152a345c	Unskip test(s) in test_json.test_unicode	2021-02-25 21:41:02 -05:00
Padraic Fanning	63c3a306c4	Unskip tests in test_json.test_scanstring	2021-02-25 21:39:08 -05:00
Padraic Fanning	4a485c2c70	Unskip test(s) in test_json.test_fail	2021-02-25 21:35:13 -05:00
Noah	491c4e775b	Fix json.scanstring unicode handling	2021-02-20 21:04:30 -06:00
Padraic Fanning	6a21d3ce3d	Explain test_bytes_decode skip	2021-02-07 15:49:42 -05:00
Padraic Fanning	a5bc2bb909	Explain test_overflow skip	2021-02-07 15:49:42 -05:00
Padraic Fanning	961472e6fe	Explain test_bad_escapes skip	2021-02-07 15:49:42 -05:00
Padraic Fanning	c283f64a96	Explain test_surrogates skip	2021-02-07 15:49:42 -05:00
Padraic Fanning	d696eac3a7	Unskip test_truncated_input	2021-02-07 15:49:42 -05:00
Padraic Fanning	c22ecc7347	Explain test_failures skip	2021-02-07 15:49:42 -05:00
Noah	1f4f407d5d	Implement json.decoder.scanstring in Rust	2020-10-04 13:04:43 -05:00
Noah	0876c19c04	Unskip test_json.test_tool	2020-08-03 13:20:06 -05:00
Noah	b1aa11bf9e	Uncomment some things that were dependent on proper subprocess	2020-06-21 16:47:41 -05:00
Noah	0fb79e1086	Implement _json.encode_basestring{,_ascii}	2020-06-06 15:33:29 -05:00
Noah	84b71c9563	Enable doctest in test_json	2020-05-05 12:23:58 -05:00
Noah	316ee37b38	Mark unsupported tests for _json	2020-04-28 13:45:53 -05:00
Noah	d92cebd953	Unskip tests that depend on \N	2020-04-14 13:06:32 -05:00
Noah	95d12d02ae	Mark failing tests for test_json	2020-04-08 12:26:28 -05:00
Noah	9d136d6450	Add test.test_json from CPython 3.8.2	2020-04-08 12:16:19 -05:00
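
The surrogate-handling commits in the table above (#7673 for the encoder, #7675 for the decoder) describe behavior that can be checked from plain Python. The following is a minimal sketch, assuming an interpreter whose `json` module matches CPython on these points; the helper name `roundtrip` is hypothetical and does not come from the test suite.

```python
import json

LONE_SURROGATE = "\ud800"  # legal in a Python str, but not encodable as UTF-8


def roundtrip(value):
    """Encode with ensure_ascii=True, then decode again (hypothetical helper)."""
    return json.loads(json.dumps(value, ensure_ascii=True))


# Encoder: a lone surrogate is emitted as a \uXXXX escape instead of raising.
encoded = json.dumps(LONE_SURROGATE)

# Decoder: the escapes come back as the same lone surrogates, not U+FFFD,
# both as a dict key and as an array value.
decoded = json.loads('{"\\udc00": ["\\ud800"]}')

# Control characters 0x0B and 0x1B use the full six-character escape.
controls = json.dumps("\x0b\x1b")

# With ensure_ascii=False, 0x7F passes through unescaped.
delete_char = json.dumps("\x7f", ensure_ascii=False)

# A surrogate nested inside containers should survive a full round trip.
nested = roundtrip({"k": [LONE_SURROGATE]})
```

Comparing these values against CPython's output is the same kind of check the commit messages report doing at larger scale with fuzzed inputs.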

43 Commits
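
The recursion fix in #7632 can be illustrated with a small check. This is a sketch only, assuming an interpreter that guards the decoder's recursion as that commit describes; on an interpreter without the guard, the same input may crash the process instead. The helper `decode_outcome` is a hypothetical name, and the depth mirrors the 500_000 figure quoted in the commit message.

```python
import json


def decode_outcome(document):
    """Return 'ok' or 'RecursionError' for a JSON document (hypothetical helper)."""
    try:
        json.loads(document)
    except RecursionError:
        return "RecursionError"
    return "ok"


DEPTH = 500_000  # depth quoted in the commit message for the re-enabled test

deep_array = "[" * DEPTH + "]" * DEPTH
deep_object = '{"k":' * DEPTH + "1" + "}" * DEPTH
shallow = "[" * 10 + "]" * 10

results = {
    "deep_array": decode_outcome(deep_array),
    "deep_object": decode_outcome(deep_object),
    "shallow": decode_outcome(shallow),
}
```

Both the array and the object payload are included because, per the commit message, every descent passes through the same guarded function, so one guard is expected to cover both shapes while shallow input keeps parsing normally.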