RustPython/extra_tests/snippets/stdlib_struct.py
Changjoon be43bb6dbf Match CPython error type for non-ASCII struct format arguments (#7681)
* Match CPython error type for non-ASCII struct format arguments

Struct() raised the wrong exception type when the format argument
contained non-ASCII characters:
- str input with non-ASCII char: RustPython raised UnicodeDecodeError
  with an empty message; CPython raises UnicodeEncodeError as if
  format.encode('ascii') had been called directly.
- bytes input with non-ASCII byte: same wrong UnicodeDecodeError;
  CPython passes the bytes through to the format parser, which then
  errors with struct.error("bad char in struct format").

Restructure IntoStructFormatBytes::try_from_object to:
- raise UnicodeEncodeError("ascii", s, start, start+1, "ordinal not
  in range(128)") for non-ASCII str, with start computed as the
  first non-ASCII code point position (matching CPython's natural
  encoding-error format);
- raise struct.error("bad char in struct format") for non-ASCII bytes,
  produced via the existing new_struct_error helper.
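
The two branches above can be modelled in Python as a rough sketch. The function name `format_to_bytes` is hypothetical and is not part of RustPython or CPython; the real logic lives in the Rust method IntoStructFormatBytes::try_from_object. The sketch only mirrors the error shapes described in this commit message.

```python
import struct


def format_to_bytes(fmt):
    """Hypothetical model of the format-argument conversion described above."""
    if isinstance(fmt, str):
        # Find the first non-ASCII code point and report it the way
        # fmt.encode("ascii") would: encoding, object, start, end, reason.
        for start, ch in enumerate(fmt):
            if ord(ch) > 127:
                raise UnicodeEncodeError(
                    "ascii", fmt, start, start + 1, "ordinal not in range(128)"
                )
        return fmt.encode("ascii")
    if isinstance(fmt, bytes):
        # Non-ASCII bytes are reported as a format error, not a codec error.
        if any(b > 127 for b in fmt):
            raise struct.error("bad char in struct format")
        return fmt
    # Assumption: other argument types are rejected; the exact message
    # used by the real implementation is not given in the commit message.
    raise TypeError("format must be str or bytes")
```

For example, `format_to_bytes("<IH")` returns `b"<IH"`, while `format_to_bytes("I\u00e9")` raises UnicodeEncodeError with `start == 1` and `end == 2`.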

Verified that both cases behave byte-identically to CPython 3.14.4. The
full test.test_struct suite (43 tests) passes with no regressions. Also
sanity-tested that all standard format/pack/unpack/calcsize call shapes
remain unchanged.

* Add regression test for non-ASCII format string error types

* Use raise AssertionError instead of assert False (B011)
2026-04-26 12:31:54 +09:00


import struct
from testutils import assert_raises
data = struct.pack("IH", 14, 12)
assert data == bytes([14, 0, 0, 0, 12, 0])
v1, v2 = struct.unpack("IH", data)
assert v1 == 14
assert v2 == 12
data = struct.pack("<IH", 14, 12)
assert data == bytes([14, 0, 0, 0, 12, 0])
v1, v2 = struct.unpack("<IH", data)
assert v1 == 14
assert v2 == 12
data = struct.pack(">IH", 14, 12)
assert data == bytes([0, 0, 0, 14, 0, 12])
v1, v2 = struct.unpack(">IH", data)
assert v1 == 14
assert v2 == 12
data = struct.pack("3B", 65, 66, 67)
assert data == bytes([65, 66, 67])
v1, v2, v3 = struct.unpack("3B", data)
assert v1 == 65
assert v2 == 66
assert v3 == 67
with assert_raises(Exception):
    data = struct.pack("B0B", 65, 66)
with assert_raises(Exception):
    data = struct.pack("B2B", 65, 66)
data = struct.pack("B1B", 65, 66)
with assert_raises(Exception):
    struct.pack("<IH", "14", 12)
assert struct.calcsize("B") == 1
# assert struct.calcsize("<L4B") == 12
assert struct.Struct("3B").pack(65, 66, 67) == bytes([65, 66, 67])
class Indexable(object):
    def __init__(self, value):
        self._value = value

    def __index__(self):
        return self._value
data = struct.pack("B", Indexable(65))
assert data == bytes([65])
data = struct.pack("5s", b"test1")
assert data == b"test1"
data = struct.pack("3s", b"test2")
assert data == b"tes"
data = struct.pack("7s", b"test3")
assert data == b"test3\0\0"
data = struct.pack("?", True)
assert data == b"\1"
data = struct.pack("?", [])
assert data == b"\0"
assert struct.error.__module__ == "struct"
assert struct.error.__name__ == "error"
# Non-ASCII format string: error type matches CPython.
# str → UnicodeEncodeError (encoding='ascii')
# bytes → struct.error
try:
    struct.Struct("\udc00")
except UnicodeEncodeError as e:
    assert e.encoding == "ascii"
else:
    raise AssertionError("expected UnicodeEncodeError")
with assert_raises(UnicodeEncodeError):
    struct.Struct("\u00e9")
with assert_raises(struct.error):
    struct.Struct(b"\xff")
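
The regression checks above only assert the exception type and, in one case, the encoding. A slightly stricter probe could also compare the position, reason, and message fields that the commit message describes. The helper name `describe_format_error` is hypothetical, and the expected values in the usage note are the ones stated in the commit message, so treat this as a sketch and not as part of the test suite.

```python
import struct


def describe_format_error(fmt):
    """Return (exception name, details) for a rejected format, else None."""
    try:
        struct.Struct(fmt)
    except UnicodeEncodeError as e:
        # Codec-style error: report where the first bad code point was found.
        return ("UnicodeEncodeError", (e.encoding, e.start, e.end, e.reason))
    except struct.error as e:
        # Format-parser error: only the message is meaningful.
        return ("struct.error", str(e))
    return None
```

Under the behavior described in the commit message, `describe_format_error("\udc00")` would report encoding `"ascii"`, start 0, end 1, and the reason `"ordinal not in range(128)"`, while `describe_format_error(b"\xff")` would report the message `"bad char in struct format"`.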