Skip to content

[mypyc] Add librt.strings.isidentifier codepoint primitive#21522

Open
VaggelisD wants to merge 1 commit into
python:masterfrom
VaggelisD:librt-strings-isidentifier
Open

[mypyc] Add librt.strings.isidentifier codepoint primitive#21522
VaggelisD wants to merge 1 commit into
python:masterfrom
VaggelisD:librt-strings-isidentifier

Conversation

@VaggelisD
Copy link
Copy Markdown
Contributor

5th PR of #21418

True if a codepoint can start a valid identifier (XID_Start, per
PEP 3131). The ASCII fast path covers `[A-Za-z_]` inline; non-ASCII
codepoints round-trip through PyUnicode_FromOrdinal +
PyUnicode_IsIdentifier so the answer matches str.isidentifier on a
1-character string.

The non-ASCII path is the first allocating helper in this series, so
its body lives out-of-line in codepoint_extra_ops.c (it would otherwise
be emitted as a separate copy in every translation unit that includes
the header). On OOM it swallows the exception via PyErr_Clear() and
returns False, which keeps the function ERR_NEVER. Documented at the
call site so callers don't get a surprising silent failure.

Stack: depends on the librt.strings.isspace primitive.
@VaggelisD VaggelisD force-pushed the librt-strings-isidentifier branch from add1d44 to 283b2f8 Compare May 21, 2026 07:42
@github-actions
Copy link
Copy Markdown
Contributor

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

Comment on lines +11 to +16
if (s == NULL) {
// OOM. Swallow and return false to keep the function ERR_NEVER;
// callers expect a defined answer, not a propagated exception.
PyErr_Clear();
return false;
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we usually call CPyError_OutOfMemory() in this case to abort, i think it would make sense here as well.

Comment thread mypyc/build.py
Comment on lines +57 to +62
ModDesc(
"librt.strings",
["strings/librt_strings.c", "codepoint_extra_ops.c"],
["codepoint_extra_ops.h"],
["strings"],
),
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry for not catching this earlier but it looks like the files are dependent on each other the wrong way. the _extra_ops files should be internal to mypyc and shouldn't be needed when building librt modules.

so instead of including codepoint_extra_ops in librt_strings we need it the other way around, with the common functionality in librt_strings. like it's done with the other _extra_ops files.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants