@SwayamInSync
Member

@SwayamInSync SwayamInSync commented Dec 19, 2025

As per the title

closes #224

@SwayamInSync SwayamInSync added this to the v1.0 milestone Dec 19, 2025
@SwayamInSync SwayamInSync marked this pull request as draft December 19, 2025 15:15
@SwayamInSync SwayamInSync marked this pull request as ready for review December 19, 2025 15:20
@SwayamInSync
Member Author

Tests make up most of the diff.

}

quad_value out_val;
if (bytes_to_quad_convert(s.buf, s.size, backend, &out_val) < 0) {
Member Author

Using bytes_to_quad_convert instead of unicode_to_quad_convert because the latter expects Py_UCS format; otherwise both do the same thing.

Member

@ngoldbaum ngoldbaum left a comment

I spotted some minor issues and one bigger opportunity to avoid unnecessarily creating python strings.

https://github.com/numpy/numpy-user-dtypes/pull/244/changes#r2636204832 is a bigger comment; don't consider it a blocker if you disagree or don't want to spend time on it.

npy_intp *view_offset)
{
Py_INCREF(given_descrs[0]);
loop_descrs[0] = given_descrs[0];
Member

If you do this after checking given_descrs[1] then there's no need to decref in the error paths below so it'll be a little clearer
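
A minimal sketch of the reordering suggested above, written against hypothetical stand-ins rather than the real NumPy types: `Descr`, `descr_incref`, `default_string_descr`, and `resolve_descriptors_sketch` are all invented for illustration. The point is only that when references are taken after the last check that can fail, the error path has nothing to release.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted descriptor; not NumPy's type. */
typedef struct {
    int refcount;
    int is_string; /* 1 if the descriptor is acceptable as the cast output */
} Descr;

/* Hypothetical default used when the caller passes no output descriptor. */
static Descr default_string_descr = {1, 1};

static void descr_incref(Descr *d) { d->refcount++; }

/* Returns 0 on success and -1 on error. */
int resolve_descriptors_sketch(Descr *given[2], Descr *loop[2])
{
    assert(given[0] != NULL);

    /* Settle the output descriptor first. Nothing is owned yet,
     * so failing here needs no cleanup. */
    Descr *out = given[1] != NULL ? given[1] : &default_string_descr;
    if (!out->is_string) {
        return -1;
    }

    /* All checks passed: only now take the references. */
    descr_incref(out);
    loop[1] = out;
    descr_incref(given[0]);
    loop[0] = given[0];
    return 0;
}
```

With this ordering a rejected output descriptor leaves the input's reference count untouched, which is what makes the decref in the error paths unnecessary.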

loop_descrs[1] = given_descrs[1];
}

// no notion of fix length, so always unsafe
Member

I'd just delete this comment, I don't think it's correct. It's unsafe because arbitrary strings aren't generally convertible losslessly to quads.
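
To illustrate the point about lossless conversion, here is a small sketch that uses `double` and `strtod` as a stand-in for the quad parser. That substitution is an assumption made so the example is self-contained, and `roundtrips` is a hypothetical helper, not something in this PR. It reports whether a string survives parsing and re-formatting unchanged.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if `text` parses completely as a number and formats back to
 * exactly the same text, and 0 otherwise. Uses double as a stand-in for quad. */
int roundtrips(const char *text)
{
    assert(text != NULL);

    char *end = NULL;
    double value = strtod(text, &end);
    if (end == text || *end != '\0') {
        return 0; /* empty, non-numeric, or trailing garbage */
    }

    char buf[64];
    snprintf(buf, sizeof buf, "%.17g", value);
    return strcmp(buf, text) == 0;
}
```

Strings such as "hello" fail to parse at all, and strings such as "0.10" or "1e3" parse but do not come back in the same form, so a string-to-number cast cannot be labelled safe in general.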


// Get string representation with adaptive notation
// Use a large buffer size to allow for full precision
PyObject *py_str = quad_to_string_adaptive(&sleef_val, QUAD_STR_WIDTH);
Member

There's no need to create a PyUnicode object here. Just pass the ASCII bytes of the C parsed string to NpyString_Pack.

Member Author

This function, quad_to_string_adaptive, uses Dragon4 utilities such as Dragon4_Positional_QuadDType and Dragon4_Scientific_QuadDType for the conversion. Both return a PyUnicode object created with PyUnicode_FromString, and we then extract the C string from it.

I can modify the Dragon4 helper, or add another helper with a _cstr suffix that returns a C string, and then this would be doable. Does that sound good?

Member

Yeah, there should be a code path that bypasses creating a Python string.
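
A rough sketch of what such a code path could look like, under stated assumptions: `PackedSlot` and `pack_bytes` are hypothetical stand-ins for the packed string and the packing call (they are not the NumPy API), and `snprintf` on a `double` stands in for the Dragon4 formatting of a quad. The formatter writes into a C buffer and the bytes plus their length go straight to the packer, with no Python object in between.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a packed string slot; not NumPy's type. */
typedef struct {
    char data[64];
    size_t size;
} PackedSlot;

/* Hypothetical stand-in for the packing call: copies `size` bytes.
 * No NUL terminator is required because the length is explicit. */
int pack_bytes(PackedSlot *slot, const char *buf, size_t size)
{
    assert(slot != NULL && buf != NULL);
    if (size > sizeof slot->data) {
        return -1;
    }
    memcpy(slot->data, buf, size);
    slot->size = size;
    return 0;
}

/* Format directly into a C buffer, then pack the ASCII bytes. */
int value_to_slot(double value, PackedSlot *slot)
{
    char buf[64];
    int n = snprintf(buf, sizeof buf, "%.17g", value);
    if (n < 0 || (size_t)n >= sizeof buf) {
        return -1; /* formatting failed or the output was truncated */
    }
    return pack_bytes(slot, buf, (size_t)n);
}
```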

@SwayamInSync
Member Author

  • dragon4.c/.h only gain some duplicate helpers that return a C string instead of a PyUnicode object
  • casts.cpp gets another duplicate helper, quad_to_string_adaptive_cstr, which calls the corresponding dragon4 utilities to return a C string
  • tests are consolidated into the other tests

@SwayamInSync
Member Author

Cool, these changes should address all the comments

@juntyr
Contributor

juntyr commented Dec 22, 2025

In quad_to_string_adaptive_cstr, we could remove the else here to make the early return in the if branch clearer

@juntyr
Contributor

juntyr commented Dec 22, 2025

What does the return of _bigint_static.repr do?

@SwayamInSync
Member Author

What does the return of _bigint_static.repr do?

_bigint_static is a thread-local static struct of type Dragon4_Scratch, and its repr field is a 16KB character buffer used to store the string representation of a floating-point number during conversion. When we call the dragon4 functions for the scientific or positional representation, the internal helpers do all the heavy lifting and populate that repr field, so we just return it.

I know it might look odd to work with a global variable (though it is thread-local); passing the buffer by reference might be a cleaner approach.
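
The pattern described above can be sketched in isolation as follows. This is not the dragon4 code: `Scratch`, `scratch`, `format_scientific_cstr`, and `format_copy` are hypothetical names, `snprintf` on a `double` stands in for the real conversion, and C11 `_Thread_local` is assumed to be available.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SCRATCH_REPR_SIZE 16384 /* 16KB, the size mentioned above */

/* Hypothetical stand-in for the scratch struct. */
typedef struct {
    char repr[SCRATCH_REPR_SIZE];
} Scratch;

/* One buffer per thread, so concurrent callers do not overwrite each other. */
static _Thread_local Scratch scratch;

/* Fills the calling thread's buffer and returns a pointer to it.
 * The pointer is only valid until the next call on the same thread. */
const char *format_scientific_cstr(double value, int precision)
{
    assert(precision >= 0);
    snprintf(scratch.repr, sizeof scratch.repr, "%.*e", precision, value);
    return scratch.repr;
}

/* Copies the result out so it survives later calls. Returns 0 on success. */
int format_copy(double value, int precision, char *out, size_t out_size)
{
    const char *repr = format_scientific_cstr(value, precision);
    size_t len = strlen(repr);
    if (len + 1 > out_size) {
        return -1;
    }
    memcpy(out, repr, len + 1);
    return 0;
}
```

Because every call reuses the same per-thread storage, a caller that needs the text after another conversion has to copy it first, which is the trade-off against passing a buffer in by reference.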

@SwayamInSync
Member Author

In quad_to_string_adaptive_cstr, we could remove the else here to make the early return in the if branch clearer

right, will do

@juntyr
Contributor

juntyr commented Dec 22, 2025

What does the return of _bigint_static.repr do?

_bigint_static is a thread-local static struct of type Dragon4_Scratch, and its repr field is a 16KB character buffer used to store the string representation of a floating-point number during conversion. When we call the dragon4 functions for the scientific or positional representation, the internal helpers do all the heavy lifting and populate that repr field, so we just return it.

I know it might look odd to work with a global variable (though it is thread-local); passing the buffer by reference might be a cleaner approach.

Thanks for the explanation, could you add a short comment in the code for this?

Contributor

@juntyr juntyr left a comment

LGTM from my end

@SwayamInSync
Member Author

Cool merging this in!
Thanks everyone

@SwayamInSync SwayamInSync merged commit 5f04fe1 into numpy:main Dec 22, 2025
11 checks passed
@SwayamInSync SwayamInSync deleted the strdtype branch December 22, 2025 18:22
Development

Successfully merging this pull request may close these issues.

Array Casting Support for fixed-length String dtypes

3 participants