@remyoudompheng
Contributor

@remyoudompheng remyoudompheng commented Sep 7, 2025

This is a proposal to handle issue #159 for Python versions below 3.14 (for example, Ubuntu 24.04 LTS uses Python 3.12).
The idea is to use standard int methods (from_bytes, to_bytes) to obtain a binary serialization, then cast the byte array to ulong.

Special care is needed for big-endian platforms; however, I am unable to test whether the proposed patch is correct on them. Let me know what the preferred approach would be.
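
For readers following along, here is a minimal pure-Python sketch of the idea described above: serialize the magnitude with `int.to_bytes`, then view the bytes as 64-bit words. It is not the code from this PR. The helper names `int_to_limbs` and `limbs_to_int` are hypothetical, the 64-bit limb width is an assumption, and the explicit `"<"` byte order in `struct` stands in for the native cast that raises the big-endian question.

```python
import struct

LIMB_BYTES = 8  # assumption: ulong limbs are 64 bits wide


def int_to_limbs(n):
    """Split a Python int into (sign, limbs), least significant limb first."""
    sign = -1 if n < 0 else 1
    mag = abs(n)
    # At least one limb, so that zero is represented as [0].
    nlimbs = max(1, (mag.bit_length() + 63) // 64)
    raw = mag.to_bytes(nlimbs * LIMB_BYTES, "little")
    # "<Q" decodes each 8-byte group as a little-endian unsigned 64-bit word,
    # independent of the host byte order.
    return sign, list(struct.unpack(f"<{nlimbs}Q", raw))


def limbs_to_int(sign, limbs):
    """Inverse of int_to_limbs: rebuild the Python int from sign and limbs."""
    raw = struct.pack(f"<{len(limbs)}Q", *limbs)
    return sign * int.from_bytes(raw, "little")


if __name__ == "__main__":
    for n in (0, 1, -1, 3**100, -(3**10000)):
        assert limbs_to_int(*int_to_limbs(n)) == n
```

The sign is carried separately from the magnitude here, which is one simple way to handle negative numbers; whether the PR does the same is not stated in this thread.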

Benchmarks were run using %timeit in IPython with Python 3.13.
AFAIK gmpy2 uses private Python stuff already so it might be difficult to do better.

| Number | gmpy2 int(mpz) | gmpy2 mpz(int) | int(fmpz) 0.8.0 | fmpz(int) 0.8.0 | int(fmpz) PR | fmpz(int) PR |
|---|---|---|---|---|---|---|
| 3^100 | 45.7 ns | 53.6 ns | 186 ns | 234 ns | 138 ns | 166 ns |
| 3^10000 | 1.49 µs | 1.23 µs | 6.63 µs | 6.52 µs | 1.15 µs | 1.06 µs |
| -3^10000 | 1.50 µs | 1.23 µs | 6.63 µs | 6.60 µs | 1.24 µs | 1.23 µs |

(edited for changes in commit f349183 for negative numbers)

@remyoudompheng
Contributor Author

The function ulong_from_little_endian is endian-agnostic and was tested by forcing is_big_endian=1 on a little-endian system.
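
As an illustration of what "endian-agnostic" can mean here, the sketch below assembles a word from little-endian bytes using only shifts and ORs, so the result does not depend on the host byte order. This is an assumption about the general technique, not the PR's implementation; `word_from_le_bytes` is a hypothetical name.

```python
import struct


def word_from_le_bytes(buf, offset=0, width=8):
    """Assemble an unsigned word from `width` little-endian bytes.

    Only arithmetic on individual bytes is used, so the result is the same
    on little-endian and big-endian hosts.
    """
    value = 0
    for i in range(width):
        value |= buf[offset + i] << (8 * i)
    return value


if __name__ == "__main__":
    data = bytes(range(1, 17))  # bytes 0x01 .. 0x10
    first = word_from_le_bytes(data, 0)
    # Cross-check against two standard-library decoders with explicit order.
    assert first == int.from_bytes(data[:8], "little")
    assert first == struct.unpack_from("<Q", data, 0)[0]
    print(hex(first))
```

By contrast, reinterpreting the buffer in native order (for example `struct.unpack("=Q", ...)`) gives a platform-dependent value, which is the case that needs special care on big-endian systems.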

@remyoudompheng
Contributor Author

remyoudompheng commented Sep 7, 2025

Since this PR seems to give performance on par with gmpy2, it is unclear whether PEP 757 will even be necessary. I have not been able to build #64 for comparison.

Also, it should work with older Python versions.

@oscarbenjamin
Collaborator

The PEP 757 interface is provided as CPython's best effort to give something efficient for this using public API in Python 3.14 onwards. I think that should be the baseline before trying any other approach.

@skirpichev

Interesting, conversion to int seems asymptotically faster with this approach (cf. using mpz_export). Perhaps it could be a little better with the new PyLong_*Bytes*() C API. My benchmarks are below. I would appreciate it if someone could repeat this on a less noisy system.

> AFAIK gmpy2 uses private Python stuff already so it might be difficult to do better.

No, gmpy2 uses PEP 757 API (using pythoncapi-compat for <3.14).

Import (int -> mpz)

| Benchmark | gmpy2.mpz | flint.fmpz | PR324 |
|---|---|---|---|
| 1<<7 | 321 ns | 475 ns: 1.48x slower | 443 ns: 1.38x slower |
| 1<<38 | 340 ns | 471 ns: 1.39x slower | 461 ns: 1.36x slower |
| 1<<300 | 611 ns | 3.69 us: 6.04x slower | 2.09 us: 3.41x slower |
| 1<<3000 | 2.58 us | 10.3 us: 4.01x slower | 3.21 us: 1.25x slower |
| 1<<10000 | 7.61 us | 30.1 us: 3.95x slower | not significant |
| Geometric mean | (ref) | 2.87x slower | 1.51x slower |

Export (mpz -> int)

| Benchmark | gmpy2.mpz | flint.fmpz | PR324 |
|---|---|---|---|
| 1<<7 | 158 ns | 177 ns: 1.12x slower | 173 ns: 1.09x slower |
| 1<<38 | 206 ns | 231 ns: 1.12x slower | 226 ns: 1.09x slower |
| 1<<300 | 617 ns | 2.52 us: 4.08x slower | 932 ns: 1.51x slower |
| 1<<3000 | 2.54 us | 10.5 us: 4.14x slower | 2.33 us: 1.09x faster |
| 1<<10000 | 7.61 us | 31.3 us: 4.12x slower | 6.26 us: 1.22x faster |
| Geometric mean | (ref) | 2.44x slower | 1.06x slower |
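
To make the "Geometric mean" rows easier to check, here is a small sketch that recomputes them for the flint.fmpz column from the per-case ratios listed above. The inputs are the rounded ratios from the tables, so the result can differ from the reported figure in the last digit; the variable names are illustrative only.

```python
from statistics import geometric_mean

# Slowdown ratios of flint.fmpz relative to gmpy2.mpz, copied from the tables.
import_slowdowns = [1.48, 1.39, 6.04, 4.01, 3.95]  # int -> mpz
export_slowdowns = [1.12, 1.12, 4.08, 4.14, 4.12]  # mpz -> int

# The geometric mean is the n-th root of the product of the n ratios,
# which is the usual way to average multiplicative slowdowns.
import_gm = geometric_mean(import_slowdowns)
export_gm = geometric_mean(export_slowdowns)

if __name__ == "__main__":
    print(f"import: {import_gm:.3f}x slower")
    print(f"export: {export_gm:.3f}x slower")
```

Both values land within rounding of the 2.87x and 2.44x reported in the tables.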

Could someone trigger the CI tests on this PR (the build logs have expired)?

benchmark scripts
# bench-import.py

import os

import pyperf

_T = os.getenv('_T')
if _T == "gmpy2.mpz":
    from gmpy2 import mpz
elif _T == "gmp.mpz":
    from gmp import mpz
else:
    from flint import fmpz as mpz

cases = ['1<<7', '1<<38', '1<<300', '1<<3000', '1<<10000']
runner = pyperf.Runner()
for c in cases:
    i = eval(c)
    runner.bench_func(c, mpz, i)

# bench-export.py

import os

import pyperf

_T = os.getenv('_T')
if _T == "gmpy2.mpz":
    from gmpy2 import mpz
elif _T == "gmp.mpz":
    from gmp import mpz
else:
    from flint import fmpz as mpz

cases = ['1<<7', '1<<38', '1<<300', '1<<3000', '1<<10000']
runner = pyperf.Runner()
for c in cases:
    i = eval(c)
    m = mpz(i)
    runner.bench_func(c, int, m)

@oscarbenjamin
Collaborator

> gmpy2 uses PEP 757 API (using pythoncapi-compat for <3.14).

I think that python-flint should do the same.
