Skip to content

Improve testing of read datasets#61

Merged
vnmabus merged 4 commits intodevelopfrom
feature/tests_multiple
Apr 20, 2026
Merged

Improve testing of read datasets#61
vnmabus merged 4 commits intodevelopfrom
feature/tests_multiple

Conversation

@vnmabus
Copy link
Copy Markdown
Owner

@vnmabus vnmabus commented Apr 18, 2026

Describe the proposed changes

We now use pytest plus some extra magic to make our read tests more reproducible.
Now, the code to generate the datasets (in R) is defined in the docstrings, and the datasets can be (re-)generated passing the argument --generate-datasets to pytest.
The tests will automatically be executed with all possible combinations of the R save format, including:

  • Versions 2 or 3
  • RDS or RDA
  • XDR, ASCII and binary encodings

This should improve our coverage and ensure that we do not miss some features for some combination of these parameters.

NOTE: Apparently binary and RDA cannot be combined, so that combination is skipped.

Checklist before requesting a review

  • I have performed a self-review of my code
  • The code conforms to the style used in this package (checked with Ruff)
  • The code is fully documented and typed (type-checked with Mypy)
  • I have added thorough tests for the new/changed functionality

vnmabus added 3 commits April 4, 2026 11:42
Integers in R are always 32-bits, but the integer sequence altrep was
being unpacked as an array of 64-bit integers.

This has been detected when comparing tests for R format versions 2
(that has no altreps) and version 3 (with altreps).
We now use pytest plus some extra magic to make our read tests more
reproducible.
Now, the code to generate the datasets (in R) is defined in the
docstrings, and the datasets can be (re-)generated passing the argument
`--generate-datasets` to pytest.
The tests will automatically be executed with all possible combinations
of the R save format, including:
  - Versions 2 or 3
  - RDS or RDA
  - XDR, ASCII and binary encodings

This should improve our coverage and ensure that we do not miss some
features for some combination of these parameters.

NOTE: Apparently binary and RDA cannot be combined, so that combination
is skipped.
The generated tests have been added, so that an R installation is not
required for checking that everything works.

Removed the old versions of the datasets.

NOTE: Binary datasets are by their nature machine-dependent. We upload
the datasets for little-endian machines, as these are the most common
(and the ones used by Github pipelines). Users in a big-endian
environment would need to test with the `--generate-datasets` option
if they want to check the datasets generated by their machine: no
attempt will be made to generate also big-endian datasets as I do not
have access to a big-endian machine. I only kept the simple test for
checking that big-endian works.
@vnmabus
Copy link
Copy Markdown
Owner Author

vnmabus commented Apr 18, 2026

@traversc I improved the tests so that they are now tried with all possible format combinations. However, I do not think it is possible to save in R an object in the RDA format with native binary encoding. I noticed that you actually provided a test to check that combination. Is there a way to generate such a dataset that I do not know about?

Now that we have write tests too, the old name feels inappropriate.
@traversc
Copy link
Copy Markdown
Contributor

No normal way but it is officially part of the R serialization spec: https://github.com/wch/r-source/blob/c1a5096885cb8c55a1df78d660e0828f83bc68f9/doc/NEWS.3#L1546

Someone could write a custom function that writes it and R will properly load, so I feel that it should still be supported.

@vnmabus
Copy link
Copy Markdown
Owner Author

vnmabus commented Apr 20, 2026

Ok, then that part would only be tested with your handcrafted example, which I think is enough. If R adds support for writing binary RDA files in the future we can then stop skipping that combination.

@vnmabus vnmabus merged commit a88194b into develop Apr 20, 2026
22 checks passed
@vnmabus vnmabus deleted the feature/tests_multiple branch April 20, 2026 07:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants