Skip to content

Respect integer64= selection when out-of-sample type bumps happen in fread()#7032

Merged
aitap merged 9 commits intomasterfrom
i64remapfix
Jun 16, 2025
Merged

Respect integer64= selection when out-of-sample type bumps happen in fread()#7032
aitap merged 9 commits intomasterfrom
i64remapfix

Conversation

@badasahog
Copy link
Copy Markdown
Contributor

@badasahog badasahog commented May 28, 2025

Closes #2749

I also changed while(true) to for(;;) for consistency. Technically should've been its own pr, hope you don't mind.

@codecov
Copy link
Copy Markdown

codecov bot commented May 28, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.69%. Comparing base (2642b84) to head (ecda6f8).
Report is 6 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #7032   +/-   ##
=======================================
  Coverage   98.69%   98.69%           
=======================================
  Files          79       79           
  Lines       14676    14677    +1     
=======================================
+ Hits        14485    14486    +1     
  Misses        191      191           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@github-actions
Copy link
Copy Markdown

github-actions bot commented May 28, 2025

  • HEAD=i64remapfix stopped early for fread disk overhead improved in #6925
    Comparison Plot

Generated via commit ecda6f8

Download link for the artifact containing the test results: ↓ atime-results.zip

Task Duration
R setup and installing dependencies 2 minutes and 42 seconds
Installing different package versions 41 seconds
Running and plotting the test cases 2 minutes and 2 seconds

Copy link
Copy Markdown
Member

@MichaelChirico MichaelChirico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some test(s) in inst/tests/tests.Rraw, and be sure to write a NEWS entry.

Also please make note of the tiny tweak to the PR title / description -- mostly this is about respecting quirks of GitHub itself, which doesn't support #NNN issue linking in titles, and does have a nice integration where Closes #NNN will automatically close the issue when the PR is merged.

@oleksiyskononenko are you able to confirm if pydatatable is still maintained / whether this change to readInt64As affects that implementation?

@MichaelChirico MichaelChirico changed the title fixed #2749 Respect integer64= selection when out-of-sample type bumps happen in fread() May 28, 2025
@badasahog
Copy link
Copy Markdown
Contributor Author

I'm sorry, I don't know how to add tests, I'm not familiar with the testing system.

@badasahog
Copy link
Copy Markdown
Contributor Author

Also, it looks like there are 3 different news files? Which one should I use?

@MichaelChirico
Copy link
Copy Markdown
Member

NEWS.x are archival files; NEWS.md is the "current" one, to which new entries should be added.

The structure of the tests file is pretty simple -- just add new test() entries, appropriately numbered, at the end.

Here's the documentation of test():

https://rdatatable.gitlab.io/data.table/reference/test.html

You might search around for other tests of out-of-sample type bumps, and piggy-back off of those. It's acceptable to, e.g., find test(100.1) that's closely related to your new behavior and insert a new test 100.2 in the middle of the file -- the simplest approach, also perfectly acceptable, is to just append your new tests to the end.

@oleksiyskononenko
Copy link
Copy Markdown

@MichaelChirico I'm not involved in datatable development anymore, so can't confirm. However, it doesn't share any code with data.table for a long time, so your changes to fread() wouldn't affect anything on the python side.

@badasahog badasahog requested a review from MichaelChirico May 29, 2025 11:44
@MichaelChirico
Copy link
Copy Markdown
Member

Hi @badasahog I see a NEWS entry, but no tests yet. Let me know if you're stuck.

@badasahog
Copy link
Copy Markdown
Contributor Author

Sorry for the late response. I don't know how to make tests, I'm not proficient in R, and I don't know how the testing system works.

@MichaelChirico
Copy link
Copy Markdown
Member

Writing tests is a required part of OSS contribution. By not writing tests, you're burdening other contributors/the maintainers with doing so on your behalf.

I pointed above to the documentation to test(). The README for cc() might help more -- it provides some simple steps for iterating through tests locally:

https://github.com/Rdatatable/data.table/tree/master/.dev

You should also feel free to just use the GitHub CI suite -- make changes to tests, push to GitHub, check for GHA failures, iterate.

If you're stuck somewhere, we are happy to help, but simply opting out of writing tests won't work.

Comment thread src/fread.c Outdated
Comment thread inst/tests/tests.Rraw Outdated
Comment on lines +3073 to +3083
{
dt0 = data.table(a=seq(10000), b="100")
dt0[111, b := "1000000000000"]
f = tempfile()
fwrite(dt0, f)

test(1017.3, fread(f, integer64="numeric"), fread(f, colClasses=c("integer", "numeric")))
test(1017.4, fread(f, integer64="character"), fread(f, colClasses=c("integer", "character")))

unlink(f)
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good test, thank you! It fails for me on master and passes on the PR branch. One thing: since R doesn't have variable declarations, the { braces } don't create a new scope; any variables created inside them are fully visible in the outer environment. (The only thing they do is combine multiple expressions, returning the value of the last expression in the group.) The code might as well not create an indentation level at all.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you do want to enforce a new scope, you can use local({ ... }). Though we don't really do that in our tests (we should, but that's another issue)

@badasahog badasahog requested a review from aitap June 6, 2025 08:30
Copy link
Copy Markdown
Member

@aitap aitap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for fixing the bug!

@aitap aitap merged commit 05d6797 into master Jun 16, 2025
11 checks passed
@aitap aitap deleted the i64remapfix branch June 16, 2025 09:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

in fread integer64 argument is ignored if int64 type-bump occurs out-of-sample

4 participants