start refactoring process by setting up base + init #14306
logan-keede wants to merge 11 commits into apache:main
cc @Rachelint
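It seems the tests will be executed twice. How about we keep just the one complete test file? Because we will only move test cases incrementally after this PR, it seems we can ensure no cases are lost this way:
- move cases from complete_aggregate.slt to function1.slt
- get the diff between the current and the moved complete_aggregate.slt
- compare the diff and function1.slt
It would also be great if we could make this an automatic process.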
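The move-then-diff idea in this comment could be automated roughly as follows. This is only a sketch: the helper name `removed_lines` and the inline sample contents are hypothetical, and it compares plain text lines instead of parsing sqllogictest records.

```python
import difflib
from collections import Counter


def removed_lines(before: str, after: str) -> list[str]:
    """Return non-blank lines that the diff marks as deleted from `before`."""
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    return [d[2:] for d in diff if d.startswith("- ") and d[2:].strip()]


# Hypothetical contents of complete_aggregate.slt before and after moving min().
before = "query I\nSELECT min(1)\n----\n1\n\nquery I\nSELECT sum(1)\n----\n1\n"
after = "query I\nSELECT sum(1)\n----\n1\n"
# Hypothetical contents of the new per-function file.
function1 = "query I\nSELECT min(1)\n----\n1\n"

# Compare as multisets so the result does not depend on how the diff aligns
# repeated lines such as "query I" or "----".
moved = Counter(removed_lines(before, after))
expected = Counter(line for line in function1.splitlines() if line.strip())
print("no cases lost" if moved == expected else "mismatch")
```

In practice the two versions of the large file would come from version control (for example, the file on the base branch and the file on the PR branch) instead of inline strings.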
How about we name it old_aggregate.slt or old_testcases.slt?
We could also add a README to explain the background, as was done for string.
(I can help, and this is not required for merging.)
> It seems the tests will be executed twice. How about we keep just the one complete test file?
> Because we will only move test cases incrementally after this PR, it seems we can ensure no cases are lost this way:
> - move cases from complete_aggregate.slt to function1.slt
> - get the diff between the current and the moved complete_aggregate.slt
> - compare the diff and function1.slt
> It would also be great if we could make this an automatic process.
If I understood you correctly, you are suggesting that we keep only the complete file.
My fear is that we might lose track of what we have already moved.
We might not be able to verify that the sum of all functions_*.slt files equals old_aggregate.slt, whereas with base_aggregate.slt we know that if something is present in it, it is not present in any of the function files.
Besides, I do not see tests running twice as a bad thing; it is like an extra layer of security at the cost of ~5 sec of CI time (even on 1 thread).
I could maintain this extra file on my local system, but that would bind this issue to me. It will be easier for anyone to contribute to splitting this file if we keep both.
I definitely agree with making a README file; I was considering it myself.
I agree that the most important thing is to ensure no tests are lost, and I think it is OK to execute the tests twice temporarily.
But it would be a bit strange if we always had to keep executing them twice later on.
If we choose to keep and run all existing cases, how about the following?
- Don't modify the old cases in aggregate.slt; only rename it to old_aggregate.slt.
- Ensure no new cases are added to old_aggregate.slt anymore, and guide contributors to add cases the new way (create a file for the function and add cases to it).
🤔 An alternative might be the following. It may actually make sense to keep a complete file to ensure no cases are lost.
- Keep complete_aggregate.slt, but make sure it is not executed.
- Perform extract and subtract on base_aggregate.slt. For example, we extract min/max from it and subtract them from base_aggregate.slt, which gives us min_max.slt and a smaller base_aggregate.slt.
- Implement a simple program/script to check that min_max.slt + base_aggregate.slt = complete_aggregate.slt.
It seems that not only aggregate and string but also some other test files are too large. Maybe the script can be reused when sorting those out?
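One way to approach the proposed check is sketched below in Python. It assumes that records in a .slt file are separated by blank lines and that exact text equality is good enough; `records` and `parts_equal_whole` are hypothetical names, and shared setup statements that have to appear in several split files would need extra handling.

```python
from collections import Counter


def records(text: str) -> Counter:
    """Count blank-line-separated records (assumed .slt record layout)."""
    blocks, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line.rstrip())
        elif current:
            blocks.append("\n".join(current))
            current = []
    if current:
        blocks.append("\n".join(current))
    return Counter(blocks)


def parts_equal_whole(parts: list[str], whole: str) -> bool:
    """True if the split files together hold exactly the records of `whole`."""
    total = Counter()
    for part in parts:
        total += records(part)
    return total == records(whole)


# Hypothetical file contents standing in for the real .slt files.
complete = "query I\nSELECT min(1)\n----\n1\n\nquery I\nSELECT sum(1)\n----\n1\n"
min_max = "query I\nSELECT min(1)\n----\n1\n"
base = "query I\nSELECT sum(1)\n----\n1\n"

print(parts_equal_whole([min_max, base], complete))
```

Because the comparison uses a multiset of records, it ignores record order but still notices a case that was dropped or copied twice. The same function accepts any number of split files, which is what would make it reusable for other oversized test files.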
> - Keep complete_aggregate.slt, but make sure it is not executed.
> - Perform extract and subtract on base_aggregate.slt.
This approach looks good to me.
> - Implement a simple program/script to check that min_max.slt + base_aggregate.slt = complete_aggregate.slt.
This looks fun to me, and I will work on it, though it might take some time. Can we merge that portion in a separate PR?
Yes, it would be nice to do it in follow-on PRs.
Should I open a new issue for this or just a PR?
> Should I open a new issue for this or just a PR?
I think either is OK.
Maybe open a sub-issue of #13723?
And state in it what we want to do in the later refactoring?
Rachelint left a comment
Thanks again, @logan-keede. It looks good to me as a start to the refactoring!
fix: typo Co-authored-by: kamille <3144148605@qq.com>
Oh, sorry... I thought of some possible situations:
- Contributors don't notice the README and add new tests to base_aggregate.slt.
- Reviewers don't notice the README either, and approve and merge the PR.
- Eventually, base_aggregate.slt diverges from the archived complete_aggregate.slt.
It may be painful to resolve such conflicts if this happens frequently.
Maybe we should include the check in this PR before merging, so that when new cases are added to base_aggregate.slt, we throw an error and block it in CI.
Sorry again...
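A CI guard for this situation might look roughly like the sketch below. It is an illustration only, not the check that was added to this PR: `split_records` and `unexpected_records` are hypothetical names, the sample contents are made up, and records are assumed to be separated by blank lines.

```python
def split_records(text: str) -> list[str]:
    """Split .slt text into records, assuming blank lines separate them."""
    normalized = "\n".join(line.rstrip() for line in text.splitlines())
    return [b.strip("\n") for b in normalized.split("\n\n") if b.strip()]


def unexpected_records(base: str, complete: str) -> list[str]:
    """Records in `base` that do not exist in the archived `complete` file."""
    known = set(split_records(complete))
    return [r for r in split_records(base) if r not in known]


# Hypothetical contents: someone appended an avg() case to the base file.
complete = "query I\nSELECT sum(1)\n----\n1\n\nquery I\nSELECT min(1)\n----\n1\n"
base = "query I\nSELECT sum(1)\n----\n1\n\nquery R\nSELECT avg(1)\n----\n1\n"

extra = unexpected_records(base, complete)
for record in extra:
    print("new case found in base file:\n" + record)

# A CI wrapper would exit with this code; non-zero blocks the merge.
exit_code = 1 if extra else 0
```

The guard only needs a subset test: the base file may shrink as cases move out, but it should never contain a record that the archived file lacks.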
…n-keede/datafusion into diff_for_sqllogictests
No problem, I have added the diff function.
@Rachelint I have added the test to CI.
Thanks @logan-keede, I will review it in the next few days.
@Rachelint this is just a reminder. Please disregard if this isn't needed.
Sorry, I am back and reviewing today.
Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days.
Which issue does this PR close?
aggregate.slt #13723
Rationale for this change
Refer to #14301
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?