partbyenum on int column #697
Unit tests: existing tests, new tests
Int partition limits
The initial plan was to allow all partition values right up to the max long integer value. The test below is the reason for clamping at the max int value instead.
// Function to splay a table with one integer value in an int partition with the same value
q)makepar:{(` sv .Q.par[`:.;x;`testtab],`) set ([]expint:enlist x)}
// Make and load an int partitioned database from long integer values
// Values are in and around the min and max non-negative short, int, and long values
q)system "mkdir longdb"
q)system "cd longdb"
q)string makepar each 0 1 32766 32767 32768 2147483646 2147483647 2147483648 9223372036854775805 9223372036854775806 9223372036854775807
":./0/testtab/"
":./1/testtab/"
":./32766/testtab/"
":./32767/testtab/"
":./32768/testtab/"
":./2147483646/testtab/"
":./2147483647/testtab/"
":./2147483648/testtab/"
":./9223372036854775805/testtab/"
":./9223372036854775806/testtab/"
":./0W/testtab/"
q)\l .
q)meta testtab
c | t f a
------| -----
int | j
expint| j
// int and expected int agree
q)testtab
int expint
---------------------------------------
0 0
1 1
32766 32766
32767 32767
32768 32768
2147483646 2147483646
2147483647 2147483647
2147483648 2147483648
9223372036854775805 9223372036854775805
9223372036854775806 9223372036854775806
// int and expected int still agree
q)select from testtab
int expint
---------------------------------------
0 0
1 1
32766 32766
32767 32767
32768 32768
2147483646 2147483646
2147483647 2147483647
2147483648 2147483648
9223372036854775805 9223372036854775805
9223372036854775806 9223372036854775806
0W 0W
// int and expected int DO NOT agree
q)select int, expint from testtab
int expint
-------------------------------
0 0
1 1
32766 32766
32767 32767
32768 32768
2147483646 2147483646
2147483647 2147483647
-2147483648 2147483648
-2147483648 9223372036854775805
-2147483648 9223372036854775806
-2147483648 0W
// Order of columns in query doesn't matter
q)select expint, int from testtab
expint int
-------------------------------
0 0
1 1
32766 32766
32767 32767
32768 32768
2147483646 2147483646
2147483647 2147483647
2147483648 -2147483648
9223372036854775805 -2147483648
9223372036854775806 -2147483648
0W -2147483648
// Not just a print error - the values are not the same
q)update match:int~'expint from select int, expint from testtab
int expint match
-------------------------------------
0 0 1
1 1 1
32766 32766 1
32767 32767 1
32768 32768 1
2147483646 2147483646 1
2147483647 2147483647 1
-2147483648 2147483648 0
-2147483648 9223372036854775805 0
-2147483648 9223372036854775806 0
-2147483648 0W 0
// Selecting just int is fine
q)select int from testtab
int
-------------------
0
1
32766
32767
32768
2147483646
2147483647
2147483648
9223372036854775805
9223372036854775806
0W
// Fine
q)select count expint by int from testtab
int | expint
-------------------| ------
0 | 1
1 | 1
32766 | 1
32767 | 1
32768 | 1
2147483646 | 1
2147483647 | 1
2147483648 | 1
9223372036854775805| 1
9223372036854775806| 1
0W | 1
// Also fine
q)select first expint by int from testtab
int | expint
-------------------| -------------------
0 | 0
1 | 1
32766 | 32766
32767 | 32767
32768 | 32768
2147483646 | 2147483646
2147483647 | 2147483647
2147483648 | 2147483648
9223372036854775805| 9223372036854775805
9223372036854775806| 9223372036854775806
0W | 0W
// Not fine
q)select first int by expint from testtab
expint | int
-------------------| -----------
0 | 0
1 | 1
32766 | 32766
32767 | 32767
32768 | 32768
2147483646 | 2147483646
2147483647 | 2147483647
2147483648 | -2147483648
9223372036854775805| -2147483648
9223372036854775806| -2147483648
0W | -2147483648
// int variable looks fine
q)int
0 1 32766 32767 32768 2147483646 2147483647 2147483648 9223372036854775805 9223372036854775806 0W
// Try similar experiment with int values
q)system "mkdir ../intdb"
q)system "cd ../intdb"
q)string {(` sv .Q.par[`:.;x;`testtab],`) set ([]expint:enlist x)} each 0 1 32766 32767 32768 2147483645 2147483646 2147483647i
":./0/testtab/"
":./1/testtab/"
":./32766/testtab/"
":./32767/testtab/"
":./32768/testtab/"
":./2147483645/testtab/"
":./2147483646/testtab/"
":./0W/testtab/"
q)\l .
q)meta testtab
c | t f a
------| -----
int | j
expint| i
// Problem still persists for max partition value
q)select from testtab
int expint
---------------------
0 0
1 1
32766 32766
32767 32767
32768 32768
2147483645 2147483645
2147483646 2147483646
0W 0W
q)select int, expint from testtab
int expint
----------------------
0 0
1 1
32766 32766
32767 32767
32768 32768
2147483645 2147483645
2147483646 2147483646
-2147483648 0W
// all int and expint agree if the max partition is removed
q)system "rm -r 0W"
q)\l .
q)select int, expint from testtab
int expint
---------------------
0 0
1 1
32766 32766
32767 32767
32768 32768
2147483645 2147483645
2147483646 2147483646
// Repeat the experiment with shorts
q)system "mkdir ../shortdb"
q)system "cd ../shortdb"
q)string {(` sv .Q.par[`:.;x;`testtab],`) set ([]expint:enlist x)} each 0 1 32765 32766 32767h
":./0/testtab/"
":./1/testtab/"
":./32765/testtab/"
":./32766/testtab/"
":./0W/testtab/"
q)\l .
q)meta testtab
c | t f a
------| -----
int | j
expint| h
q)select int, expint from testtab
int expint
------------------
0 0
1 1
32765 32765
32766 32766
-2147483648 0W
q)system "rm -r 0W"
q)\l .
q)select int, expint from testtab
int expint
------------
0 0
1 1
32765 32765
32766 32766
// Does deleting the 0W partition solve the issue in the original database?
q)system "cd ../longdb"
q)system "rm -r 0W"
q)\l .
// No
q)select int, expint from testtab
int expint
-------------------------------
0 0
1 1
32766 32766
32767 32767
32768 32768
2147483646 2147483646
2147483647 2147483647
-2147483648 2147483648
-2147483648 9223372036854775805
-2147483648 9223372036854775806
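As a rough way to spot which partition values would run into the behaviour shown above, the following q sketch flags values larger than the max int. The list `parts` is a made-up stand-in for the `int` variable that a loaded int-partitioned database defines, and `toobig` is a name chosen here for illustration; neither comes from the PR.

```q
// Hypothetical stand-in for the int variable of a loaded int-partitioned database
parts:0 1 32767 2147483647 2147483648 9223372036854775806 0W
// Keep only the partition values above the max int value, 2147483647
toobig:parts where parts>2147483647
toobig
// 2147483648 9223372036854775806 0W
```

Against a real database, `int where int>2147483647` after `\l .` should give the same kind of answer, assuming the database is int-partitioned.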
jonathonmcmurray
left a comment
LGTM
Implements this request: partbyenum for ints
partbyenum mode to partition by "raw" column values if the single p# column in sort.csv has type short, int, or long. Values are clamped between 0 and 2147483647 (i.e. 0Wi). Based on testing, a table with partition values above 0Wi can show some funny behaviour in its int column.
maptoint function to handle symbol and integer encoding and updated the IDB maptoint function with equivalent logic. Ideally the IDB would just get the same function definition from the WDB, but there are minor differences (WDB version uses on-disk sym file and must explicitly cast enumerated symbols to long). Testing shows no major performance change between old and new IDB maptoint functions.
partbyenum can also support an integer column.
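The clamping described above can be sketched in q as follows. This is a minimal illustration, not the PR's maptoint function: `clampint` is a hypothetical name, and the sketch assumes the input is already a list of longs.

```q
// Hypothetical helper: clamp long partition values into the range 0..2147483647
// & is min and | is max; q evaluates right to left, so the upper bound is applied first
clampint:{0|2147483647&x}

clampint -5 0 32767 2147483647 2147483648 9223372036854775806
// 0 0 32767 2147483647 2147483647 2147483647
```

Negative values map to partition 0 and anything above the max int maps to partition 2147483647, so every row lands in a partition whose int column reads back correctly in the tests above.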