As already mentioned in multiple issues and over email/Slack, we need automated tests that can track performance regressions. This issue is meant to define the scope of such continuous benchmarking (CB).
A related, useful project is planned in conbench. Once it is working, I think we should use it. Unfortunately, that does not seem likely to happen anytime soon, or even in the more distant future.
Anyway, keeping our scope minimal should make it easier to eventually move to conbench later on.
Other related work: my old macrobenchmarking project, and the recent draft PR #4517.
## Scope

Dimensions by which we will track timings:
- environment (allows looking up the hardware configuration)
- R version
- git sha of data.table (allows looking up date and version)
- benchmark script (probably fixed to `benchmark.Rraw`)
- query
- version of a query (in case we modify an existing query for some reason)
- description
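As a sketch, one timing record could carry the dimensions above as columns (all column names and values here are illustrative, not a decided schema):

```r
library(data.table)

# One hypothetical timing record; columns mirror the dimensions listed above
timings = data.table(
  environment   = "bench-machine-01",       # key to look up hardware configuration
  r_version     = as.character(getRversion()),
  git_sha       = "0123abc",                # data.table commit; date/version derivable
  script        = "benchmark.Rraw",
  query         = "DT[, uniqueN(a), by=b]",
  query_version = 1L,                       # bumped if the query is ever modified
  description   = "grouped uniqueN",
  elapsed_s     = NA_real_                  # the measured timing
)
```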
Dimensions that I propose not to include in scope for now:

- `datatable.optimize` option
## Challenges
### Store timings
In the current infrastructure we do not have any process that appends artifacts (timings, in the context of CB). Each CB run has to store its results somewhere so they can be re-used later on.
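A minimal way to append each run's results to a shared artifact could look like this (the file name, location, and helper are assumptions, not a decided design):

```r
library(data.table)

# Append this run's timings to a CSV artifact kept between CB runs;
# the header row is written only when the file is created for the first time
store_timings = function(timings, path = "timings.csv") {
  fwrite(timings, path, append = file.exists(path))
}
```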
### Signalling a regression
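One possible approach (the threshold, helper name, and comparison rule are assumptions for illustration only) is to compare a new timing against the history stored for the same environment, query, and query version:

```r
# Flag a regression when the new timing exceeds the historical median
# by more than `tolerance` (e.g. 1.5 means 50% slower); names are illustrative
is_regression = function(new_elapsed, past_elapsed, tolerance = 1.5) {
  length(past_elapsed) > 0L && new_elapsed > tolerance * median(past_elapsed)
}
```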
### Environment
To reduce the number of false regression signals we need to use private, dedicated infrastructure.
Having a dedicated machine may not be feasible, so we need a mechanism for signalling to Jenkins (or another orchestration process) that a particular machine is in use in exclusive mode.
### Pipeline
In the most likely case of not having a dedicated machine, CB may end up being queued for a long while (up to multiple days). Therefore it makes sense to have it in a separate pipeline rather than in our data.table GLCI. Such a CB pipeline could be scheduled to run daily or weekly instead of running on each commit.
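For illustration, such a separate pipeline could be driven by a GitLab scheduled pipeline, with the benchmark job gated so it only runs on schedule (the job name and script path are hypothetical):

```yaml
# .gitlab-ci.yml fragment for a CB pipeline triggered by a GitLab schedule
benchmark:
  script: ./run-benchmarks.sh   # hypothetical entry point
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"
```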
### Versioning

Should CB live in the data.table project, or in a separate project? If inside data.table, it could consist of `inst/tests/benchmark.Rraw`, a `benchmark()` function meant to be used like `test()`, a `benchmark.data.table()` to be used like `test.data.table()`, and `ci/`.

## Example test cases

- `[[` on a list column by group: #4646 (`[[` by group takes forever (24 hours +) with v1.13.0 vs 4 seconds with v1.12.8)
- `DT[10L]`, `DT[, 3L]`: #3735 (selecting from data.table by row is very slow)
- `.SD` for many columns: #3797 (add timing test for many `.SD` cols)
- `setDT` in a loop: #4476 (`setDT` could be much simpler)
- `DT[, uniqueN(a), by=b]`, should stress the new throttle feature: #4484 (throttle threads for iterated small data tasks)
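To illustrate, a case from the list above might be written with a hypothetical `benchmark()` helper, loosely mirroring how `test()` is used in `tests.Rraw` (the signature and numbering scheme are assumptions only):

```r
library(data.table)

# Hypothetical benchmark() in the spirit of test(): run the expression once,
# record its elapsed time together with an id and a description
benchmark = function(num, expr, description = "") {
  elapsed = system.time(eval(expr, envir = parent.frame()))[["elapsed"]]
  data.table(num = num, description = description, elapsed_s = elapsed)
}

DT = data.table(a = sample(10L, 1e6L, TRUE), b = sample(100L, 1e6L, TRUE))
benchmark(1.1, quote(DT[, uniqueN(a), by = b]),
          "grouped uniqueN; stresses thread throttle (#4484)")
```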