I haven't gone through yet but looks like a comprehensive guide on using
|
|
Haven't looked either yet, but on the png size: removing the -g compiler flag recently saved 1MB (the .so shrank from 1.5MB to 0.5MB), so the package size is now approx. 4MB of the 5MB limit. 12KB is not an issue (1.2% of the remaining 1MB). The |
…d warning about package depending on R 3.5+
Codecov Report
@@ Coverage Diff @@
## master #3572 +/- ##
=======================================
Coverage 97.58% 97.58%
=======================================
Files 66 66
Lines 12695 12695
=======================================
Hits 12389 12389
Misses 306 306
For future reference ... |
|
Why not use csv.gz instead of RData? Then there would be no risk that the vignette can only be built on newer R because of a format incompatibility.
|
|
||
| This vignette will explain the most common ways to use the `.SD` variable in your `data.table` analyses. It is an adaptation of [this answer](https://stackoverflow.com/a/47406952/3576984) given on StackOverflow. | ||
|
|
||
| # What is `.SD`? |
https://rdatatable.gitlab.io/data.table/library/data.table/doc/datatable-sd-usage.html
When rendered from R, this heading produces the browser tab name "What is <code>.SD</code>?"; it may be better to remove the code formatting and leave .SD as plain text.
| Pitching[ , coef(lm(ERA ~ ., data = .SD))['W'], .SDcols = c('W', rhs)] | ||
| }) | ||
| barplot(lm_coef, names.arg = sapply(models, paste, collapse = '/'), | ||
| main = 'Wins Coefficient\nWith Various Covariates', |
|
|
||
| ## Conditional Joins | ||
|
|
||
| `data.table` syntax is beautiful for its simplicity and robustness. The syntax `x[i]` flexibly handles two common approaches to subsetting -- when `i` is a `logical` vector, `x[i]` will return those rows of `x` corresponding to where `i` is `TRUE`; when `i` is _another `data.table`_, a (right) `join` is performed (in the plain form, using the `key`s of `x` and `i`, otherwise, when `on = ` is specified, using matches of those columns). |
There is also the case of DT["someid"], where `i` is a character vector.
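For reference, a minimal sketch of the forms of `i` discussed in this thread. The table `x`, its columns `id` and `v`, and the lookup table `i` are made up for illustration and do not come from the vignette:

```r
library(data.table)

# Hypothetical toy table; names are illustrative only
x <- data.table(id = c("a", "b", "c"), v = 1:3)

# logical i: keep the rows where the condition is TRUE
by_lgl <- x[v > 1L]

# data.table i: join, matching on the column named in `on`
i <- data.table(id = c("c", "a"))
by_join <- x[i, on = "id"]

# character i: values are looked up against the key of x
setkey(x, id)
by_key <- x["b"]
```

The join result follows the row order of `i` rather than of `x`, and the character form relies on `x` having a key set.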
|
|
||
| Note that this approach can of course be combined with `.SDcols` to return only portions of the `data.table` for each `.SD` (with the caveat that `.SDcols` should be fixed across the various subsets) | ||
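A small sketch of combining a per-group subset with `.SDcols`, assuming a toy table `DT` with a grouping column `g` and value columns `v` and `w` (all hypothetical names, not the vignette's data):

```r
library(data.table)

# Hypothetical toy table; names are illustrative only
DT <- data.table(g = c("a", "a", "b"), v = c(10, 20, 30), w = c(1, 2, 3))

# First row of each group, with .SD restricted to column v only
first_v <- DT[, .SD[1L], by = g, .SDcols = "v"]
```

The same `.SDcols` applies to every group, so each `.SD` here holds only `v`, and the result contains `g` and `v` but not `w`.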
|
|
||
| _NB_: `.SD[1L]` is currently optimized by [_`GForce`_](https://jangorecki.gitlab.io/data.table/library/data.table/html/datatable-optimize.html) ([see also](https://stackoverflow.com/questions/22137591/about-gforce-in-data-table-1-9-2)), `data.table` internals which massively speed up the most common grouped operations like `sum` or `mean` -- see `?GForce` for more details and keep an eye on/voice support for feature improvement requests for updates on this front: [1](https://github.com/Rdatatable/data.table/issues/735), [2](https://github.com/Rdatatable/data.table/issues/2778), [3](https://github.com/Rdatatable/data.table/issues/523), [4](https://github.com/Rdatatable/data.table/issues/971), [5](https://github.com/Rdatatable/data.table/issues/1197), [6](https://github.com/Rdatatable/data.table/issues/1414) |
use Rdatatable namespace instead of jangorecki: https://Rdatatable.gitlab.io/data.table/library/data.table/html/datatable-optimize.html
Closes #3412
Not sure if we should track the png on GH or not