diff --git a/vignettes/datatable-importing.Rmd b/vignettes/datatable-importing.Rmd index c1236d1239..16a3cb39d2 100644 --- a/vignettes/datatable-importing.Rmd +++ b/vignettes/datatable-importing.Rmd @@ -122,9 +122,15 @@ The case for `:=` is slightly different, because `:=` is interpreted as a functi If you don't mind having `id` and `grp` registered as variables globally in your package namespace you can use `?globalVariables`. Be aware that these notes do not have any impact on the code or its functionality; if you are not going to publish your package, you may simply choose to ignore them. +## Care needed when providing and using options + +Common practice by R packages is to provide customization options set by `options(name=val)` and fetched using `getOption("name", default)`. Function arguments often specify a call to `getOption()` so that the user knows (from `?fun` or `args(fun)`) the name of the option controlling the default for that parameter; e.g. `fun(..., verbose=getOption("datatable.verbose", FALSE))`. All `data.table` options start with `datatable.` so as to not conflict with options in other packages. A user simply calls `options(datatable.verbose=TRUE)` to turn on verbosity. This affects all calls to `fun()` other the ones which have been provided `verbose=` explicity; e.g. `fun(..., verbose=FALSE)`. + +The option mechanism in R is _global_. Meaning that if a user sets a `data.table` option for their own use, that setting also affects code inside any package that is using `data.table` too. For an option like `datatable.verbose`, this is exactly the desired behavior since the desire is to trace and log all `data.table` operations from wherever they originate; turning on verbosity does not affect the results. Another unique-to-R and excellent-for-production option is R's `options(warn=2)` which turns all warnings into errors. Again, the desire is to affect any warning in any package so as to not missing any warnings in production. There are 6 `datatable.print.*` options and 3 optimization options which do not affect the result of operations, either. However, there is one `data.table` option that does and is now a concern: `datatable.nomatch`. This option changes the default join from outer to inner. [Aside, the default join is outer because outer is safer; it doesn't drop missing data silently.] Some users prefer inner join to be the default and we provided this option for them. However, a user setting this option can unintentionally change the behavior of joins inside packages that use `data.table`. Accordingly, in v1.12.4, we have started the process to deprecate the `datatable.nomatch` option. It is the only `data.table` option with this concern. + ## Troubleshooting -If you face any problems in this process, before trying to ask questions or reporting issues, please confirm that the problem is reproducible in a clean R session using the R console: `R CMD check package.name`. +If you face any problems in creating a package that uses data.table, please confirm that the problem is reproducible in a clean R session using the R console: `R CMD check package.name`. Some of the most common issues developers are facing are usually related to helper tools that are meant to automate some package development tasks, for example, using `roxygen` to generate your `NAMESPACE` file from metadata in the R code files. Others are related to helpers that build and check the package. Unfortunately, these helpers sometimes have unintended/hidden side effects which can obscure the source of your troubles. As such, be sure to double check using R console (run R on the command line) and ensure the import is defined in the `DESCRIPTION` and `NAMESPACE` files following the [instructions](#DESCRIPTION) [above](#NAMESPACE).