TLDR
I want to avoid column names like Z-normalised temperature (from DB1 before 2018 & from DB2 afterwards), without relying on code comments and in a more native way (e.g. comments can't be saved in a DT.rds file on disk). Hence, this is a request to introduce a metadata function that retrieves metadata optionally supplied by the user.
Problem
Pre-modelling, a large amount of time is spent collating and joining data from various sources and transforming some columns to make them just right for the model. Thus, a column may have been transformed multiple times, with various edge cases dealt with on a case-by-case basis.
Current solution
Use descriptive column names or comments. This can easily get unwieldy with column names like Z-normalised temperature (from DB1 before 2018 & from DB2 afterwards). Comments, on the other hand, can't be shared as part of the data.table.
Proposed solution
Introduce a metadata function that returns user-supplied information optionally describing each column. This would probably also require a setter, either a setmetadata function or a metadata<- replacement function. I am not sure which.
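To make the request concrete, here is a rough sketch of how such a pair of functions might behave. It is not an existing data.table API: the helpers `setmetadata` and `metadata` are hypothetical, and storing the description in a column attribute named "metadata" is an assumption made only for illustration. The sketch relies on data.table's existing `setattr()` and on the fact that `saveRDS()` keeps object attributes.

```r
library(data.table)

# Hypothetical helpers sketching the proposal; neither exists in data.table.
# Assumption: the description lives in a column attribute named "metadata".
setmetadata <- function(DT, col, value) {
  # setattr() sets the attribute by reference, so DT is not copied
  setattr(DT[[col]], "metadata", value)
  invisible(DT)
}

metadata <- function(DT, col) {
  # Returns NULL when no description was supplied for the column
  attr(DT[[col]], "metadata", exact = TRUE)
}

DT <- data.table(temp_z = c(-1.2, 0.3, 0.9))
setmetadata(DT, "temp_z",
            "Z-normalised temperature (from DB1 before 2018 & from DB2 afterwards)")

# Round trip through an .rds file on disk
path <- tempfile(fileext = ".rds")
saveRDS(DT, path)
DT2 <- readRDS(path)
metadata(DT2, "temp_z")
```

With something like this, the column name can stay short (`temp_z`) while the long description travels with the data. Whether a plain attribute would survive every join, subset, or `:=` is not something this sketch guarantees, which is part of why native support would be useful.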