tiny-config: SamplesSheet: validate group names to avoid namespace collisions in R #268

Merged
taimontgomery merged 3 commits into master from issue-266 on Dec 23, 2022

@AlexTate (Member) commented Dec 23, 2022

R has strict character requirements for "syntactically valid" names in a variety of contexts, including column names. Sample group names therefore must undergo a translation to a valid form before analysis in tiny-deseq.r. This translation creates an opportunity for different group names to end up with the same "safe name", which will lead to a crash. For example, the group names a-b and a+b will both translate to a.b.
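To make the collision concrete, here is a minimal Python sketch of the kind of translation described above. It is an approximation of the documented behavior of R's make.names() for ASCII input only, not the function used in this repository; the name `r_safe_name` and the `R_RESERVED` set are hypothetical and chosen for illustration.

```python
import re

# R reserved words (see ?Reserved in R). make.names() appends "." to these.
R_RESERVED = {
    "if", "else", "repeat", "while", "function", "for", "next", "break",
    "in", "TRUE", "FALSE", "NULL", "Inf", "NaN", "NA",
    "NA_integer_", "NA_real_", "NA_character_", "NA_complex_",
}


def r_safe_name(name: str) -> str:
    """Hypothetical ASCII-only approximation of R's make.names()."""
    # A valid name starts with a letter, or with a dot that is not
    # followed by a digit. Otherwise "X" is prepended.
    if not re.match(r"[A-Za-z]|\.(?![0-9])", name):
        name = "X" + name
    # Every character other than letters, digits, "." and "_" becomes ".".
    safe = re.sub(r"[^A-Za-z0-9._]", ".", name)
    # Reserved words get a trailing dot.
    if safe in R_RESERVED:
        safe += "."
    return safe


# Two distinct group names end up with the same "safe name".
print(r_safe_name("a-b"), r_safe_name("a+b"))
```

Under these assumptions both `a-b` and `a+b` map to `a.b`, which is the collision this PR guards against.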

This PR adds a validation step to the SamplesSheet class that catches these namespace collisions at pipeline startup. It provides a helpful error message that lists all collisions and groups them by shared "safe name".
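A rough sketch of what such a validation step could look like follows. It is not the code from this PR: `simple_safe_name`, `find_collisions`, and `validate_group_names` are hypothetical names, and the translation used here is a simplified stand-in that only replaces invalid characters with a dot.

```python
import re
from collections import defaultdict


def simple_safe_name(name: str) -> str:
    # Simplified stand-in for the real translation: characters other than
    # letters, digits, "." and "_" are replaced with ".".
    return re.sub(r"[^A-Za-z0-9._]", ".", name)


def find_collisions(group_names, translate=simple_safe_name):
    """Map each shared safe name to the distinct group names that produce it."""
    by_safe = defaultdict(list)
    # The same group appears once per replicate, so de-duplicate first
    # (dict.fromkeys keeps the original order).
    for name in dict.fromkeys(group_names):
        by_safe[translate(name)].append(name)
    return {safe: names for safe, names in by_safe.items() if len(names) > 1}


def validate_group_names(group_names, translate=simple_safe_name):
    """Raise ValueError listing all collisions, grouped by safe name."""
    collisions = find_collisions(group_names, translate)
    if collisions:
        lines = ["Group names collide after translation to R-safe names:"]
        for safe, names in collisions.items():
            lines.append(f"  {safe}: {', '.join(names)}")
        raise ValueError("\n".join(lines))


try:
    validate_group_names(["a-b", "a-b", "a+b", "control"])
except ValueError as err:
    print(err)
```

Collecting every collision before raising lets the user fix all offending group names in one pass rather than discovering them one crash at a time.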

Closes #266

…pended dot). This brings the function up to spec to match R's make.names()
… translation to ensure that no two groups share the same name after translation
@taimontgomery (Collaborator) commented

Tested successfully on ram1 dataset.

@taimontgomery taimontgomery merged commit 68d598e into master Dec 23, 2022

Development

Successfully merging this pull request may close these issues.

tiny-deseq.r: bugfix: "syntactically invalid" control condition names are mishandled