tiny-count: new scope of hierarchy value#229

Merged
taimontgomery merged 9 commits into master from issue-223
Sep 20, 2022

@AlexTate
Member

Rather than eliminating candidates in Stage 2 by hierarchy value, we now use the value to sort Stage 2 matches so that Stage 3 evaluates them in ascending order of hierarchy value. These changes have resulted in a 4% average decrease in unassigned reads in our internal testing dataset.

This PR also contains a new approach for the Strand selector. Strand relationships are now evaluated as chained XOR operations rather than the prior approach of combined string and boolean equality checks.

Unit tests and documentation have been updated.
Closes #223

…age 3 selection by sorting Stage 2 matches by their hierarchy value. Once a feature-rule pair passes Stage 3 selection, it sets min_rank, the value against which all subsequent matches are compared. The Stage 3 loop exits as soon as a feature-rule pair is found to differ from min_rank.
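For readers following along, here is a minimal sketch of the loop described above. It is not the code from this PR: the `(hierarchy, feature, rule)` tuple layout, the `select_stage3` name, and the `passes_stage3` callback are all hypothetical, chosen only to illustrate the sort-then-early-exit idea.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical layout for a Stage 2 match: (hierarchy value, feature ID, rule name)
Match = Tuple[int, str, str]


def select_stage3(
    matches: Iterable[Match],
    passes_stage3: Callable[[str, str], bool],
) -> List[Tuple[str, str]]:
    """Return the feature-rule pairs that pass Stage 3 at the lowest passing rank."""
    selected: List[Tuple[str, str]] = []
    min_rank: Optional[int] = None

    # Sort ascending so the highest-priority (lowest hierarchy value) pairs come first
    for rank, feature, rule in sorted(matches, key=lambda m: m[0]):
        # Once a pair has passed, stop at the first pair with a different rank
        if min_rank is not None and rank != min_rank:
            break
        if passes_stage3(feature, rule):
            min_rank = rank
            selected.append((feature, rule))

    return selected


# Usage: featA has the best rank but fails Stage 3, so rank 2 is evaluated next
matches = [(2, "featB", "rule2"), (1, "featA", "rule1"),
           (2, "featC", "rule3"), (3, "featD", "rule4")]
print(select_stage3(matches, lambda feature, rule: feature != "featA"))
# [('featB', 'rule2'), ('featC', 'rule3')]
```

The point of the sketch is that a failed rank-1 candidate no longer leaves the read unassigned; lower-priority candidates still get a chance, while candidates ranked below the first passing rank are never evaluated.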

Strand representation in tiny-count has been converted to boolean values: True for '+' and False for '-'. This can later be extended to include None for '.', but this currently isn't needed.

The strand selector has been similarly updated. Strand matches are now determined by a chain of boolean XOR operations, which is substantially more efficient.
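As an illustration only, a chained-XOR strand check could look like the sketch below. The function names and the `want_sense` flag are assumptions made for this example and are not taken from the tiny-count source; only the True for '+' / False for '-' encoding comes from the message above.

```python
def strand_to_bool(strand: str) -> bool:
    """Encode strand as described above: True for '+', False for '-'."""
    if strand == '+':
        return True
    if strand == '-':
        return False
    # '.' (unstranded) is not handled here, mirroring the note above
    raise ValueError(f"Unsupported strand: {strand!r}")


def strand_matches(feat_strand: bool, aln_strand: bool, want_sense: bool) -> bool:
    """Hypothetical selector: does the alignment satisfy the requested relationship?

    feat_strand ^ aln_strand is True when the strands differ (antisense).
    XOR-ing that with want_sense flips the result, so a sense rule matches
    when the strands agree and an antisense rule matches when they differ.
    """
    return (feat_strand ^ aln_strand) ^ want_sense


# Usage: a '+' feature and a '-' alignment satisfy an antisense rule only
feat, aln = strand_to_bool('+'), strand_to_bool('-')
print(strand_matches(feat, aln, want_sense=True))   # False
print(strand_matches(feat, aln, want_sense=False))  # True
```

Compared with string equality checks, this evaluates to a couple of bitwise operations on booleans with no branching on the rule type.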
# Conflicts:
#	tests/unit_tests_hts_parsing.py
#	tiny/rna/counter/hts_parsing.py
#	tiny/rna/counter/statistics.py
@taimontgomery
Collaborator

Passed counts tests with ram1 dataset.

@taimontgomery taimontgomery merged commit d039e6a into master Sep 20, 2022

Development

Successfully merging this pull request may close these issues.

tiny-count: change selection semantics to broaden scope of hierarchy values

2 participants