tiny-count: new scope of hierarchy value #229
Merged
taimontgomery merged 9 commits into master on Sep 20, 2022
…age 3 selection by sorting Stage 2 matches by their hierarchy value. Once a feature-rule pair passes Stage 3 selection, it sets min_rank, against which all subsequent matches are compared. The Stage 3 loop exits as soon as a feature-rule pair is found to differ from min_rank. Strand representation in tiny-count has been converted to boolean values: True for '+' and False for '-'. This can later be extended to include None for '.', but that isn't currently needed. The strand selector has been similarly updated. Strand matches are now determined by a chain of boolean XOR operations, which is substantially more efficient.
# Conflicts:
# tests/unit_tests_hts_parsing.py
# tiny/rna/counter/hts_parsing.py
# tiny/rna/counter/statistics.py
… been disambiguated to "larger" and "smaller"
Passed counts tests with ram1 dataset.
Rather than eliminating candidates in Stage 2 by hierarchy value, we now use the value to sort Stage 2 matches so that they are evaluated in ascending order of hierarchy value in Stage 3. These changes have resulted in a 4% average decrease in unassigned reads in our internal testing dataset.
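For reviewers who want a picture of the new Stage 3 flow, here is a minimal sketch. It is not the code in this PR: `Match`, `stage3_select`, and the `passes_stage3` callback are hypothetical names used only for illustration, and the sketch assumes that a lower hierarchy value means higher priority and that `min_rank` holds the hierarchy value of the first pair to pass.

```python
from typing import Callable, Iterable, List, NamedTuple


class Match(NamedTuple):
    """Hypothetical stand-in for a Stage 2 feature-rule match."""
    hierarchy: int   # assumed: lower value = higher priority
    feature_id: str


def stage3_select(matches: Iterable[Match],
                  passes_stage3: Callable[[Match], bool]) -> List[Match]:
    """Evaluate matches in ascending hierarchy order and stop at the first
    match whose hierarchy value differs from that of the first passing match."""
    selected: List[Match] = []
    min_rank = None  # set by the first feature-rule pair that passes Stage 3

    for match in sorted(matches, key=lambda m: m.hierarchy):
        if min_rank is not None and match.hierarchy != min_rank:
            break  # remaining matches are lower priority; exit early
        if passes_stage3(match):
            min_rank = match.hierarchy
            selected.append(match)

    return selected


candidates = [Match(2, "a"), Match(1, "b"), Match(1, "c"), Match(3, "d")]

# Ties at the winning hierarchy value are all kept if they pass.
tied = stage3_select(candidates, lambda m: True)

# If every hierarchy-1 match fails Stage 3, a hierarchy-2 match can still win.
fallback = stage3_select(candidates, lambda m: m.hierarchy >= 2)
```

In the second call, the hierarchy-1 candidates fail and the hierarchy-2 candidate is selected. If candidates had instead been eliminated by hierarchy value before Stage 3, that read would presumably have gone unassigned, which is one plausible reading of the decrease in unassigned reads reported above.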
This PR also contains a new approach for the Strand selector. Strand relationships are now evaluated as chained XOR operations rather than the prior approach of combined string and boolean equality checks.
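A rough sketch of the idea, under stated assumptions: strands are booleans (True for '+', False for '-') as described in the commit message, and the selector is also reduced to a boolean. The function name `strand_matches` and the `select_sense` parameter are hypothetical and are not taken from the tiny-count source.

```python
def strand_matches(read_strand: bool, feature_strand: bool, select_sense: bool) -> bool:
    """Return True if the read satisfies the strand selector.

    read_strand / feature_strand: True for '+', False for '-'.
    select_sense: True to require the same strand (sense),
                  False to require opposite strands (antisense).
    """
    # read_strand ^ feature_strand is True when the strands differ (antisense).
    # XOR with select_sense flips that result when sense is requested.
    return read_strand ^ feature_strand ^ select_sense


# Same strand, sense requested: match.
sense_hit = strand_matches(True, True, select_sense=True)

# Opposite strands, sense requested: no match.
sense_miss = strand_matches(True, False, select_sense=True)

# Opposite strands, antisense requested: match.
antisense_hit = strand_matches(False, True, select_sense=False)
```

Because every operand is already a boolean, the check is a single expression with no string comparisons. A selector that accepts both strands would need separate handling (for example, skipping the check entirely), which this sketch leaves out.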
Unit tests and documentation have been updated.
Closes #223