tiny-count: anchored selector changes, user experience improvements for overlap selectors #282

Merged
taimontgomery merged 17 commits into master from
issue-281
Feb 24, 2023

@AlexTate
Member

@AlexTate AlexTate commented Feb 19, 2023

Anchored overlap selectors

The semantics of the 5' anchored, 3' anchored, and anchored overlap selectors have changed: the non-anchored end of the alignment must now be nested within the feature's interval. Together with the recently introduced overlap shift parameters, this furthers our goal of supporting the quantification of isomiRs.
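As a rough illustration of the new semantics, here is a minimal sketch. It assumes half-open `(start, end)` intervals and that the strand is supplied at match time, with the 5' end at `start` on `+` and at `end` on `-`. The function names are hypothetical and aren't taken from the tiny-count code.

```python
def is_5p_anchored(feat, aln, strand):
    """feat, aln: (start, end) tuples; strand: '+' or '-' (assumed convention)."""
    f_start, f_end = feat
    a_start, a_end = aln
    if strand == '+':
        # 5' end is the start coordinate; the 3' end must stay inside the feature
        return a_start == f_start and a_end <= f_end
    # On '-', the 5' end is the end coordinate; the 3' end (start) must stay inside
    return a_end == f_end and a_start >= f_start


def is_3p_anchored(feat, aln, strand):
    """Mirror image of the 5' check: the 3' end is anchored, the 5' end is nested."""
    f_start, f_end = feat
    a_start, a_end = aln
    if strand == '+':
        return a_end == f_end and a_start >= f_start
    return a_start == f_start and a_end <= f_end


def is_anchored(feat, aln):
    """Either terminus is anchored and the other end is nested; strand plays no role."""
    f_start, f_end = feat
    a_start, a_end = aln
    return (a_start == f_start and a_end <= f_end) or \
           (a_end == f_end and a_start >= f_start)
```

Under these assumptions, an alignment that shares the feature's 5' end but extends past its 3' end no longer matches a 5' anchored selector.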

Invalid shifted interval warnings

The user is now notified when an overlap shift parameter produces an invalid feature interval. These feature-rule pairs are omitted from selection, so it's important for the user to be aware of them. The notice lists the specific feature IDs along with their matched rule and selector definition, and these matches are organized into descriptive sections by violation type (null, inverted, or negative start).
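The three violation types can be sketched as below. This assumes that a shift is simply added to the start and end coordinates, that "null" means a zero-length interval, and that "inverted" means start > end. The function names and the tuple layout are hypothetical, and the order in which the checks are applied is a choice made for this example only.

```python
from collections import defaultdict


def classify_shift(start, end, shift_start=0, shift_end=0):
    """Apply the shifts and report which violation (if any) results."""
    new_start, new_end = start + shift_start, end + shift_end
    if new_start < 0:
        violation = "negative start"
    elif new_start == new_end:
        violation = "null"
    elif new_start > new_end:
        violation = "inverted"
    else:
        violation = None
    return new_start, new_end, violation


def group_violations(matches):
    """matches: iterable of (feature_id, rule, start, end, shift_start, shift_end).

    Returns {violation type: [(feature_id, rule), ...]} for invalid pairs only.
    """
    report = defaultdict(list)
    for feature_id, rule, start, end, shift_start, shift_end in matches:
        _, _, violation = classify_shift(start, end, shift_start, shift_end)
        if violation is not None:
            report[violation].append((feature_id, rule))
    return dict(report)
```

Grouping by violation type first keeps the notice readable when many features fall afoul of the same rule.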

User experience improvements

  • The "full" overlap selector has been renamed to "nested" to be more descriptive
  • The Overlap column of the Features Sheet now supports wildcard keywords (any, all, *, and an empty cell) to make it more consistent with the other columns. A wildcard is functionally equivalent to specifying "partial"
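A minimal sketch of how the two changes above could be normalized when reading the Overlap column. The function name is hypothetical, case-insensitive matching is an assumption, and shift parameters are not parsed here.

```python
WILDCARDS = {"any", "all", "*", ""}
LEGACY_NAMES = {"full": "nested"}  # old selector name -> new selector name


def normalize_overlap(cell):
    """Map a raw Overlap cell to a selector name (hypothetical helper)."""
    key = (cell or "").strip().lower()
    if key in WILDCARDS:
        # Wildcards behave the same as specifying "partial"
        return "partial"
    return LEGACY_NAMES.get(key, key)
```

For example, `normalize_overlap("Full")` gives `"nested"`, and both `normalize_overlap("*")` and `normalize_overlap(None)` give `"partial"`.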

Documentation diagram changes

  • Overlap selector descriptions and figures have been updated and improved. Note that documentation updates for shift parameters and counting by sequence will be added in a follow-up PR once these features are finalized
  • An additional rule has been added to the selection diagram to hint at support for isomiR quantification
  • Text and line elements that were red have been changed to magenta as an initial step in supporting colorblind users. Text and line elements that were previously magenta are now light blue.

Closes #281

…anchored end of the alignment to be nested within the feature's interval
…d feature intervals at construction time. It isn't necessary to check for this.
…ng requirement for *anchored overlap. This commit also freshens overlap diagrams and their presentation to make them more digestible. Also changing red line elements to magenta as an initial step in supporting colorblind users
…the same light blue that is used in Stage 1 and stage panel borders. This is an initial move to support colorblind users.

An additional rule has also been added to the selection table to indicate support for isomiR quantification, which is a major use case for the interval shift parameters that were recently added to overlap selectors
… this one earlier)

Also adding a (very basic) backward compatibility check for the "full" overlap selector. This should be temporary. A backward compatibility class should be added for the Features Sheet, but in order to do it right, I'd also need to write a Features Sheet class where validation takes place. Need to maintain a consistent design pattern. I need to make a clear delineation between validation and backward compatibility, and that needs to be a separate GH issue.
…any', 'all', '*', empty cell) can now be used in the overlap column of the Features Sheet
…equire diagnostics-related arguments. We used to collect selection diags and store them in the LibStats object, but this was removed for performance reasons and because it had little relevance outside of the original scenario that necessitated it.
…inition when a shift parameter results in an IllegalShiftError
…ift parameters producing IllegalShiftErrors.

Also, Reference* classes will fall back to the HTSeq StepVector for any error encountered while attempting to import and patch the Cython StepVector. Previously this was only done for ModuleNotFoundErrors, which is too conservative
…on diagram to make it easier on the eyes with GitHub's dark theme. With white background these colors are slightly less washed out than they were before
…hanges. This is unrelated to the current issue but doesn't warrant an issue of its own
…ntering the images in the table. Also adding a note about why we chose these colors.
…ctor instance. The cache now reaches not only across features, but across GFF files too. Previously the cache was only relevant per-call to build_interval_selectors(), which was perfectly fine then, but no longer ideal with shift parameters in the mix.

Cache key has been corrected so that overlap selectors aren't cached with strand considered, which would have been inconsistent with the current design (strand is irrelevant to an overlap selector, so unnecessary duplicate instances would have been created).

An improved approach for wildcard overlap selectors has been added that's more compatible with the cache. An added benefit is the narrower try/except.
…nd to great benefit in memory usage... rejoice!
@AlexTate
Member Author

Changes to overlap selector caching

Expanded cache scope. Previously the cache was only used per subinterval per feature. Now it reaches across features and GFF files too. The cache is used to reduce memory footprint by preventing duplicate instances from being created for the same interval and overlap selector type.
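A minimal sketch of the idea, assuming the cache is keyed on selector type plus interval coordinates and deliberately leaves strand out of the key. The class names here are hypothetical stand-ins, not the actual tiny-count classes.

```python
class OverlapSelector:
    """Hypothetical stand-in for an overlap selector instance."""

    def __init__(self, kind, start, end):
        self.kind, self.start, self.end = kind, start, end


class SelectorCache:
    """One cache object shared across features and GFF files."""

    def __init__(self):
        self._instances = {}

    def get(self, kind, start, end):
        key = (kind, start, end)  # strand is intentionally not part of the key
        try:
            return self._instances[key]
        except KeyError:
            # Only construct a new instance on a cache miss
            selector = self._instances[key] = OverlapSelector(kind, start, end)
            return selector

    def __len__(self):
        return len(self._instances)
```

Because the cache object outlives any single feature or file, two features on opposite strands that share an interval and selector type end up sharing one instance.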

More memory efficient overlap selector instances

The Interval*Match classes only ever need to hold start and end coordinates. They don't need to support dynamic attribute creation, which Python classes support but at a cost. After removing this feature the memory footprint of overlap selectors was reduced by ~75% with my local test config. In worst case scenarios instances of these classes can be very, very numerous.
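One standard way to turn off dynamic attribute creation in Python is `__slots__`. The sketch below assumes that mechanism and uses hypothetical class names. The ~75% figure above comes from my local test config and is not reproduced by this example.

```python
class IntervalWithDict:
    """Ordinary class: every instance carries a per-instance __dict__."""

    def __init__(self, start, end):
        self.start, self.end = start, end


class IntervalWithSlots:
    """Fixed attribute layout: no per-instance __dict__ is allocated."""

    __slots__ = ("start", "end")

    def __init__(self, start, end):
        self.start, self.end = start, end


def accepts_new_attribute(obj):
    """Return True if an arbitrary new attribute can be set on obj."""
    try:
        obj.extra = 1
    except AttributeError:
        return False
    return True
```

The trade-off is that slotted instances reject any attribute not listed in `__slots__`, which is acceptable when start and end are all that's ever stored.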

Aesthetic documentation changes

The color and transparency of larger shapes in the selection diagram have been slightly modified to make it easier on the eyes with GitHub's dark theme.

@taimontgomery
Collaborator

images/tiny-count_selection.png

  1. Change Full to Nested
  2. Change miRNA rule to Overlap=5'Anchored, 0, 5
  3. Change isomiR rule to Hierarchy=3

…updating the Features Sheet table with rule changes per Tai
@AlexTate
Member Author

AlexTate commented Feb 23, 2023

  1. Change Full to Nested

Thanks for catching that, missed that one.

  2. Change miRNA rule to Overlap=5'Anchored, 0, 5
  3. Change isomiR rule to Hierarchy=3

Ok, I've added the new ruleset to the diagram. The animation uses a dark background; the actual diagram is transparent.

[Animated image: Selection_Diagram_1 9 1]

@taimontgomery
Collaborator

Tested rigorously with ram1 data. Minimal testing with zswim and Lib303 data. All tests passed.

@taimontgomery taimontgomery merged commit d6e72f8 into master Feb 24, 2023
Development

Successfully merging this pull request may close these issues.

tiny-count: changes to anchored selector semantics, and user experience improvements for overlap selectors

2 participants