tiny-count: anchored selector changes, user experience improvements for overlap selectors #282
taimontgomery merged 17 commits into master
…anchored end of the alignment to be nested within the feature's interval
…d feature intervals at construction time. It isn't necessary to check for this.
…ng requirement for *anchored overlap. This commit also freshens overlap diagrams and their presentation to make them more digestible. Also changing red line elements to magenta as an initial step in supporting colorblind users
…the same light blue that is used in Stage 1 and stage panel borders. This is an initial move to support colorblind users. An additional rule has also been added to the selection table to indicate support for isomiR quantification, which is a major use case for the interval shift parameters that were recently added to overlap selectors
… this one earlier) Also adding a (very basic) backward compatibility check for the "full" overlap selector. This should be temporary. A backward compatibility class should be added for the Features Sheet, but in order to do it right, I'd also need to write a Features Sheet class where validation takes place. Need to maintain a consistent design pattern. I need to make a clear delineation between validation and backward compatibility, and that needs to be a separate GH issue.
…any', 'all', '*', empty cell) can now be used in the overlap column of the Features Sheet
…equire diagnostics-related arguments. We used to collect selection diags and store them in the LibStats object, but this was removed for performance reasons and because it had little relevance outside of the original scenario that necessitated it.
…y so that other classes can use it
…inition when a shift parameter results in an IllegalShiftError
…ift parameters producing IllegalShiftErrors. Also, Reference* classes will fall back to the HTSeq StepVector for any error encountered while attempting to import and patch the Cython StepVector. Previously this was only done for ModuleNotFoundErrors, which is too conservative
…on diagram to make it easier on the eyes with GitHub's dark theme. With white background these colors are slightly less washed out than they were before
…hanges. This is unrelated to the current issue but doesn't warrant an issue of its own
…ntering the images in the table. Also adding a note about why we chose these colors.
…ctor instance. The cache now reaches not only across features, but across GFF files too. Previously the cache was only relevant per-call to build_interval_selectors(), which was perfectly fine then, but no longer ideal with shift parameters in the mix. Cache key has been corrected so that overlap selectors aren't cached with strand considered, which would have been inconsistent with the current design (strand is irrelevant to an overlap selector, so unnecessary duplicate instances would have been created). An improved approach for wildcard overlap selectors has been added that's more compatible with the cache. An added benefit is the narrower try/except.
…nd to great benefit in memory usage... rejoice!
Changes to overlap selector caching
Expanded cache scope. Previously the cache was only used per subinterval per feature. Now it reaches across features and GFF files too. The cache is used to reduce memory footprint by preventing duplicate instances from being created for the same interval and overlap selector type.
More memory efficient overlap selector instances
The Interval*Match classes only ever need to hold start and end coordinates. They don't need to support dynamic attribute creation, which Python classes support but at a cost. After removing this feature the memory footprint of overlap selectors was reduced by ~75% with my local test config. In worst case scenarios instances of these classes can be very, very numerous.
Aesthetic documentation changes
The color and transparency of larger shapes in the selection diagram have been slightly modified to make it easier on the eyes with GitHub's dark theme.
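For readers unfamiliar with the two memory techniques mentioned in this comment (a shared instance cache, and removing dynamic attribute creation), here is a minimal sketch. It assumes the memory change was done with `__slots__`, which is the usual way to disable dynamic attributes in Python. The class names, the `get_selector` helper, and the cache key layout are hypothetical stand-ins and are not taken from tiny-count's code.

```python
class PlainIntervalMatch:
    """Ordinary class: every instance carries its own __dict__."""

    def __init__(self, start, end):
        self.start = start
        self.end = end


class SlottedIntervalMatch:
    """Hypothetical selector that only ever holds start and end coordinates."""

    # Declaring __slots__ removes the per-instance __dict__,
    # so attributes cannot be added dynamically.
    __slots__ = ("start", "end")

    def __init__(self, start, end):
        self.start = start
        self.end = end


# Hypothetical cache shared across features and GFF files.
# Strand is deliberately left out of the key because it is
# irrelevant to an overlap selector.
_selector_cache = {}


def get_selector(kind, start, end):
    """Return a shared instance for this selector type and interval."""
    key = (kind, start, end)
    selector = _selector_cache.get(key)
    if selector is None:
        # A real implementation would pick the class from `kind`;
        # one class is enough to illustrate the caching behavior.
        selector = _selector_cache[key] = SlottedIntervalMatch(start, end)
    return selector


a = get_selector("partial", 100, 122)
b = get_selector("partial", 100, 122)  # same interval seen again elsewhere
c = get_selector("partial", 100, 125)

print(a is b)                  # True: the cached instance is reused
print(a is c)                  # False: different interval
print(hasattr(a, "__dict__"))  # False: no per-instance dict
```

The actual savings depend on the interpreter and the workload, so the ~75% figure quoted above should be read as specific to the author's test configuration.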
images/tiny-count_selection.png
…updating the Features Sheet table with rule changes per Tai
Tested rigorously with ram1 data. Minimal testing with zswim and Lib303 data. All tests passed.

Anchored overlap selectors
The semantics of 5' anchored, 3' anchored, and anchored overlap selectors have been changed to require nesting of the non-anchored end of the alignment within the feature's interval. This, along with the recently introduced overlap shift parameters, furthers our goal of adding support for the quantification of isomiRs.
Invalid shifted interval warnings
The user is now notified if an invalid feature interval is produced as a result of an overlap shift parameter. These feature-rule pairs are omitted from selection so it's important for the user to be aware. The notice includes the specific feature IDs and their matched rule + selector definition, and these matches are organized into descriptive sections by violation type (null, inverted, or negative start).
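The two behaviors described in this and the previous section can be sketched roughly as follows. This is an illustration under stated assumptions, not tiny-count's implementation: intervals are taken to be 0-based and half-open `(start, end)` tuples, shifts are applied to plus-strand features only, a "null" interval is taken to mean zero length, and all function names, feature IDs, and shift values are hypothetical.

```python
from collections import defaultdict


def shift_interval(start, end, shift_5p, shift_3p):
    """Shift a plus-strand, half-open [start, end) interval (assumed convention)."""
    return start + shift_5p, end + shift_3p


def classify_shifted(start, end):
    """Return None for a valid interval, otherwise the violation type."""
    if start < 0:
        return "negative start"
    if start == end:
        return "null"
    if start > end:
        return "inverted"
    return None


def five_prime_anchored(feat, aln, strand="+"):
    """5' ends must coincide and the 3' (non-anchored) end must be nested."""
    f_start, f_end = feat
    a_start, a_end = aln
    if strand == "+":
        return a_start == f_start and a_end <= f_end
    # On the minus strand the 5' end is the interval's end coordinate.
    return a_end == f_end and a_start >= f_start


# Hypothetical features and shift parameters.
features = {
    "featA": (100, 122),
    "featB": (2, 24),
    "featC": (50, 55),
    "featD": (50, 53),
}
shift_5p, shift_3p = -5, -10

valid = {}
violations = defaultdict(list)  # violation type -> feature IDs
for feat_id, (start, end) in features.items():
    shifted = shift_interval(start, end, shift_5p, shift_3p)
    problem = classify_shifted(*shifted)
    if problem:
        # Omitted from selection, but recorded so the user can be told.
        violations[problem].append(feat_id)
    else:
        valid[feat_id] = shifted

for problem, feat_ids in violations.items():
    print(f"{problem}: {', '.join(feat_ids)}")

# Anchored matching against the shifted featA interval (95, 112).
print(five_prime_anchored(valid["featA"], (95, 110)))  # True: nested 3' end
print(five_prime_anchored(valid["featA"], (95, 115)))  # False: 3' end extends past feature
```

Grouping the omitted features by violation type mirrors the sectioned notice described above. The real notice also reports each feature's matched rule and selector definition, which this sketch leaves out.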
User experience improvements
Wildcard values (any, all, *, and an empty cell) can now be used in the overlap column of the Features Sheet to make it more consistent with the other columns. This is functionally equivalent to specifying "partial".
Documentation diagram changes
Closes #281