Use different penalty scores for two ends of the DNA sequence #33

@misik

Hi,

I'd like to request a feature, if it is feasible. I want to align a ~23 bp sequence to a genome and find all similar sequences up to a certain number of mismatches and gaps. I already use glsearch36 for this type of search and parse the -m BB output into rows like these:

CATTGCTAGACTTGACCCACNRG CATTGCTAGACTTGACCCACAGG ||||||||||||||||||||||| 0 0 0 0 0 0
CATTGCTAGACTTGACCCACNRG CATTGCCAGACTTGACCCACAA- ||||||.|||||||||||||||~ 1 1 0 1 0 1
CATTGCTAGACTTGACCCACNRG CATTGCCAGACTTGACCC-CGGG ||||||.|||||||||||~|||| 2 1 1 0 0 0
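
For illustration, rows in this layout can be tallied separately for the main part and the end of the query with a short script. This is a minimal sketch, not part of glsearch36: the function name `count_edits`, the 3-base tail length, and the reading of `.` as a mismatch and `~` as a gap in the third column are assumptions based on the sample rows above.

```python
def count_edits(row, tail_len=3):
    """Count mismatches and gaps in the head and tail of one parsed row.

    Assumes whitespace-separated fields (query, subject, match string, ...)
    and that '.' marks a mismatch and '~' marks a gap, as in the sample rows.
    tail_len is a hypothetical parameter: the number of trailing alignment
    columns treated as the special end region.
    """
    query, subject, match = row.split()[:3]
    split = len(match) - tail_len
    head, tail = match[:split], match[split:]
    return {
        "head_mm": head.count("."),
        "head_gap": head.count("~"),
        "tail_mm": tail.count("."),
        "tail_gap": tail.count("~"),
    }


row = ("CATTGCTAGACTTGACCCACNRG CATTGCCAGACTTGACCCACAA- "
       "||||||.|||||||||||||||~ 1 1 0 1 0 1")
print(count_edits(row))
# {'head_mm': 1, 'head_gap': 0, 'tail_mm': 0, 'tail_gap': 1}
```

The split is made on alignment columns, so a gap inside the query's end region would shift the boundary by one column; a stricter version would track query positions instead.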

I'd like an option to give "NRG", or another selected sequence at the 3' or 5' end of the aligned query, a different penalty for mismatches and gaps than the rest of the query. As a workaround, I currently modify the mismatch penalty matrix and use "NRK", assigning different penalties to R and K, but this may have side effects if K appears elsewhere in a query. Would it be possible to support two separate penalty matrices for the two parts of the sequence, or to implement this in some other way?
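
To make the request concrete, the sketch below re-scores an already aligned pair with one penalty pair for the main part of the query and another for its last few columns. It is only an illustration of the idea: it post-processes alignments that were already reported and does not change which alignments glsearch36 finds. The penalty values, the names `HEAD_PENALTY`, `TAIL_PENALTY`, and `rescore`, and the 3-base tail length are all hypothetical.

```python
# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Hypothetical (mismatch, gap) penalties for each region.
HEAD_PENALTY = (1, 2)
TAIL_PENALTY = (4, 8)


def rescore(query, subject, tail_len=3):
    """Sum region-specific penalties over an already aligned pair.

    query and subject are equal-length aligned strings with '-' for gaps.
    The last tail_len alignment columns use TAIL_PENALTY; all earlier
    columns use HEAD_PENALTY.
    """
    if len(query) != len(subject):
        raise ValueError("aligned strings must have equal length")
    split = len(query) - tail_len
    score = 0
    for i, (q, s) in enumerate(zip(query.upper(), subject.upper())):
        mismatch, gap = TAIL_PENALTY if i >= split else HEAD_PENALTY
        if q == "-" or s == "-":
            score += gap
        elif s not in IUPAC.get(q, q):
            # Subject base is not one of the bases the query code allows.
            score += mismatch
    return score


query = "CATTGCTAGACTTGACCCACNRG"
print(rescore(query, "CATTGCTAGACTTGACCCACAGG"))  # 0
print(rescore(query, "CATTGCCAGACTTGACCCACAA-"))  # 9 (head mismatch + tail gap)
print(rescore(query, "CATTGCCAGACTTGACCC-CGGG"))  # 3 (head mismatch + head gap)
```

Because the region is chosen by column position rather than by letter, K (or any other code) can appear anywhere in the query without picking up the end-region penalty.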

Thank you,
Meltem
