Filtering insertions in homopolymer regions #16

@landesfeind

Hi,

Using VarDictJava v1.3 to call variants from RNA-seq of tumor cells gives me an unreasonable number of called insertions (> 5500 across the whole genome). I had a closer look at some of the suspicious insertions, and they seem to be called at positions where a homopolymer of 6 or more identical nucleotides starts. At these positions, most reads align perfectly with the reference sequence. However, a few carry an additional nucleotide (e.g., AF=7%).

Currently, I am using quite straightforward filter settings (looking at DP, HICNT, etc.). Do you have any experience with filtering these variants to get high-quality predictions? Does VarDict somehow try to fix this during realignment, or does it report some value that allows filtering them (other than removing all insertions in homopolymer regions)?

Thanks in advance
Manuel
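
To make the pattern described above concrete, here is a minimal Python sketch of a post-hoc check that flags low-frequency, single-base-type insertions placed directly in front of a homopolymer run. It is not part of VarDict and does not use any VarDict API; the function names, the 0-based `anchor_pos` convention, and the `max_af=0.10` cutoff are assumptions made for illustration (only the run length of 6 and the example AF of 7% come from the description).

```python
def homopolymer_run_length(ref_seq: str, start: int) -> int:
    """Return the length of the run of identical bases beginning at 0-based index `start`."""
    if start < 0 or start >= len(ref_seq):
        return 0
    base = ref_seq[start].upper()
    length = 0
    for ch in ref_seq[start:]:
        if ch.upper() != base:
            break
        length += 1
    return length


def is_suspect_homopolymer_insertion(
    ref_seq: str,
    anchor_pos: int,      # hypothetical convention: 0-based index of the base just before the insertion
    inserted: str,        # inserted bases only, without the anchor base
    allele_freq: float,
    min_run: int = 6,     # run length taken from the issue description
    max_af: float = 0.10, # assumed cutoff, chosen only for this sketch
) -> bool:
    """Flag an insertion that extends the homopolymer starting right after `anchor_pos`."""
    # Only consider insertions made of a single repeated base.
    if not inserted or len(set(inserted.upper())) != 1:
        return False
    run_start = anchor_pos + 1
    if run_start >= len(ref_seq):
        return False
    # The inserted base must match the base of the run it is placed in front of.
    if ref_seq[run_start].upper() != inserted[0].upper():
        return False
    run = homopolymer_run_length(ref_seq, run_start)
    return run >= min_run and allele_freq <= max_af


# Example: an extra "A" at 7% allele frequency in front of a run of seven A's.
reference = "GCT" + "A" * 7 + "GT"
print(is_suspect_homopolymer_insertion(reference, 2, "A", 0.07))
```

A check like this only marks candidates for review; whether such calls should be dropped or merely down-weighted depends on the data and is left open here.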
