
cdot gene biotype is a list #15

@davmlaw


Problem:

The cdot JSON used to store gene biotype as a string, but it is now a list.

In create_biotype_regions_array, the biotype is determined by this helper:

    def get_biotype(gene):
        if gene.biotype in other_biotypes:
            return "other"
        elif gene.biotype == "misc_RNA":
            if gene.name and gene.name.startswith("RNY"):
                return "yRNA"
        return gene.biotype

This now returns a list, which eventually causes a crash at:

        length_counters[length][read_region] += 1

TypeError: unhashable type: 'list'
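A minimal reproduction of the crash, as a sketch. It assumes `length_counters` behaves like a dict of `Counter` objects keyed by read length; the real structure in the project may differ, and the values here are made up for illustration.

```python
from collections import Counter, defaultdict

# Assumed shape: read length -> Counter of region/biotype names.
length_counters = defaultdict(Counter)

# What get_biotype() now hands back when cdot stores biotype as a list.
read_region = ["misc_RNA"]

try:
    # A list cannot be hashed, so it cannot be used as a Counter key.
    length_counters[22][read_region] += 1
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

Any fix therefore has to reduce the list to a single hashable value (for example a string) before it is used as a key.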

Fix:

The difficulty here is that we are assuming a single biotype, while we may now have to pick one from a list.

The reason we are passing gene into get_biotype() rather than just relying on biotype in the loop:

    for biotype in other_biotypes + interesting_biotypes:

is that we have special-case code for handling yRNA.

We can possibly break this out so that it only handles the biotype.
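One possible shape for this, as a sketch only. The helper `pick_biotype`, the priority order, and the example contents of `OTHER_BIOTYPES` and `INTERESTING_BIOTYPES` are all hypothetical; the real lists and the rule for choosing among several biotypes would need to be decided in the project.

```python
from types import SimpleNamespace

# Hypothetical values for illustration; the real lists live in the project.
OTHER_BIOTYPES = ["pseudogene", "antisense"]
INTERESTING_BIOTYPES = ["miRNA", "misc_RNA", "protein_coding"]


def pick_biotype(biotype, priority):
    """Collapse a string or a list of biotypes into one hashable string."""
    if biotype is None:
        return None
    if isinstance(biotype, str):
        # Old cdot format: already a single string.
        return biotype
    candidates = list(biotype)
    if not candidates:
        return None
    # Assumed rule: take the first entry of the priority order that is present.
    for preferred in priority:
        if preferred in candidates:
            return preferred
    # Nothing in the priority order matched; fall back to the first entry.
    return candidates[0]


def get_biotype(gene):
    biotype = pick_biotype(gene.biotype, INTERESTING_BIOTYPES + OTHER_BIOTYPES)
    if biotype in OTHER_BIOTYPES:
        return "other"
    # The yRNA special case still needs the gene name, not just the biotype.
    if biotype == "misc_RNA" and gene.name and gene.name.startswith("RNY"):
        return "yRNA"
    return biotype


# Usage with stand-in gene objects:
gene = SimpleNamespace(name="RNY1", biotype=["misc_RNA"])
print(get_biotype(gene))
```

Keeping the list-to-string choice in its own function means the yRNA check stays the only part that needs the gene object, and the selection rule can be changed without touching the counting code.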
