Assigned germline V gene not necessarily used as root in 3.2-run_IgPhyML.py #12

@ressy

The docstring for 3.2-run_IgPhyML.py says that -v is the "assigned germline V gene of known antibodes, for use in rooting the trees," but I'm running into some instances where it doesn't use this sequence ID for the root. I think this is because the script determines the root's sequence ID with a regular expression, which can inadvertently pick up a different sequence depending on the full set of sequence IDs. The steps I see in 3.2-run_IgPhyML.py are:

  1. germ_seq is defined via the -v argument
  2. germ_seq is written into the to-align file, along with the collected and native sequences
  3. germ_id is defined by regex-matching each sequence ID from the aligned file
  4. germ_id is passed to igphyml as --root

In my case I have a "_LightSeq" suffix on each sequence ID in my natives.fa, and re.search("(IG|VH|VK|VL|HV|KV|LV)", seq.id, re.I) matches the "ig" in "Light" for each of those (the search is case-insensitive), overwriting the correct "IGLV..." sequence ID matched earlier in the file.
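To make the failure mode concrete, here is a minimal sketch of the matching behavior. Only the regular expression is taken from the script. The `pick_germ_id` helper and the example IDs are hypothetical, and the loop is a simplified stand-in for what the script does (last match wins), not a copy of it.

```python
import re

# Pattern quoted from 3.2-run_IgPhyML.py; everything else here is illustrative.
PATTERN = "(IG|VH|VK|VL|HV|KV|LV)"


def pick_germ_id(seq_ids):
    """Return the last ID matching PATTERN (assumed behavior: later matches overwrite)."""
    germ_id = None
    for seq_id in seq_ids:
        if re.search(PATTERN, seq_id, re.I):
            germ_id = seq_id  # any later match replaces the earlier one
    return germ_id


# Hypothetical IDs: the germline first, then a read, then a native with the suffix.
ids = ["IGLV3-21*01", "read_0001", "mAb1_LightSeq"]

# With re.I, the "ig" inside "Light" satisfies the "IG" alternative.
print(re.search(PATTERN, "mAb1_LightSeq", re.I).group(0))
print(pick_germ_id(ids))
```

Under these assumptions the native sequence ID is selected as the root instead of the germline, because it appears later in the file and also matches.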

I can't override this by adding --root, either, since it's mutually exclusive with -v. Would there be any downside to automatically setting arguments['--root'] = arguments['-v'] whenever arguments['-v'] is not None? That value would then be used as germ_id and passed to igphyml as the correct root.
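A rough sketch of what I have in mind, assuming `arguments` is a docopt-style dict and that the value given to -v is exactly the sequence ID that ends up in the alignment. The `resolve_root` helper is hypothetical and only illustrates the order of precedence; the regex fallback mirrors the current behavior as I understand it.

```python
import re

# Pattern quoted from 3.2-run_IgPhyML.py; the surrounding logic is an assumption.
PATTERN = "(IG|VH|VK|VL|HV|KV|LV)"


def resolve_root(arguments, seq_ids):
    """Prefer an explicit root (from -v or --root) over the regex guess."""
    # Proposed change: treat -v as the root ID when it is given.
    if arguments.get("-v") is not None:
        arguments["--root"] = arguments["-v"]

    if arguments.get("--root") is not None:
        return arguments["--root"]

    # Fallback: guess the root from the sequence IDs (last match wins).
    germ_id = None
    for seq_id in seq_ids:
        if re.search(PATTERN, seq_id, re.I):
            germ_id = seq_id
    return germ_id


# Hypothetical IDs and arguments for illustration.
ids = ["IGLV3-21*01", "read_0001", "mAb1_LightSeq"]
print(resolve_root({"-v": "IGLV3-21*01", "--root": None}, ids))
```

With -v set, the regex is never consulted, so the "_LightSeq" IDs can no longer displace the germline.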
