Description
The docstring for 3.2-run_IgPhyML.py says that -v is the "assigned germline V gene of known antibodes, for use in rooting the trees," but I'm running into some instances where it doesn't use this sequence ID for the root. I think this is because the script determines the sequence ID for the root of the tree with a regular expression, which can inadvertently pick up a different sequence depending on the full set of sequence IDs. The steps I see in 3.2-run_IgPhyML.py are:
- germ_seq is defined via -v argument
- germ_seq is written into the to-align file, along with the collected and native sequences
- germ_id is defined by regex-matching each sequence ID from the aligned file
- germ_id is passed to igphyml as --root
In my case, I have a "_LightSeq" suffix on each sequence ID in my natives.fa, and re.search("(IG|VH|VK|VL|HV|KV|LV)", seq.id, re.I) matches the "ig" in "Light" in each of those, overwriting the correct "IGLV..." sequence ID matched earlier in the file.
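A minimal sketch that reproduces the over-matching outside the script. The sequence IDs here are made up for illustration, and the loop is my assumption about how germ_id gets assigned (last match wins), not a copy of the code in 3.2-run_IgPhyML.py:

```python
import re

# Same pattern and flags as the re.search call quoted above.
GERMLINE_PATTERN = re.compile("(IG|VH|VK|VL|HV|KV|LV)", re.I)

# Hypothetical IDs: the germline first, then two natives with the suffix.
seq_ids = ["IGLV3-21*01", "mAb1_LightSeq", "mAb2_LightSeq"]

germ_id = None
for seq_id in seq_ids:
    match = GERMLINE_PATTERN.search(seq_id)
    if match:
        # Assumed behavior: every later match overwrites the earlier one.
        germ_id = seq_id
        print(f"{seq_id!r} matched on {match.group(0)!r}")

print("germ_id:", germ_id)
```

With re.I, the "ig" in "LightSeq" satisfies the "IG" alternative, so germ_id ends up as the last native sequence rather than the germline.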
I can't override this by adding --root, either, since it's mutually exclusive with -v. Would there be any downside to automatically setting arguments['--root'] = arguments['-v'] when arguments['-v'] is not None? That value would then be used as germ_id and passed to igphyml as the correct root.
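Roughly what I have in mind, as a sketch only. The helper name resolve_root is hypothetical, and I'm assuming arguments is a plain dict keyed by option name, with None for options that weren't given:

```python
def resolve_root(arguments):
    """Return a copy of the parsed arguments with --root defaulted to -v.

    Hypothetical helper: assumes `arguments` maps option names to values
    and that an option left off the command line is stored as None.
    """
    resolved = dict(arguments)
    if resolved.get("-v") is not None:
        # Reuse the germline ID given with -v as the explicit tree root,
        # so the regex lookup is no longer needed in this case.
        resolved["--root"] = resolved["-v"]
    return resolved


# Example with a made-up germline ID.
args = resolve_root({"-v": "IGLV3-21*01", "--root": None})
print(args["--root"])
```

When -v is not given, --root is left untouched, so the existing behavior for an explicit --root would not change.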