-
Notifications
You must be signed in to change notification settings - Fork 16
Description
I was playing a bit with Lomap and noticed a strange behaviour when MCS is computed with element_change=False. When called with c1ccccc1 and c1ccncc1 (So one carbon in a ring changed to a nitrogen), Lomap can't find a mapping (see the code below for reproduction).
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolAlign, rdFMCS
m1 = Chem.MolFromSmiles("c1ccccc1")
m2 = Chem.MolFromSmiles("c1ccncc1")
m1 = Chem.AddHs(m1)
m2 = Chem.AddHs(m2)
AllChem.EmbedMolecule(m1)
AllChem.EmbedMolecule(m2)
mcs = rdFMCS.FindMCS(
[Chem.RemoveHs(m1), Chem.RemoveHs(m2)],
).smartsString
substr = Chem.MolFromSmarts(mcs)
m1_idx_list = Chem.RemoveHs(m1).GetSubstructMatch(substr)
m2_idx_list = Chem.RemoveHs(m2).GetSubstructMatch(substr)
heavy_atom_mapping = [(i, j) for i, j in zip(m1_idx_list, m2_idx_list)]
rdMolAlign.AlignMol(m1, m2, atomMap=heavy_atom_mapping)
print(rdMolAlign.CalcRMS(m1, m2, map=[heavy_atom_mapping]))
from lomap import MCS
mcs = MCS(m1, m2, threed=True, element_change=False)
print(list(mcs))As far as I can tell, this is due to the logics in these lines. So, when element_change is expexted, the RdKit MCS search is called with completeRingsOnly=True and atomCompare=rdFMCS.AtomCompare.CompareElements flags. Since N != C, there are no rings, and no atoms matched. I am not sure that this is the expected behaviour. Maybe, a slightly different logics is worth considering:
If element_change == False, we run a standard MCS, but then we remove atoms with different mass from the mapping. Also some hydrogen mapping should be removed. If you think this is a potential way to go, I can work on the implementation.