Skip to content

Mapping with no element change fails #61

@pbuslaev

Description

@pbuslaev

I was playing a bit with Lomap and noticed a strange behaviour when MCS is computed with element_change=False. When called with c1ccccc1 and c1ccncc1 (So one carbon in a ring changed to a nitrogen), Lomap can't find a mapping (see the code below for reproduction).

from rdkit import Chem
from rdkit.Chem import AllChem, rdMolAlign, rdFMCS

m1 = Chem.MolFromSmiles("c1ccccc1")
m2 = Chem.MolFromSmiles("c1ccncc1")
m1 = Chem.AddHs(m1)
m2 = Chem.AddHs(m2)
AllChem.EmbedMolecule(m1)
AllChem.EmbedMolecule(m2)

mcs = rdFMCS.FindMCS(
    [Chem.RemoveHs(m1), Chem.RemoveHs(m2)],
).smartsString
substr = Chem.MolFromSmarts(mcs)

m1_idx_list = Chem.RemoveHs(m1).GetSubstructMatch(substr)
m2_idx_list = Chem.RemoveHs(m2).GetSubstructMatch(substr)
heavy_atom_mapping = [(i, j) for i, j in zip(m1_idx_list, m2_idx_list)]
rdMolAlign.AlignMol(m1, m2, atomMap=heavy_atom_mapping)

print(rdMolAlign.CalcRMS(m1, m2, map=[heavy_atom_mapping]))

from lomap import MCS

mcs = MCS(m1, m2, threed=True, element_change=False)
print(list(mcs))

As far as I can tell, this is due to the logics in these lines. So, when element_change is expexted, the RdKit MCS search is called with completeRingsOnly=True and atomCompare=rdFMCS.AtomCompare.CompareElements flags. Since N != C, there are no rings, and no atoms matched. I am not sure that this is the expected behaviour. Maybe, a slightly different logics is worth considering:

If element_change == False, we run a standard MCS, but then we remove atoms with different mass from the mapping. Also some hydrogen mapping should be removed. If you think this is a potential way to go, I can work on the implementation.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions