﻿id	summary	reporter	owner	description	type	status	priority	milestone	component	version	resolution	keywords	cc	blockedby	blocking	notify_on_close	platform	project
3407	Maximum common substructure	Tristan Croll	Tristan Croll	"{{{
The following bug report has been submitted:
Platform:        Linux-3.10.0-1062.9.1.el7.x86_64-x86_64-with-centos-7.7.1908-Core
ChimeraX Version: 1.0 (2020-06-04 23:15:07 UTC)
Description
There are a few places where it would be really nice to have a fast algorithm available to find the maximum common substructure between two residues:

- suggesting MD templates to match a given residue in a residue- and atom-name agnostic manner (and then adding missing atoms and removing superfluous ones)
- ""mutating"" one residue to another while maintaining the maximum possible number of existing atom positions (e.g. swapping lipid head-groups, or switching between related drug-like compounds)

The challenge is that this is a hard, NP-complete problem. The algorithm available in `networkx.isomorphism.ISMAGS.largest_common_subgraph()` seems fast when matching by element, as long as the second graph given to its `__init__` is already a subgraph of the first. As soon as the second graph has even a few superfluous atoms, though, it slows to a crawl, e.g. taking many minutes to match phosphatidylcholine to phosphatidylethanolamine (I don't know exactly how long: I gave up waiting after about 10 minutes).
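
To make the problem concrete, here is a minimal standard-library sketch of element-matched maximum common substructure search by exhaustive backtracking. It is not the ISMAGS algorithm and not RDKit's FMCS: it looks for the largest connected common induced subgraph, runs in exponential time, and is only usable on toy inputs. The function names and the two heavy-atom graphs (a serine-like and a choline-like head-group) are hypothetical and made up for illustration.

```python
from collections import defaultdict


def build_graph(elements, bonds):
    # elements: atom id -> element symbol; bonds: iterable of (atom, atom) pairs
    adjacency = defaultdict(set)
    for a, b in bonds:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return elements, {atom: adjacency[atom] for atom in elements}


def max_common_substructure(graph_a, graph_b):
    # Returns the largest mapping {atom in A: atom in B} found by brute force.
    elem_a, adj_a = graph_a
    elem_b, adj_b = graph_b
    best = {}

    def compatible(u, v, mapping):
        # Same element, and bonded to already-mapped atoms in the same pattern.
        if elem_a[u] != elem_b[v]:
            return False
        return all((a in adj_a[u]) == (b in adj_b[v]) for a, b in mapping.items())

    def extend(mapping):
        nonlocal best
        if len(mapping) > len(best):
            best = dict(mapping)
        used_b = set(mapping.values())
        # Only grow through unmapped neighbours, so the match stays connected.
        frontier = {n for a in mapping for n in adj_a[a]} - set(mapping)
        for u in frontier:
            for v in elem_b:
                if v not in used_b and compatible(u, v, mapping):
                    mapping[u] = v
                    extend(mapping)
                    del mapping[u]

    # Try every element-compatible pair of atoms as a starting point.
    for u in elem_a:
        for v in elem_b:
            if elem_a[u] == elem_b[v]:
                extend({u: v})
    return best


# Hypothetical heavy-atom graphs; neither is a subgraph of the other.
serine_like = build_graph(
    {'N': 'N', 'CA': 'C', 'CB': 'C', 'OG': 'O', 'P': 'P',
     'C': 'C', 'O': 'O', 'OXT': 'O'},
    [('N', 'CA'), ('CA', 'CB'), ('CB', 'OG'), ('OG', 'P'),
     ('CA', 'C'), ('C', 'O'), ('C', 'OXT')],
)
choline_like = build_graph(
    {'N': 'N', 'C1': 'C', 'C2': 'C', 'C3': 'C', 'C4': 'C',
     'C5': 'C', 'O': 'O', 'P': 'P'},
    [('N', 'C1'), ('N', 'C2'), ('N', 'C3'), ('N', 'C4'),
     ('C4', 'C5'), ('C5', 'O'), ('O', 'P')],
)

mapping = max_common_substructure(serine_like, choline_like)
print(sorted(mapping.items()))
```

For these two toy graphs the search pairs the five atoms of the shared N-C-C-O-P chain. The same mapping is rediscovered in many different orders, which is one reason this naive approach blows up on anything the size of a real lipid.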

RDKit has a fast algorithm implemented (https://www.rdkit.org/docs/GettingStartedInPython.html#maximum-common-substructure; https://github.com/rdkit/rdkit/tree/master/Code/GraphMol/FMCS) and is BSD-licensed, with releases for all platforms on Conda, although not yet for Python 3.8.
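
As a rough sketch of what calling the RDKit implementation might look like, the snippet below runs `rdFMCS.FindMCS` on two molecules, matching atoms by element with a timeout. The wrapper `rdkit_mcs`, the guarded import, and the two SMILES strings (phosphoethanolamine-like and phosphocholine-like head-groups) are assumptions for illustration only. Building RDKit molecules from ChimeraX residues is not shown.

```python
try:
    from rdkit import Chem
    from rdkit.Chem import rdFMCS
except ImportError:
    # Assumption: RDKit is treated as an optional dependency in this sketch.
    Chem = rdFMCS = None


def rdkit_mcs(smiles_a, smiles_b, timeout_seconds=10):
    # Returns (atom count, SMARTS, timed_out), or None if RDKit is not installed.
    if rdFMCS is None:
        return None
    mols = [Chem.MolFromSmiles(smiles_a), Chem.MolFromSmiles(smiles_b)]
    if any(mol is None for mol in mols):
        raise ValueError('could not parse one of the SMILES strings')
    result = rdFMCS.FindMCS(
        mols,
        atomCompare=rdFMCS.AtomCompare.CompareElements,  # match by element only
        bondCompare=rdFMCS.BondCompare.CompareOrder,     # bond orders must agree
        timeout=timeout_seconds,                         # stop searching after this long
    )
    return result.numAtoms, result.smartsString, result.canceled


# Hypothetical inputs: simplified head-group fragments, not full lipids.
ethanolamine_like = 'NCCOP(=O)(O)O'
choline_like_smiles = 'C[N+](C)(C)CCOP(=O)(O)O'

mcs = rdkit_mcs(ethanolamine_like, choline_like_smiles)
if mcs is None:
    print('RDKit is not available')
else:
    n_atoms, smarts, timed_out = mcs
    print(n_atoms, smarts, timed_out)
```

The timeout matters for the use cases above: if the search is cancelled, the result is still the best match found so far, with `canceled` set, so a caller can choose between accepting a partial answer and giving up.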

Any thoughts?

OpenGL version: 3.3.0 NVIDIA 440.33.01
OpenGL renderer: TITAN Xp/PCIe/SSE2
OpenGL vendor: NVIDIA Corporation
Manufacturer: Dell Inc.
Model: Precision T5600
OS: CentOS Linux 7 Core
Architecture: 64bit ELF
CPU: 32 Intel(R) Xeon(R) CPU E5-2687W 0 @ 3.10GHz
Cache Size: 20480 KB
Memory:
	              total        used        free      shared  buff/cache   available
	Mem:            62G        5.6G         43G        374M         13G         56G
	Swap:          4.9G          0B        4.9G

Graphics:
	03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP102 [TITAN Xp] [10de:1b02] (rev a1)	
	Subsystem: NVIDIA Corporation Device [10de:11df]	
	Kernel driver in use: nvidia
PyQt version: 5.12.3
Compiled Qt version: 5.12.4
Runtime Qt version: 5.12.8

}}}
"	enhancement	closed	normal		Structure Comparison		worksforme		Eric Pettersen Goddard				all	ChimeraX
