Maximum Common Substructure-based Data Fusion in Similarity Searching

Article English OPEN
Willett, P. (2015)
  • Publisher: American Chemical Society

Data fusion has been shown to work very well when applied to fingerprint-based similarity searching, yet little is known of its application to Maximum Common Substructure (MCS)-based similarity searching. This paper focuses on two similarity-search applications of the MCS. Typically, a similarity coefficient is computed from the number of bonds in the MCS and the numbers of bonds in the two molecules being compared. The power of this technique can be extended using data fusion, in which the MCS similarities of a set of reference molecules against a single database molecule are fused. This "group fusion" technique is the first application of the MCS considered in this work. The second is the chemical hyperstructure, an alternative form of data fusion: a hypothetical molecule constructed from the overlap of a set of existing molecules. The paper compares fingerprint group fusion (using extended-connectivity fingerprints), MCS similarity group fusion, and hyperstructure similarity searching, and describes their relative merits and complementarity in virtual screening. It is concluded that the hyperstructure approach, as implemented here, is less generally effective than conventional fingerprint approaches.
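The bond-count similarity and group fusion described above can be illustrated with a minimal sketch. The abstract does not state which similarity coefficient or fusion rule the paper uses, so the Tanimoto-style form and the max/mean rules below are assumptions, and the function names and bond counts are hypothetical. The MCS bond counts are taken as given; computing the MCS itself is outside this sketch.

```python
from typing import Sequence


def mcs_tanimoto(bonds_mcs: int, bonds_a: int, bonds_b: int) -> float:
    """Tanimoto-style coefficient from bond counts (assumed form, not
    necessarily the coefficient used in the paper)."""
    denominator = bonds_a + bonds_b - bonds_mcs
    if denominator <= 0:
        return 0.0
    return bonds_mcs / denominator


def group_fusion(similarities: Sequence[float], rule: str = "max") -> float:
    """Fuse the similarities of several reference molecules against one
    database molecule into a single score (assumed rules: max or mean)."""
    if not similarities:
        raise ValueError("at least one similarity is required")
    if rule == "max":
        return max(similarities)
    if rule == "mean":
        return sum(similarities) / len(similarities)
    raise ValueError(f"unknown fusion rule: {rule}")


# Hypothetical data: one database molecule with 20 bonds, compared with
# three reference molecules given as (bonds in MCS, bonds in reference).
database_bonds = 20
references = [(12, 24), (18, 22), (5, 30)]

scores = [mcs_tanimoto(mcs, ref, database_bonds) for mcs, ref in references]
fused_score = group_fusion(scores, rule="max")
print(scores, fused_score)
```

In a virtual screen, the fused score would be computed for every database molecule and the database ranked by it; with the max rule, a molecule ranks highly if it closely resembles any one of the reference molecules.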