publication . Article . Other literature type . 2018

Grasping frequent subgraph mining for bioinformatics applications

Aida Mrzic; Pieter Meysman; Wout Bittremieux; Pieter Moris; Boris Cule; Bart Goethals; Kris Laukens
Open Access
  • Published: 01 Sep 2018
  • Publisher: Springer Nature America, Inc
  • Country: Belgium
Abstract
Searching for interesting common subgraphs in graph data is a well-studied problem in data mining. Subgraph mining techniques focus on the discovery of patterns in graphs that exhibit a specific network structure that is deemed interesting within these data sets. The definition of which subgraphs are interesting and which are not is highly dependent on the application. These techniques have seen numerous applications and are able to tackle a range of biological research questions, spanning from the detection of common substructures in sets of biomolecular compounds, to the discovery of network motifs in large-scale molecular interaction networks. Thus far, infor...
Subjects
free text keywords: Review, Subgraph mining, Frequent subgraphs, Graph motifs, Biological networks, Pattern discovery, Pattern mining, Mathematics, Biology, Computer. Automation, Computational Theory and Mathematics, Genetics, Biochemistry, Molecular Biology, Computational Mathematics, Computer Science Applications, lcsh:Computer applications to medicine. Medical informatics, lcsh:R858-859.7, lcsh:Analysis, lcsh:QA299.6-433, Graph, Research questions, Computer science, Life Scientists, Bioinformatics, Biological network, Data set, Network structure
Communities
Science and Innovation Policy Studies
Download from
  • ZENODO (Article, 2018; provider: ZENODO)
  • BioData Mining (Article, 2018)
  • BioData Mining (Article, 2018; provider: Crossref)