publication . Article . Other literature type . 2013

A primer to frequent itemset mining for bioinformatics

Wim Vanden Berghe; Pieter Meysman; Kris Laukens; Stefan Naulaerts; Wout Bittremieux; Trung Nghia Vu; Bart Goethals
Open Access
  • Published: 26 Oct 2013
  • Journal: Briefings in Bioinformatics, volume 16, pages 216-231 (ISSN: 1467-5463, eISSN: 1477-4054)
  • Publisher: Oxford University Press (OUP)
  • Country: Belgium
Abstract
Over the past two decades, pattern mining techniques have become an integral part of many bioinformatics solutions. Frequent itemset mining is a popular group of pattern mining techniques designed to identify elements that frequently co-occur. An archetypical example is the identification of products that often end up together in the same shopping basket in supermarket transactions. A number of algorithms have been developed to address variations of this computationally non-trivial problem. Frequent itemset mining techniques are able to efficiently capture the characteristics of (complex) data and succinctly summarize it. Owing to these and other inter...
Subjects
ACM Computing Classification System: InformationSystems_DATABASEMANAGEMENT; ComputingMethodologies_PATTERNRECOGNITION
free text keywords: Molecular Biology, Information Systems, Papers, pattern mining, frequent item set, association rule, market basket analysis, Mathematics, Chemistry, Biology, Computer. Automation, Biclustering, Bioinformatics, Life Scientists, Association rule learning, Computer science, Data mining, Affinity analysis, Data science, Shopping basket, Biological data