publication . Other literature type . Article . Preprint . 2018

EBIC: an evolutionary-based parallel biclustering algorithm for pattern discovery.

Orzechowski, Patryk; Sipper, Moshe; Huang, Xiuzhen; Moore, Jason H.
Open Access English
  • Published: 22 May 2018
  • Publisher: Oxford University Press
Abstract
In this paper, a novel biclustering algorithm based on artificial intelligence (AI) is introduced. The method, called EBIC, aims to detect biologically meaningful, order-preserving patterns in complex data. The proposed algorithm is probably the first capable of discovering multiple complex patterns in real gene expression datasets with accuracy exceeding 50%. It is also one of the very few biclustering methods designed for parallel environments with multiple graphics processing units (GPUs). We demonstrate that EBIC outperforms state-of-the-art biclustering methods, in terms of recovery and relevance, on both synthetic and genetic datasets. EBIC also yields re...
Subjects
ACM Computing Classification System: ComputingMethodologies_PATTERNRECOGNITION
free text keywords: Original Papers, Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Information Retrieval, Quantitative Biology - Genomics, 68, 92, I.5.2, I.2.11, I.5.3, J.3, Statistics and Probability, Computational Theory and Mathematics, Biochemistry, Molecular Biology, Computational Mathematics, Computer Science Applications
Funded by
NIH| CORE--Endocrine and Reproduction Disruption
Project
  • Funder: National Institutes of Health (NIH)
  • Project Code: 5P30ES013508-04
  • Funding stream: NATIONAL INSTITUTE OF ENVIRONMENTAL HEALTH SCIENCES
NIH| Biomedical Computing and Informatics Strategies for Precision Medicine
Project
  • Funder: National Institutes of Health (NIH)
  • Project Code: 5R01LM012601-02
  • Funding stream: NATIONAL LIBRARY OF MEDICINE