publication . Article . 2019

Scalable biclustering - the future of big data exploration?

Patryk Orzechowski; Krzysztof Boryczko; Jason H Moore;
Open Access English
  • Published: 01 Jun 2019 Journal: GigaScience, volume 8, issue 7 (eISSN: 2047-217X)
  • Publisher: Oxford University Press
Abstract
Biclustering is a technique of discovering local similarities within data. For many years the complexity of the methods and parallelization issues limited its application to big data problems. With the development of novel scalable methods, biclustering has finally started to close this gap. In this paper we discuss the caveats of biclustering and present its current challenges and guidelines for practitioners. We also try to explain why biclustering may soon become one of the standards for big data analytics.
Subjects
free text keywords: Commentary, biclustering, co-clustering, data mining, big data, parallel algorithms, disease subtype identification, biomarker detection, gene-drug interaction, precision medicine
Funded by
NIH | Biomedical Computing and Informatics Strategies for Precision Medicine
Project
  • Funder: National Institutes of Health (NIH)
  • Project Code: 5R01LM012601-02
  • Funding stream: NATIONAL LIBRARY OF MEDICINE

