A Distributed Constrained Non-negative Matrix Factorization Algorithm for Time-Series Gene Expression Data

Publication: Article, 15 Aug 2018

Publisher: ACM

Journal: Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics

Authors: Matthew Dyer; Julian Dymacek

doi: 10.1145/3233547.3233579

Abstract

We present Parallel Pattern Discovery (PPD), a new distributed computing algorithm for constrained Non-negative Matrix Factorization (NMF). Our implementation allows a specific pattern to be constrained during optimization while minimizing reconstruction error. PPD operates within a distributed environment using a message passing interface. Distributing the algorithm improves scalability and allows operation in single- or multiple-system environments. The algorithm was tested on a set of time-series, dose-dependent mRNA gene expression data. PPD accurately identified patterns within the data and reconstructed the original matrices, and it achieved a smaller reconstruction error than standard NMF algorithms. Development focused on running PPD as part of a system that identifies significantly contributing genes: PPD is first run to find patterns in biological data, followed by Gene Set Enrichment (GSE), which takes the pattern data and relates it back to genetic pathways.
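
To make the idea of a constrained factorization concrete, the following is a minimal single-process sketch. It is not the PPD implementation, whose details the abstract does not give. It assumes the standard Lee-Seung multiplicative updates for the Frobenius-norm objective and models the constraint as one pattern row of H held fixed. The names `constrained_nmf`, `reconstruction_error`, and `fixed_pattern`, and the synthetic data, are hypothetical and chosen only for illustration.

```python
import numpy as np


def constrained_nmf(V, rank, fixed_pattern=None, n_iter=300, eps=1e-9, seed=0):
    """Approximate V by W @ H with non-negative W and H.

    Assumption for illustration: if fixed_pattern is given, row 0 of H is
    pinned to it and never updated, so only the remaining patterns adapt.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    W = rng.random((n_rows, rank)) + eps
    H = rng.random((rank, n_cols)) + eps
    if fixed_pattern is not None:
        H[0] = fixed_pattern

    for _ in range(n_iter):
        # Multiplicative update for the patterns (rows of H).
        H_new = H * (W.T @ V) / (W.T @ W @ H + eps)
        if fixed_pattern is not None:
            H_new[0] = fixed_pattern  # re-impose the constrained pattern
        H = H_new
        # Multiplicative update for the per-gene coefficients (W).
        W = W * (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def reconstruction_error(V, W, H):
    """Frobenius norm of the residual V - W @ H."""
    return np.linalg.norm(V - W @ H, "fro")


# Synthetic stand-in for an expression matrix: 40 genes x 12 time points.
rng = np.random.default_rng(1)
W_true = rng.random((40, 3))
H_true = rng.random((3, 12))
V = W_true @ H_true

pattern = H_true[0]  # the pattern we choose to constrain
W_short, H_short = constrained_nmf(V, rank=3, fixed_pattern=pattern, n_iter=1)
W, H = constrained_nmf(V, rank=3, fixed_pattern=pattern, n_iter=300)

err_short = reconstruction_error(V, W_short, H_short)
err_long = reconstruction_error(V, W, H)
print(f"error after 1 iteration: {err_short:.4f}")
print(f"error after 300 iterations: {err_long:.4f}")
```

In this sketch the rows of H play the role of patterns over time points and W holds each gene's contribution to them. A distributed variant would presumably partition this work across processes, but how PPD does so is not described here.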

Related Organizations

Longwood University
United States

Impact by BIP!

	selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
	popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
	influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
	impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.