Initial Cluster Analysis

Publication: Article, Other literature type
Date: 01 Feb 2018
Language: English
Publisher: SAGE Publications
Journal: Journal of Computational Biology, volume 25, pages 121-129 (eISSN: 1557-8666)

Authors: Stephen F. Altschul; Andrew F. Neuwald

doi: 10.1089/cmb.2017.0050

pmid: 28771374

pmc: PMC5806593

Abstract

We study a simple abstract problem motivated by a variety of applications in protein sequence analysis. Consider a string of 0s and 1s of length L, and containing D 1s. If we believe that some or all of the 1s may be clustered near the start of the sequence, which subset is the most significantly so clustered, and how significant is this clustering? We approach this question using the minimum description length principle and illustrate its application by analyzing residues that distinguish translational initiation and elongation factor guanosine triphosphatases (GTPases) from other P-loop GTPases. Within a structure of yeast elongation factor 1α, these residues form a significant cluster centered on a region implicated in guanine nucleotide exchange. Various biomedical questions may be cast as the abstract problem considered here.
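
To make the abstract problem concrete, the following is a minimal sketch of one way a minimum description length comparison could be set up for a binary string. It is not the authors' formulation, which this record does not give. The function name `best_initial_cluster`, the uniform cost charged for naming the cut point and the prefix count, and the restriction to a single prefix cut are all illustrative assumptions.

```python
from math import comb, log2


def best_initial_cluster(bits):
    """Return (cut, ones_in_prefix, bits_saved) for the best prefix split.

    Illustrative two-part MDL sketch (an assumption, not the paper's method):
    compare the cost of encoding the positions of all D ones among L sites
    with the cost of encoding a cut point k, the number d of ones before it,
    and then the ones within the prefix and the suffix separately.
    """
    L = len(bits)
    D = sum(bits)
    # Null model: choose which D of the L positions hold a 1.
    null_cost = log2(comb(L, D))
    best = (0, 0, 0.0)  # no cluster found, nothing saved
    d = 0
    for k in range(1, L):
        d += bits[k - 1]  # ones among the first k positions
        # Assumed model cost: k is one of L - 1 cuts, d is one of D + 1 counts.
        model_cost = log2(L - 1) + log2(D + 1)
        data_cost = log2(comb(k, d)) + log2(comb(L - k, D - d))
        saved = null_cost - (model_cost + data_cost)
        # Keep only cuts whose prefix is enriched in ones (d/k > D/L).
        if d * L > D * k and saved > best[2]:
            best = (k, d, saved)
    return best


if __name__ == "__main__":
    example = [1, 1, 1, 1] + [0] * 16
    cut, ones, saved = best_initial_cluster(example)
    print(f"cut={cut}, ones in prefix={ones}, bits saved={saved:.2f}")
```

Under these assumptions, a positive number of bits saved means the clustered description is shorter than the null description, and a larger saving indicates stronger evidence of clustering near the start. A string with no enriched prefix returns `(0, 0, 0.0)`.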

Related Organizations

University of Maryland, Baltimore
United States
National Institutes of Health
United States
National Institute of Health
Pakistan
United States National Library of Medicine
United States
University of Maryland School of Medicine
United States

Keywords

Saccharomyces cerevisiae Proteins, Sequence Analysis, Protein, GTP Phosphohydrolase-Linked Elongation Factors, Cluster Analysis, Computational Biology, Research Articles

1g7c

Impact by BIP!

- Selected citations: 7
- Popularity: Top 10%
- Influence: Average
- Impulse: Average

Fields of Science

engineering and technology

medical engineering