
doi: 10.1007/bf02703680
pmid: 11927774
We analysed the genomes of representatives of the three kingdoms of life (archaea, eubacteria and eukaryota) using data mining tools based on compositional analyses of the protein sequences. The representatives chosen were Methanococcus jannaschii, Haemophilus influenzae and Saccharomyces cerevisiae. We identified features of the protein evolution patterns that are shared by the three genomes and features that distinguish them. M. jannaschii has a greater number of proteins rich in charged amino acids, whereas S. cerevisiae has a greater number of hydrophilic proteins. Despite these differences in intrinsic compositional characteristics between the proteins of the different genomes, we also identified certain common characteristics. We carried out exploratory Principal Component Analysis of the multivariate data on the proteins of each organism in an effort to classify the proteins into clusters. Interestingly, most of the proteins in each organism cluster closely together, but there are a few 'outliers'. We focus on the outliers for functional investigation, which may help reveal unique features of the biology of the respective organisms.
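The compositional analysis and exploratory Principal Component Analysis described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not say which compositional variables were used, so the sketch assumes plain 20-residue amino acid frequencies, treats D, E, K and R as the charged residues, finds only the first principal component by power iteration, and ranks proteins by distance from the median score. The function names and the toy sequences are hypothetical.

```python
import math
import random
import statistics

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
CHARGED = set("DEKR")  # assumption: Asp, Glu, Lys, Arg count as charged


def composition(seq):
    """Return the fraction of each of the 20 standard residues in seq."""
    seq = seq.upper()
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [c / total for c in counts]


def charged_fraction(seq):
    """Fraction of residues in seq that belong to CHARGED."""
    comp = composition(seq)
    return sum(f for a, f in zip(AMINO_ACIDS, comp) if a in CHARGED)


def center(rows):
    """Subtract the column means from every row."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_component(cov, iterations=500, seed=0):
    """Approximate the leading eigenvector of cov by power iteration."""
    rng = random.Random(seed)
    # Random start: a constant vector would be orthogonal to every
    # direction of variation, because compositions always sum to 1.
    v = [rng.random() for _ in cov]
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v


def rank_outliers(sequences):
    """Indices of sequences, most extreme first, by |PC1 score - median|."""
    centered = center([composition(s) for s in sequences])
    pc1 = first_component(covariance(centered))
    scores = [sum(x * y for x, y in zip(row, pc1)) for row in centered]
    med = statistics.median(scores)
    return sorted(range(len(scores)), key=lambda i: -abs(scores[i] - med))


# Toy data (hypothetical): five near-uniform sequences and one made
# only of charged residues, which should stand apart from the cluster.
toy_proteins = [
    "ACDEFGHIKLMNPQRSTVWY",
    "ACDEFGHIKLMNPQRSTVWYA",
    "ACDEFGHIKLMNPQRSTVWYL",
    "ACDEFGHIKLMNPQRSTVWYG",
    "ACDEFGHIKLMNPQRSTVWYS",
    "DEKRDEKRDEKRDEKRDEKR",
]
ranking = rank_outliers(toy_proteins)
```

On real data one would compute the composition of every predicted protein in a genome and inspect the top-ranked entries of `ranking`; a full analysis would normally examine more than one principal component.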
Saccharomyces cerevisiae Proteins, Archaeal Proteins, Methanococcus, Computational Biology, Genomics, Saccharomyces cerevisiae, Sequence Analysis, DNA, Haemophilus influenzae, Bacterial Proteins, Genome, Archaeal, Humans, Genome, Fungal, Genome, Bacterial
| Indicator | Description | Value |
| --- | --- | --- |
| Selected citations | Citations derived from selected sources. An alternative to the "Influence" indicator. | 5 |
| Popularity | The "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| Influence | The overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Top 10% |
| Impulse | The initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
