Open Access: Universidade do Minh...

PhageAnnotate viral genome classifier

Authors: Duarte, António Nuno Carrilho Canatário

Abstract

The advent of bacterial strains resistant to virtually all currently available antibiotic agents is indubitably alarming. Hence, alternative strategies for efficaciously combating such multi-drug resistant bacterial strains ought to be considered. Bacteriophages (phages), the most abundant biological entities on Earth, may play a key role in such a therapeutic revolution. Phage therapy exploits the evolutionary context of phages and bacteria: millions of years of host-parasite coevolution made phages lethal bacterial predators. Nonetheless, the process of “recruiting” phages that competently combat bacterial infections is profoundly dependent on a thorough understanding of their biology and safety. Yet, given the tremendous diversity of phage genomes, most of their genes cannot be assigned to functions via homology-based techniques, significantly hindering such fundamental insights. PhageAnnotate, a phage annotation system powered by machine learning (ML), constitutes an attempt to transcend these obstacles. In order to confidently assemble a robust computational tool, several model architectures were put to the test, namely Naïve, Linear, Hierarchical and Hierarchical-X. Such models, differing at both organizational and structural levels, were trained on a collection of 368,436 phage DNA sequences, and carry out a quite straightforward task: given a phage DNA sequence, assign it a label representing a functional role. Naïve and Linear each encompass a single gradient boosting (GB) model, differing solely in the organization and strictness of the labels concerning functional roles.
Hierarchical adds a layer of complexity to the problem at hand: prior to assigning functional roles to DNA sequences, it coarsely classifies them into one of six umbrella functional classes (i.e., DNA-modification, DNA-replication, lysis, lysogeny-repressor, packaging and structural); only then, and depending on the ascertained functional class, is a more fine-grained functional role labeling performed. Such structuring translates into the construction of seven ML models: one discerning functional classes and six distinguishing functional roles. Hierarchical-X behaves identically to Hierarchical, and stems from unsurely enlarging the sequence database utilized for training the ML models. A cautious evaluation of these model architectures dictated that PhageAnnotate ought to be embodied by Hierarchical. PhageAnnotate’s predictions are, as a result, guided by a GB model discerning functional classes, three GB models distinguishing functional roles pertaining to the functional classes DNA-modification, DNA-replication and structural, and three support vector machine (SVM) models discerning functional roles concerning the functional classes lysis, lysogeny-repressor and packaging. The F1 scores attained by these models, constituting proxy measures for their competency, were 87.57% (functional classes), 82.17% (DNA-modification), 83.38% (DNA-replication), 84.77% (structural), 97.30% (lysis), 83.72% (lysogeny-repressor) and 98.14% (packaging). A thorough assessment of PhageAnnotate, and its subsequent juxtaposition with current, well-established phage annotation tools, revealed the indisputable usefulness of the system, which is able to produce functional annotations that clearly stand out relative to those of its direct competitors.
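
The two-stage routing described for the Hierarchical architecture can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not describe how sequences are turned into features or how the models are invoked, so every name below (`HierarchicalAnnotator`, `toy_class_model`, the placeholder role labels) is hypothetical, and the toy callables merely stand in for the trained GB and SVM models.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Any callable mapping a DNA sequence to a label can stand in for a trained model.
Classifier = Callable[[str], str]

# The six umbrella functional classes named in the abstract.
FUNCTIONAL_CLASSES = (
    "DNA-modification", "DNA-replication", "lysis",
    "lysogeny-repressor", "packaging", "structural",
)


@dataclass
class HierarchicalAnnotator:
    """Hypothetical two-stage annotator: functional class first, then role."""
    class_model: Classifier             # stage 1: predicts a functional class
    role_models: Dict[str, Classifier]  # stage 2: one role model per class

    def __post_init__(self) -> None:
        # Fail early if any functional class lacks a role-level model.
        missing = set(FUNCTIONAL_CLASSES) - set(self.role_models)
        if missing:
            raise ValueError(f"no role model for: {sorted(missing)}")

    def annotate(self, sequence: str) -> Tuple[str, str]:
        # Stage 1: coarse functional class.
        functional_class = self.class_model(sequence)
        if functional_class not in self.role_models:
            raise ValueError(f"unknown functional class: {functional_class!r}")
        # Stage 2: route to the role model that belongs to that class.
        role = self.role_models[functional_class](sequence)
        return functional_class, role


def toy_class_model(sequence: str) -> str:
    """Toy stand-in (assumption): GC-rich sequences are called 'lysis'."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    return "lysis" if gc > len(seq) / 2 else "structural"


# Toy stand-ins (assumption): each role model returns a fixed placeholder label.
toy_role_models: Dict[str, Classifier] = {
    name: (lambda seq, name=name: f"{name}-role-placeholder")
    for name in FUNCTIONAL_CLASSES
}

annotator = HierarchicalAnnotator(toy_class_model, toy_role_models)
result = annotator.annotate("GGCCGGAT")
print(result)
```

Keeping the class-level model separate from the per-class role models lets each stage use a different learner, which matches the abstract's mix of GB and SVM models across functional classes.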

Country
Portugal
Keywords

Genome annotation, Machine learning, Bacteriophages
