ZENODO
Dataset . 2022
License: CC BY
Data sources: Datacite
Data sources: ZENODO

Datasets of the manuscript "Rational design of profile HMMs for sensitive and specific sequence detection with case studies applied to viruses, bacteriophages, and casposons"

Authors: Liliane S. Oliveira; Alejandro Reyes; Bas E. Dutilh; Arthur Gruber;

Abstract

DATASETS

Rational design of profile HMMs for sensitive and specific sequence detection with case studies applied to viruses, bacteriophages, and casposons

Liliane S. Oliveira, Alejandro Reyes, Bas E. Dutilh and Arthur Gruber*

* Correspondence: argruber@usp.br (AG); Tel. +55 11 3091 7274

Here we provide the Microviridae, Flavivirus and casposon data used throughout the work:

Microviridae folder
    conserved_HMMs - profile HMMs constructed with TABAJARA in Conservation mode for Microviridae
    discriminative_HMMs - profile HMMs constructed with TABAJARA in Discrimination mode for Microviridae
    sequences - different sequence datasets and respective multiple sequence alignments
        Microviridae_113-seq_training_set.fasta - 113 VP1 sequences covering the diversity of the Microviridae family
        Microviridae_113-seq.aln - multiple sequence alignment of the 113-protein dataset
        Microviridae_1836-seq_testset.fasta - dataset of 1,836 sequences of the major capsid protein (VP1), comprising 501 Alpavirinae sequences, 1,040 Gokushovirinae sequences and 295 Pichovirinae sequences
        Microviridae_1866-seq.aln - multiple sequence alignment of the 1,866-protein Microviridae dataset used in the experiment of Figure 4

Flavivirus folder
    conserved_HMMs - profile HMMs constructed with TABAJARA in Conservation mode for Flavivirus
    discriminative_HMMs - profile HMMs constructed with TABAJARA in Discrimination mode for Flavivirus
        full-length - models constructed from full-length protein sequences
        short - models constructed from selected short alignment blocks of the protein sequences
    sequences - different sequence datasets and respective multiple sequence alignments
        Flavivirus_127-seq_training_set.fasta - 127 polyprotein sequences covering the species diversity of the genus Flavivirus
        Flavivirus_127-seq.aln - multiple sequence alignment of the 127-protein dataset
        Flavivirus_6364-seq_testset.fasta - dataset of 6,364 sequences covering the species diversity of Flavivirus, including 3,919 of dengue virus (DENV), 327 of Zika virus (ZIKV), 63 of yellow fever virus (YFV), and the remaining 2,055 sequences covering other available flaviviruses
        Flavivirus_6364-seq.aln - multiple sequence alignment of the 6,364-protein Flavivirus dataset

Casposons folder
    casposon_generic_HMMs - profile HMMs constructed with TABAJARA in Discrimination mode for the generic detection of all casposons and their discrimination from CRISPRs
    casposon_family_discriminative_HMMs - profile HMMs constructed with TABAJARA in Discrimination mode for the specific discrimination among casposon families and from CRISPRs
    sequences - different sequence datasets and respective multiple sequence alignments
        casposons_crisprs.fasta - 106 bona fide Cas1 sequences derived from 52 CRISPRs and 54 casposons
        casposon_family_discrimination.aln - multiple sequence alignment of 52 bona fide CRISPR and 54 casposon sequences, with appropriate nomenclature to run TABAJARA for the discrimination of each casposon family
        casposons_crisprs_discrimination.aln - multiple sequence alignment of 52 bona fide CRISPR and 54 casposon sequences, with appropriate nomenclature to run TABAJARA for the discrimination of CRISPRs and casposons
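After downloading the files, the sequence counts stated in the description can be checked with a few lines of Python. The following is a minimal sketch, not part of the deposited data: the function names are hypothetical, the totals and breakdowns are copied from the description, and it assumes the FASTA files are plain text with one `>` header line per sequence.

```python
from typing import Dict, Iterable

# Totals and per-group breakdowns exactly as stated in the dataset description.
STATED = {
    "Microviridae_1836-seq_testset.fasta": (1836, [501, 1040, 295]),
    "Flavivirus_6364-seq_testset.fasta": (6364, [3919, 327, 63, 2055]),
    "casposons_crisprs.fasta": (106, [52, 54]),
}


def count_fasta_records(lines: Iterable[str]) -> int:
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in lines if line.startswith(">"))


def breakdowns_consistent() -> Dict[str, bool]:
    """Report, per file, whether the stated group counts add up to the stated total."""
    return {name: sum(parts) == total for name, (total, parts) in STATED.items()}


def file_matches_stated_total(path: str, expected: int) -> bool:
    """Compare the record count of a downloaded file (path is an assumption) with the stated total."""
    with open(path, encoding="utf-8") as handle:
        return count_fasta_records(handle) == expected
```

For example, `file_matches_stated_total("Microviridae_1836-seq_testset.fasta", 1836)` would compare a local copy of the file against the stated total; the path depends on where the archive was unpacked.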

Keywords

Profile HMMs, Microviridae, Flavivirus, casposons
