
Abstract

The aim of the UniProt Knowledgebase is to provide users with a comprehensive, high-quality and freely accessible set of protein sequences annotated with functional information. In this article, we describe significant updates that we have made over the last two years to the resource. The number of sequences in UniProtKB has risen to approximately 190 million, despite continued work to reduce sequence redundancy at the proteome level. We have adopted new methods of assessing proteome completeness and quality. We continue to extract detailed annotations from the literature to add to reviewed entries and supplement these in unreviewed entries with annotations provided by automated systems such as the newly implemented Association-Rule-Based Annotator (ARBA). We have developed a credit-based publication submission interface to allow the community to contribute publications and annotations to UniProt entries. We describe how UniProtKB responded to the COVID-19 pandemic through expert curation of relevant entries that were rapidly made available to the research community through a dedicated portal. UniProt resources are available under a CC-BY (4.0) license via the web at https://www.uniprot.org/.
Proteomics, COVID-19 / virology, Computational Biology / methods, Molecular Sequence Annotation / methods, Proteome, Knowledge Bases, Proteomics / methods, SARS-CoV-2 / physiology, COVID-19 / epidemiology, User-Computer Interface, Viral Proteins, SARS-CoV-2 / genetics, Data Curation / methods, Genetics, Viral Proteins / genetics, Database Issue, Humans, SARS-CoV-2 / metabolism, Databases, Protein, Pandemics, Data Curation, Proteome / metabolism, Proteome / genetics, Internet, 616.0757, SARS-CoV-2, COVID-19 / prevention & control, COVID-19, Computational Biology, Molecular Sequence Annotation, Viral Proteins / metabolism, ddc: ddc:616.0757
