
The Human Reference Genome serves as the foundation for modern genomic analyses. However, in its present form, it does not adequately represent the vast genetic diversity of the human population. In this study, we explored the consensus genome as a potential successor to the current reference genome and assessed its effect on the accuracy of RNA-seq read alignment. To find the best haploid genome representation, we constructed consensus genomes at the pan-human, superpopulation, and population levels, using variant information from The 1000 Genomes Project Consortium. Using personal haploid genomes as the ground truth, we compared mapping errors for real RNA-seq reads aligned to the consensus genomes versus the reference genome. For reads overlapping homozygous variants, we found that the mapping error decreased by a factor of approximately two to three when the reference was replaced with the pan-human consensus genome. We also found that using more population-specific consensus genomes resulted in little to no further improvement over the pan-human consensus, suggesting a limit to the utility of incorporating more specific genomic variation. Replacing the reference with consensus genomes affects functional analyses, such as differential expression of isoforms, genes, and splice junctions.
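As a rough illustration of what constructing a consensus genome can involve, the sketch below substitutes the alternate allele into a reference sequence wherever that allele is the more common one in the chosen population. This is a minimal sketch under stated assumptions, not the pipeline used in the study: the `Snv` record, the `build_consensus` function, and the 0.5 frequency threshold are hypothetical names and choices introduced here, and only single-nucleotide variants are handled (insertions and deletions, which shift coordinates, are left out).

```python
from typing import Iterable, NamedTuple


class Snv(NamedTuple):
    """Hypothetical record for one single-nucleotide variant."""
    pos: int         # 0-based position in the reference sequence
    ref: str         # reference base at that position
    alt: str         # alternate base
    alt_freq: float  # alternate allele frequency in the chosen population


def build_consensus(reference: str, variants: Iterable[Snv],
                    threshold: float = 0.5) -> str:
    """Return a copy of `reference` carrying the major allele at each SNV.

    Assumption: an alternate allele replaces the reference base only when
    its population frequency exceeds `threshold`.
    """
    seq = list(reference)
    for v in variants:
        # Guard against variants that do not match the given reference.
        if reference[v.pos] != v.ref:
            raise ValueError(f"reference mismatch at position {v.pos}")
        if v.alt_freq > threshold:
            seq[v.pos] = v.alt
    return "".join(seq)


if __name__ == "__main__":
    toy_reference = "ACGTAC"
    toy_variants = [
        Snv(pos=1, ref="C", alt="T", alt_freq=0.8),  # alt is the major allele
        Snv(pos=4, ref="A", alt="G", alt_freq=0.2),  # ref stays the major allele
    ]
    print(build_consensus(toy_reference, toy_variants))
```

Under this scheme, pan-human, superpopulation, and population consensus sequences would differ only in which allele frequencies are passed in.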
570, Genome, Consensus, Genome, Human, Human Genome, Method, Genomics, 3105 Genetics, 576, 3102 Bioinformatics and Computational Biology, anzsrc-for: 3105 Genetics, Cancer Genomics, anzsrc-for: 11 Medical and Health Sciences, Exome Sequencing, Genetics, anzsrc-for: 06 Biological Sciences, Humans, RNA-Seq, anzsrc-for: 31 Biological Sciences, anzsrc-for: 3102 Bioinformatics and Computational Biology, 31 Biological Sciences, Cancer, Human
