
doi: 10.3390/f10100826
Research Highlights: A rigorous genome survey helped us to estimate the genomic characteristics, remove the DNA contamination, and determine the sequencing scheme of Betula platyphylla. Background and Objectives: B. platyphylla is a common tree species in northern China that has high economic and medicinal value. However, there is a lack of complete genomic information for this species, which severely constrains the progress of relevant research. The objective of this study was to survey the genome of B. platyphylla and determine the large-scale sequencing scheme of this species. Materials and Methods: Next-generation sequencing was used to survey the genome. The genome size, heterozygosity rate, and repetitive sequences were estimated by k-mer analysis. After preliminary genome assembly, sequence contamination was identified and filtered by sequence alignment. Finally, we obtained sterilized plantlets of B. platyphylla by plant tissue culture, which can be used for third-generation sequencing. Results: We estimated the genome size to be 432.9 Mb and the heterozygosity rate to be 1.22%, with repetitive sequences accounting for 62.2%. Bacterial contamination was observed in the leaves taken from the field, and most of the contaminants may be from the genus Mycobacterium. A total of 249,784 simple sequence repeat (SSR) loci were also identified in the B. platyphylla genome. Among the SSRs, only 11,326 can be used as candidates to distinguish the three Betula species. Conclusions: The B. platyphylla genome is complex and highly heterozygous and repetitive. Higher-depth third-generation sequencing may yield better assembly results. Sterilized plantlets can be used for sequencing to avoid contamination.
genome survey sequencing, sequence contamination, <i>Betula platyphylla</i>, SSR
genome survey sequencing, sequence contamination, <i>Betula platyphylla</i>, SSR
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 12 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |
