This dataset consists of a tarball archive containing 8,192 tab-delimited files (one per 7-mer sequence motif). Each file contains information about the status or value of 15 different genomic features at every possible site in hg19 centered at the 7-mer sequence motif indicated in the filename, or at the reverse complement of that motif (e.g., ACGATGC_annotated.txt includes information for sites at 5’-ACGATGC-3’ motifs and sites at 5’-GCATCGT-3’ motifs).

Each file contains the following columns:

AT_CG [indicator if site carries an A>C or T>G singleton (1) or not (0) in the BRIDGES data]
AT_GC [indicator if site carries an A>G or T>C singleton (1) or not (0) in the BRIDGES data]
AT_TA [indicator if site carries an A>T or T>A singleton (1) or not (0) in the BRIDGES data]
GC_AT [indicator if site carries a G>A or C>T singleton (1) or not (0) in the BRIDGES data]
GC_CG [indicator if site carries a G>C or C>G singleton (1) or not (0) in the BRIDGES data]
GC_TA [indicator if site carries a G>T or C>A singleton (1) or not (0) in the BRIDGES data]
DP [average depth of coverage at site]
H3K4me1 [indicator if site is within an H3K4me1 broad peak (1) or not (0)]
H3K4me3 [indicator if site is within an H3K4me3 broad peak (1) or not (0)]
H3K9ac [indicator if site is within an H3K9ac broad peak (1) or not (0)]
H3K9me3 [indicator if site is within an H3K9me3 broad peak (1) or not (0)]
H3K27ac [indicator if site is within an H3K27ac broad peak (1) or not (0)]
H3K27me3 [indicator if site is within an H3K27me3 broad peak (1) or not (0)]
H3K36me3 [indicator if site is within an H3K36me3 broad peak (1) or not (0)]
EXON [indicator if site is within an exon (1) or not (0)]
CpGI [indicator if site is within a CpG island (1) or not (0)]
RR [average recombination rate in the 10kb window centered at the site]
LAMIN [indicator if site is within a Lamin-Associated Domain (1) or not (0)]
DHS [indicator if site is within a DNase Hypersensitive region (1) or not (0)]
TIME [average recombination rate in the 10kb window centered at the site]
GC [average GC content in the 10kb window centered at the site]

Note that the chromosome and position of each site have been removed to protect sample privacy.

Each file is then passed to an R script (available at https://github.com/carjed/smaug-genetics) to estimate the effect of each feature on the relative mutation rate using a logistic regression model (e.g., AT_GC ~ DP + ... + GC). Each of the features used is available from data in the public domain; the provenance of these features is described in the associated paper, and additional scripts for processing the feature data can be found at https://github.com/carjed/smaug-genetics.

The BRIDGES whole-genome sequencing study is described at https://doi.org/10.1101/108290
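
As a rough illustration of the file layout described above, the following Python sketch shows how a 7-mer pairs with its reverse complement, why that pairing gives 8,192 files, and how a model formula of the form shown above might be assembled from the column names. It is not the authors' code (their analysis is the R script in the linked repository). All function and variable names here are hypothetical, and filling the elided "..." in the formula with the remaining features in listed order is an assumption.

```python
from itertools import product

# Column names as listed in the description: six singleton indicators
# (candidate response variables) followed by fifteen genomic features.
MUTATION_TYPES = ["AT_CG", "AT_GC", "AT_TA", "GC_AT", "GC_CG", "GC_TA"]
FEATURES = [
    "DP", "H3K4me1", "H3K4me3", "H3K9ac", "H3K9me3", "H3K27ac",
    "H3K27me3", "H3K36me3", "EXON", "CpGI", "RR", "LAMIN", "DHS",
    "TIME", "GC",
]

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Return the reverse complement of a DNA motif (uppercase A/C/G/T)."""
    return motif.translate(_COMPLEMENT)[::-1]


def strand_pair(motif):
    """Group a motif with its reverse complement, as each file does."""
    return frozenset((motif, reverse_complement(motif)))


def count_strand_pairs(k=7):
    """Count distinct motif/reverse-complement pairs among all k-mers."""
    return len({strand_pair("".join(p)) for p in product("ACGT", repeat=k)})


def model_formula(response):
    """Build an R-style formula string for one mutation type.

    ASSUMPTION: the "..." in the description's example formula stands for
    the remaining features, in the order the columns are listed.
    """
    if response not in MUTATION_TYPES:
        raise ValueError(f"unknown mutation type: {response}")
    return f"{response} ~ " + " + ".join(FEATURES)


if __name__ == "__main__":
    print(reverse_complement("ACGATGC"))  # GCATCGT
    print(count_strand_pairs(7))          # 8192
    print(model_formula("AT_GC"))
```

There are 4^7 = 16,384 possible 7-mers, and no odd-length motif can equal its own reverse complement, so collapsing each motif with its reverse complement yields exactly half that number, matching the 8,192 files in the archive.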