
This is the code for the paper https://www.nature.com/articles/s42256-024-00872-0. Python was used for the model, performance assessment and data generation. R was used for scripting and data visualisation. All input data for the R scripts are provided separately, so that the data-intensive and computationally intensive steps do not have to be repeated. For the Python code, the folder finetuning_tasks was split into four folders because of upload problems; these have to be combined again after decompression. A tutorial on how to use GROVER as a foundation model can be found at: https://doi.org/10.5281/zenodo.8373159. The pretrained model can be found at: https://doi.org/10.5281/zenodo.8373117. The data for the tokenised genome are at: https://doi.org/10.5281/zenodo.8373053.
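Recombining the four parts of finetuning_tasks after decompression could be sketched as follows. This is a minimal sketch, not part of the published code: the helper merge_parts and the part folder names in the usage comment are hypothetical, since the record does not state what the four decompressed folders are called. It also assumes the parts hold disjoint files; a file with the same relative path in two parts would be overwritten by the later one.

```python
import shutil
from pathlib import Path


def merge_parts(part_dirs, dest):
    """Copy the contents of every part directory into one destination folder.

    merge_parts is a hypothetical helper written for illustration.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    for part in part_dirs:
        # dirs_exist_ok=True (Python 3.8+) lets a later part add files to
        # subfolders that an earlier part has already created.
        shutil.copytree(part, dest, dirs_exist_ok=True)
    return dest


# Usage (the part folder names below are assumptions, adjust to the real ones):
# merge_parts(
#     ["finetuning_tasks_1", "finetuning_tasks_2",
#      "finetuning_tasks_3", "finetuning_tasks_4"],
#     "finetuning_tasks",
# )
```

Copying rather than moving leaves the decompressed parts untouched, so the merge can be repeated if something goes wrong.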
GROVER, DNA Language Models
