
This Supplementary Data provides the reference proteome sequences of 50 species and their corresponding frameshifted sequences in .fasta format, in the folders reference and frameshifted, respectively. The proteome accessions are listed in the file proteome_list.csv. Raw repeat region predictions from the MRF tool are provided in the files MRF_reference.csv and MRF_frameshifted.csv. Intrinsically disordered regions predicted by the IUPred tool that overlap with repeat regions predicted by the MRF tool are provided in the files overlap_IUPred_reference.csv (reference proteomes) and overlap_IUPred_frameshifted.csv (frameshifted sequences).

Column headers in the MRF_reference.csv and MRF_frameshifted.csv files:
Proteome_id - UniProtKB accession number of the proteome
id - RefSeq/Ensembl ID of the sequence/cDNA with a custom extension (described below)
start - start position of the repeat region
end - end position of the repeat region
repeat_len - length of the repeat unit
repeat_no - number of repeat units

Structure of the id field: for example, in the identifier NP_001245494.2_ORF-F-1_0-3832_3833, NP_001245494.2 is the RefSeq ID, ORF indicates the open reading frame, F-1 gives the reading frame (with 1 representing the reference frame), 0-3832 gives the start and end positions of the sequence, and 3833 is the total sequence length. In frameshifted sequences the format is the same, but the frame is given as F-2 or F-3, corresponding to frame -1 or +1, respectively. For instance, NP_001245494.2_ORF-F-2_720-801_82 refers to a region in the -1 frameshift.

Column headers in the overlap_IUPred_reference.csv and overlap_IUPred_frameshifted.csv files:
Proteome_id - UniProtKB accession number of the proteome
ID - RefSeq/Ensembl ID of the sequence/cDNA with a custom extension
mrf_start - start position of the repeat region
mrf_end - end position of the repeat region
repeat_len - length of the repeat unit
repeat_no - number of repeat units
iupred_start - start position of the intrinsically disordered region
Iupred_end - end position of the intrinsically disordered region
overlap_length - length of the overlap between the repeat region and the intrinsically disordered region
overlap_start - start position of the overlap between the repeat region and the intrinsically disordered region
overlap_end - end position of the overlap between the repeat region and the intrinsically disordered region
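
The identifier layout and the overlap columns described above can be illustrated with a short Python sketch. It is not part of the dataset or of the MRF or IUPred tools: the names parse_orf_id, OrfId and inclusive_overlap are hypothetical, and the regular expression is inferred only from the two example identifiers given here. Because 0-3832 corresponds to a length of 3833, the sketch treats positions as inclusive; that the overlap columns follow the same convention is an assumption.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Pattern inferred from the two example identifiers in this description
# (assumption: every identifier in the CSV files follows the same layout).
_ID_PATTERN = re.compile(
    r"^(?P<accession>.+)_ORF-F-(?P<frame>[123])_"
    r"(?P<start>\d+)-(?P<end>\d+)_(?P<length>\d+)$"
)

# F-1 is the reference frame; F-2 and F-3 correspond to the -1 and +1
# frameshifts, as stated in the description.
FRAME_TO_SHIFT = {1: 0, 2: -1, 3: 1}


class OrfId(NamedTuple):
    accession: str  # RefSeq/Ensembl ID
    frame: int      # 1, 2 or 3, as written after "F-"
    shift: int      # 0 (reference), -1 or +1
    start: int
    end: int
    length: int


def parse_orf_id(identifier: str) -> OrfId:
    """Split an identifier such as NP_001245494.2_ORF-F-1_0-3832_3833."""
    match = _ID_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"unrecognised identifier: {identifier!r}")
    frame = int(match["frame"])
    return OrfId(
        accession=match["accession"],
        frame=frame,
        shift=FRAME_TO_SHIFT[frame],
        start=int(match["start"]),
        end=int(match["end"]),
        length=int(match["length"]),
    )


def inclusive_overlap(
    a_start: int, a_end: int, b_start: int, b_end: int
) -> Optional[Tuple[int, int, int]]:
    """Return (overlap_start, overlap_end, overlap_length) for two regions
    with inclusive coordinates, or None if they do not overlap."""
    start = max(a_start, b_start)
    end = min(a_end, b_end)
    if start > end:
        return None
    return start, end, end - start + 1
```

Under these assumptions, parse_orf_id("NP_001245494.2_ORF-F-2_720-801_82") yields a record with shift -1 and length 82, and calling inclusive_overlap with a row's mrf_start, mrf_end, iupred_start and Iupred_end values gives a quick consistency check against the overlap_start, overlap_end and overlap_length columns.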
