
The aim was to use a novel computational approach, the Key-string Algorithm (KSA), for the identification and analysis of arbitrarily large repetitive sequences and higher-order repeats (HORs) in noncoding DNA. The approach is based on a key string that plays the role of an arbitrarily constructed "computer enzyme". A cluster of novel KSA-related methods was introduced and developed by combining computations on a very modest scale, visual inspection, and graphical display of the results. Sequence analysis software was developed, containing seven programs for KSA-related analyses. The approach was demonstrated in a case study of alpha satellites and HORs in the human genetic sequence AC017075.8 (193277 bp) from the centromeric region of human chromosome 7. The KSA segmentation method was applied using the key strings DCCGTTT, GTA, and TTTC. Fifty-five copies of the 2734-bp 16mer HOR were identified and investigated, and a start-string TTTTTTAAAAA was identified. The HOR-matrix was constructed and employed for graphical display of mutations. KSA identification of HORs in AC017075.8 was compared with the output of RepeatMasker and Tandem Repeats Finder, both of which identified alpha monomers in AC017075.8 but not the HORs. On the basis of the KSA study, centromere folding was described as an effect of HORs and super-HORs (3 × 2734 bp) in AC017075.8. The following novel KSA-based computational methods, easy to use and intended for computational "pedestrians", were demonstrated: the color-HOR diagram, the KSA-divergence method, the 171-bp subsequence-convergence diagram, and the total frequency distribution of key-string subsequence lengths.
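The segmentation idea described above can be illustrated with a minimal sketch. It assumes that the key string acts like a restriction enzyme that cuts the sequence immediately before each non-overlapping occurrence of the key string; the function names `ksa_segment` and `length_distribution` and the toy sequence are hypothetical and are not taken from the authors' software.

```python
from collections import Counter


def ksa_segment(sequence, key_string):
    """Cut `sequence` immediately before each non-overlapping occurrence
    of `key_string` (assumed cutting rule, for illustration only).

    Concatenating the returned fragments restores the input sequence.
    """
    if not key_string:
        raise ValueError("key_string must be non-empty")
    cut_points = []
    pos = sequence.find(key_string)
    while pos != -1:
        cut_points.append(pos)
        # Continue the search after the current match (non-overlapping).
        pos = sequence.find(key_string, pos + len(key_string))
    # Fragment boundaries: sequence start, every interior cut, sequence end.
    bounds = [0] + [p for p in cut_points if p > 0] + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def length_distribution(fragments):
    """Frequency distribution of fragment lengths."""
    return Counter(len(fragment) for fragment in fragments)


# Toy input: a 2-bp leader followed by four copies of a 10-bp unit
# that begins with the key string GTA.
monomer = "GTA" + "C" * 7
toy = "TT" + monomer * 4
fragments = ksa_segment(toy, "GTA")
print(length_distribution(fragments))
```

In this toy input the four 10-bp fragments dominate the length distribution, which is the kind of peak that would point to a repeat unit; in real alpha satellite data a peak near 171 bp would be expected for a key string that occurs once per monomer.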
The results were supplemented by Fast Fourier Transform, employing a novel mapping of the symbolic genomic sequence into a numerical sequence. The KSA approach offers a simple and robust framework for a wide range of investigations of large repetitive sequences and HORs, involving a very modest scope of computations that can be carried out on a PC. As the KSA method is HOR-oriented, identifying HORs is even easier than identifying the underlying alpha monomer itself. The approach allows easy identification of point mutations, insertions, and deletions with respect to the consensus. It may be useful in a wide range of investigations, with applications in forensic medicine, medical diagnosis of malignant diseases, biological evolution, and paleontology.
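A Fourier analysis of a symbolic sequence can be sketched as follows. The abstract does not specify the authors' novel symbolic-to-numerical mapping, so this sketch substitutes a different, commonly used one: four binary indicator sequences, one per base. It also uses a direct discrete Fourier transform rather than an FFT, which yields the same spectrum but is practical only for short inputs. The function names are hypothetical.

```python
import cmath


def indicator_power_spectrum(sequence):
    """Summed DFT power spectrum of the four base-indicator sequences.

    Assumed mapping (not the paper's): for each base, positions holding
    that base map to 1.0 and all other positions map to 0.0.
    Returns the power at frequency indices 0 .. n // 2.
    """
    n = len(sequence)
    power = [0.0] * (n // 2 + 1)
    for base in "ACGT":
        signal = [1.0 if ch == base else 0.0 for ch in sequence]
        for k in range(n // 2 + 1):
            coeff = sum(
                x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal)
            )
            power[k] += abs(coeff) ** 2
    return power


def dominant_period(sequence):
    """Period (in bases) of the strongest nonzero-frequency component."""
    power = indicator_power_spectrum(sequence)
    k = max(range(1, len(power)), key=power.__getitem__)
    return len(sequence) / k


# Toy input: eight tandem copies of a 5-bp unit.
print(dominant_period("ACGTT" * 8))
```

For a tandem array, the strongest peak is expected at the repeat period or one of its harmonics, so a spectrum of this kind can complement the fragment-length distribution obtained by segmentation.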
algorithms; alpha satellite DNA; animals; base sequence; centromere; chromosome mapping; chromosomes; chromosomes, human, pair 7; computational biology; DNA, satellite; Fourier analysis; humans; molecular biology; molecular sequence data; nucleic acid conformation; nucleic acids; point mutation; polymerase chain reaction; repetitive sequences; sensitivity and specificity; sequence analysis, DNA; tandem repeat sequences
