Repeated sequence (DNA)

Repeated sequences (also known as repetitive elements, repeating units or repeats) are short or long patterns of nucleic acids (DNA or RNA) that occur in multiple copies throughout the genome. In many organisms, a significant fraction of the genomic DNA is repetitive; in humans, over two-thirds of the sequence consists of repetitive elements.[1] Some of these repeated sequences are necessary for maintaining important genome structures such as telomeres or centromeres.[2]

Repeated sequences are categorized into different classes depending on features such as structure, length, location, origin, and mode of multiplication. Repetitive elements are arranged in the genome either as directly adjacent arrays, called tandem repeats, or as copies dispersed throughout the genome, called interspersed repeats.[3] Tandem repeats and interspersed repeats are further categorized into subclasses based on the length of the repeated sequence and/or the mode of multiplication.
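The idea of a tandem repeat (a short unit copied back to back) can be illustrated with a small script. This is only a minimal sketch: the function name `find_tandem_repeats` and its parameters are hypothetical, the thresholds are arbitrary, and it finds only perfect, uninterrupted repeats, unlike dedicated repeat-finding software.

```python
import re

def find_tandem_repeats(seq, min_unit=1, max_unit=6, min_copies=3):
    """Return (start, unit, copies) for each perfect tandem repeat in seq.

    Hypothetical helper for illustration: a unit of min_unit..max_unit bases
    must be repeated back to back at least min_copies times.
    """
    # Lazy unit match, so the shortest repeating unit is reported
    # (e.g. "CAG" rather than "CAGCAG").
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits

# A CAG unit repeated four times, flanked by non-repetitive sequence.
print(find_tandem_repeats("GGTCAGCAGCAGCAGTTA"))
```

Interspersed repeats cannot be found this way, because their copies are separated by unrelated sequence; detecting them generally requires comparing distant regions of the genome.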


While some repeated DNA sequences are important for cellular functioning and genome maintenance, other repetitive sequences can be harmful. Many repetitive DNA sequences have been linked to human diseases such as Huntington's disease and Friedreich's ataxia. Some repetitive elements are neutral and occur when there is an absence of selection for specific sequences depending on how transposition or crossing over occurs.[2] However, an abundance of neutral repeats can still influence genome evolution as they accumulate over time. Overall, repeated sequences are an important area of focus because they can provide insight into human diseases and genome evolution.[2]

History

In the 1950s, Barbara McClintock first observed DNA transposition and illustrated the functions of the centromere and telomere at the Cold Spring Harbor Symposium.[4] McClintock's work set the stage for the discovery of repeated sequences, because transposition, centromere structure, and telomere structure all depend on repetitive elements, although this was not fully understood at the time. The term "repeated sequence" was first used by Roy John Britten and D. E. Kohne in 1968; through DNA reassociation experiments, they found that more than half of the DNA in eukaryotic genomes was repetitive.[5] Although repetitive DNA sequences were conserved and ubiquitous, their biological role was still unknown. In the 1990s, more research was conducted to elucidate the evolutionary dynamics of minisatellite and microsatellite repeats because of their importance in DNA-based forensics and molecular ecology. Dispersed DNA repeats were increasingly recognized as a potential source of genetic variation and regulation. The discovery of diseases caused by deleterious repetitive DNA stimulated further interest in this area of study.[6] In the 2000s, data from full eukaryotic genome sequencing enabled the identification of promoters, enhancers, and regulatory RNAs that are encoded by repetitive regions. Today, the structural and regulatory roles of repetitive DNA sequences remain an active area of research.

Biotechnology

Repetitive DNA is hard to sequence using next-generation sequencing techniques because sequence assembly from short reads cannot determine the length of a repetitive region that is longer than the reads themselves. This issue is particularly serious for microsatellites, which are made of tiny 1-6 bp repeat units.[43] Although they are difficult to sequence, these short repeats have great value in DNA fingerprinting and evolutionary studies. Because of these technical limitations, many researchers have historically left out repetitive sequences when analyzing and publishing whole-genome data.[44]
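The assembly problem can be made concrete with a toy example. The sketch below uses made-up flanking sequences and a hypothetical helper `read_set`, and it ignores read depth and sequencing errors. It shows that two alleles differing only in repeat copy number yield exactly the same set of short reads, so the reads alone cannot distinguish them; reads long enough to span the whole repeat can.

```python
def read_set(genome, read_len):
    """All distinct error-free reads of length read_len (hypothetical helper)."""
    return {genome[i:i + read_len] for i in range(len(genome) - read_len + 1)}

# Made-up flanking sequences around a CA microsatellite.
left, right = "GGATCCTT", "TTGAATCC"
short_allele = left + "CA" * 10 + right   # 20 bp of repeat
long_allele = left + "CA" * 15 + right    # 30 bp of repeat

# Reads shorter than the repeat: both alleles give identical read sets.
print(read_set(short_allele, 8) == read_set(long_allele, 8))    # True

# Reads long enough to span the shorter repeat plus flanks: the sets differ.
print(read_set(short_allele, 30) == read_set(long_allele, 30))  # False
```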


Bustos et al. proposed one method of sequencing long stretches of repetitive DNA.[43] The method combines a linear vector, for stabilization, with exonuclease III, for deletion of continuous regions rich in simple sequence repeats (SSRs). First, SSR-rich fragments are cloned into a linear vector that can stably incorporate tandem repeats of up to 30 kb. Transcriptional terminators in the vector prevent expression of the repeats. The second step involves the use of exonuclease III. The enzyme removes nucleotides from the 3' end, which produces unidirectional deletions of the SSR fragments. Finally, the products carrying these deletions are amplified and analyzed with colony PCR. The sequence is then built by ordered sequencing of a set of clones containing different deletions.
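The logic of the final step, rebuilding a long repeat from an ordered series of deletion clones, can be sketched in a few lines. This is a toy model and not the authors' protocol: the function names are hypothetical, and it assumes, purely for illustration, that each clone's deletion size is known exactly and that every clone is read without errors for a fixed length starting at its deletion endpoint.

```python
def nested_deletion_reads(insert, step, read_len):
    """Simulate clones that have lost 0, step, 2*step, ... bases from one end.

    Each clone yields (offset, read), where the read starts at the deletion
    endpoint. Exact, known offsets are an assumption of this sketch.
    """
    return [(offset, insert[offset:offset + read_len])
            for offset in range(0, len(insert), step)]

def assemble_ordered(reads):
    """Rebuild the insert by placing each read at its known offset."""
    assembled = ""
    for offset, read in sorted(reads):
        if offset > len(assembled):
            raise ValueError("gap between consecutive deletion clones")
        overlap = len(assembled) - offset
        if assembled[offset:] != read[:overlap]:
            raise ValueError("reads disagree in their overlap")
        assembled += read[overlap:]
    return assembled

# An 80 bp GA repeat with made-up flanks; reads (30 bp) are shorter than the repeat.
insert = "GATTC" + "GA" * 40 + "CCTAG"
reads = nested_deletion_reads(insert, step=20, read_len=30)
print(assemble_ordered(reads) == insert)
```

Because each read is placed by its clone's position in the deletion series rather than by sequence overlap alone, the repeat length is recovered even though no single read spans it.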
