DNA methylation

DNA methylation is a biological process by which methyl groups are added to the DNA molecule. Methylation can change the activity of a DNA segment without changing the sequence. When located in a gene promoter, DNA methylation typically acts to repress gene transcription. In mammals, DNA methylation is essential for normal development and is associated with a number of key processes including genomic imprinting, X-chromosome inactivation, repression of transposable elements, aging, and carcinogenesis.

As of 2016, two nucleobases have been found on which natural, enzymatic DNA methylation takes place: adenine and cytosine. The modified bases are N6-methyladenine,[1] 5-methylcytosine[2] and N4-methylcytosine.[3]


Cytosine methylation is widespread in both eukaryotes and prokaryotes, even though the rate of cytosine DNA methylation can differ greatly between species: 14% of cytosines are methylated in Arabidopsis thaliana, 4% to 8% in Physarum,[4] 7.6% in Mus musculus, 2.3% in Escherichia coli, 0.03% in Drosophila; methylation is essentially undetectable in Dictyostelium;[5][6] and virtually absent (0.0002 to 0.0003%) from Caenorhabditis[7] or fungi such as Saccharomyces cerevisiae and S. pombe (but not N. crassa).[8][9]: 3699  Adenine methylation has been observed in bacterial, plant, and recently in mammalian DNA,[10][11] but has received considerably less attention.


Methylation of cytosine to form 5-methylcytosine occurs at the same 5 position on the pyrimidine ring where the DNA base thymine's methyl group is located; the same position distinguishes thymine from the analogous RNA base uracil, which has no methyl group. Spontaneous deamination of 5-methylcytosine converts it to thymine, which results in a T:G mismatch. Repair mechanisms then either correct it back to the original C:G pair or substitute A for G, turning the original C:G pair into a T:A pair, effectively changing a base and introducing a mutation. Because thymine is a normal DNA base, this misincorporated base will not be corrected during DNA replication. If the mismatch is not repaired and the cell enters the cell cycle, the strand carrying the T will be complemented by an A in one of the daughter cells, such that the mutation becomes permanent. The near-universal use of thymine exclusively in DNA and uracil exclusively in RNA may have evolved as an error-control mechanism, to facilitate the removal of uracils generated by the spontaneous deamination of cytosine.[12] DNA methylation, as well as many of its contemporary DNA methyltransferases, is thought to have evolved from primitive early-world RNA methylation activity, a view supported by several lines of evidence.[13]
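The fate of an unrepaired T:G mismatch can be traced with a small sketch. This is only an illustrative toy model: the six-base sequence, the function names, and the choice to write the bottom strand in the same index order as the top strand are all assumptions made for the example, not part of any real pipeline.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Base-by-base complement, kept in the same index order (no reversal)."""
    return "".join(COMPLEMENT[base] for base in strand)


def deaminate_5mc(strand, position):
    """Model deamination of a 5-methylcytosine at `position` as C -> T."""
    if strand[position] != "C":
        raise ValueError("position does not hold a cytosine")
    return strand[:position] + "T" + strand[position + 1:]


def replicate(top, bottom):
    """Semi-conservative replication: each parental strand templates a new partner."""
    daughter_from_top = (top, complement(top))
    daughter_from_bottom = (complement(bottom), bottom)
    return daughter_from_top, daughter_from_bottom


# Hypothetical duplex; the C at index 2 is assumed to be methylated.
top = "ATCGGA"
bottom = complement(top)

# Deamination leaves a T opposite the original G (a T:G mismatch).
damaged_top = deaminate_5mc(top, 2)

# Replication without repair: one daughter fixes the mutation as T:A,
# the other daughter still carries the original C:G pair.
mutant_daughter, normal_daughter = replicate(damaged_top, bottom)
print(mutant_daughter)
print(normal_daughter)
```

Only the daughter that inherits the damaged strand carries the T:A pair, which matches the statement that the mutation becomes permanent in one of the two daughter cells.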


In plants and other organisms, DNA methylation is found in three different sequence contexts: CG (or CpG), CHG and CHH (where H corresponds to A, T or C). In mammals, however, DNA methylation is almost exclusively found in CpG dinucleotides, with the cytosines on both strands usually being methylated. Non-CpG methylation can nevertheless be observed in embryonic stem cells,[14][15][16] and has also been indicated in neural development.[17] Non-CpG methylation has also been observed in hematopoietic progenitor cells, where it occurred mainly in a CpApC sequence context.[18]
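The three sequence contexts can be told apart by looking at the two bases that follow a cytosine. The sketch below is a minimal illustration that scans only the given (top) strand; the function names and the example sequence are assumptions made for the example, and cytosines on the opposite strand would need the reverse complement to be scanned as well.

```python
from collections import Counter


def cytosine_context(seq, i):
    """Return 'CG', 'CHG' or 'CHH' for the cytosine at index i, else None.

    None is returned when seq[i] is not C, when too few downstream bases
    are available, or when a downstream base is ambiguous (e.g. N).
    """
    seq = seq.upper()
    if seq[i] != "C":
        return None
    downstream = seq[i + 1:i + 3]
    if len(downstream) >= 1 and downstream[0] == "G":
        return "CG"
    if len(downstream) < 2 or downstream[0] not in "ACT":
        return None
    if downstream[1] == "G":
        return "CHG"
    if downstream[1] in "ACT":
        return "CHH"
    return None


def count_contexts(seq):
    """Count cytosines per context along one strand."""
    counts = Counter()
    for i in range(len(seq)):
        context = cytosine_context(seq, i)
        if context is not None:
            counts[context] += 1
    return dict(counts)


# Hypothetical example sequence.
print(count_contexts("ACGTCAGCTTC"))
```

The final cytosine of the example has no downstream bases, so it is left unclassified rather than guessed.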

In plants

Significant progress has been made in understanding DNA methylation in the model plant Arabidopsis thaliana. DNA methylation in plants differs from that of mammals: while DNA methylation in mammals mainly occurs on the cytosine nucleotide in a CpG site, in plants the cytosine can be methylated at CpG, CpHpG, and CpHpH sites, where H represents any nucleotide except guanine.[74] Overall, Arabidopsis DNA is highly methylated: mass spectrometry analysis estimated 14% of cytosines to be modified.[9]: abstract  Later, bisulfite sequencing data estimated that around 25% of Arabidopsis CG sites are methylated, but these levels vary with the geographic origin of Arabidopsis accessions (plants in the north are more highly methylated than southern accessions).[75]


The principal Arabidopsis DNA methyltransferase enzymes, which transfer and covalently attach methyl groups onto DNA, are DRM2, MET1, and CMT3. The DRM2 and MET1 proteins share significant homology with the mammalian methyltransferases DNMT3 and DNMT1, respectively, whereas the CMT3 protein is unique to the plant kingdom. There are currently two classes of DNA methyltransferases: 1) a de novo class of enzymes that create new methylation marks on the DNA; 2) a maintenance class that recognizes the methylation marks on the parental strand of DNA and transfers new methylation to the daughter strands after DNA replication. DRM2 is the only enzyme that has been implicated as a de novo DNA methyltransferase. DRM2 has also been shown, along with MET1 and CMT3, to be involved in maintaining methylation marks through DNA replication.[76] Other DNA methyltransferases are expressed in plants but have no known function (see the Chromatin Database).
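The maintenance idea can be sketched for the simplest case, symmetric CG sites, where the cytosine on the new strand sits opposite the G that follows a methylated parental cytosine. This is a deliberately reduced toy model: the function name, the coordinate convention (all positions given in top-strand coordinates) and the example sequence are assumptions made for illustration, and it says nothing about how real enzymes handle CHG or CHH sites.

```python
def maintain_cg_methylation(top_seq, top_methylated):
    """Copy CG methylation marks from the parental (top) strand to the new strand.

    top_methylated holds indices of methylated cytosines on the top strand.
    The return value holds top-strand indices whose paired base on the new
    bottom strand is a cytosine that receives a methyl group.
    """
    new_strand_marks = set()
    for i in sorted(top_methylated):
        if top_seq[i] != "C":
            raise ValueError("index %d is not a cytosine" % i)
        # In a CG site the G at i + 1 pairs with a C on the opposite strand,
        # so the hemimethylated site can be recognised and completed.
        if i + 1 < len(top_seq) and top_seq[i + 1] == "G":
            new_strand_marks.add(i + 1)
        # Marks outside CG have no symmetric partner in this toy model,
        # so they are not copied here.
    return new_strand_marks


# Hypothetical parental strand with methylated cytosines at 1 (CG), 4 (CHG) and 10 (CG).
parental = "ACGTCAGCTTCG"
print(sorted(maintain_cg_methylation(parental, {1, 4, 10})))
```

In this sketch the mark at index 4 is lost after replication, which illustrates why non-symmetric marks would need some other mechanism to persist.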


Genome-wide levels of DNA methylation vary widely between plant species, and Arabidopsis cytosines tend to be less densely methylated than those in other plants. For example, ~92.5% of CpG cytosines are methylated in Beta vulgaris.[77] The patterns of methylation also differ between cytosine sequence contexts; universally, CpG methylation is higher than CHG and CHH methylation, and CpG methylation can be found in both active genes and transposable elements, while CHG and CHH are usually relegated to silenced transposable elements.[78][74]


It is not clear how the cell determines the locations of de novo DNA methylation, but evidence suggests that for many (though not all) locations, RNA-directed DNA methylation (RdDM) is involved. In RdDM, specific RNA transcripts are produced from a genomic DNA template, and this RNA forms secondary structures called double-stranded RNA molecules.[79] The double-stranded RNAs, through either the small interfering RNA (siRNA) or microRNA (miRNA) pathways, direct de novo DNA methylation of the original genomic location that produced the RNA.[79] This sort of mechanism is thought to be important in cellular defense against RNA viruses and/or transposons, both of which often form a double-stranded RNA that can be mutagenic to the host genome. Methylation of their genomic locations, through an as yet poorly understood mechanism, shuts them off so that they are no longer active in the cell, protecting the genome from their mutagenic effect. More recently, DNA methylation was described as the main determinant of embryogenic culture formation from explants in woody plants, and it is regarded as the main mechanism explaining the poor response of mature explants to somatic embryogenesis in these plants (Isah 2016).

In fungi

Many fungi have low levels (0.1 to 0.5%) of cytosine methylation, whereas other fungi have as much as 5% of the genome methylated.[88] This value seems to vary both among species and among isolates of the same species.[89] There is also evidence that DNA methylation may be involved in state-specific control of gene expression in fungi. However, ultra-high-sensitivity mass spectrometry with a detection limit of 250 attomoles did not confirm DNA methylation in single-celled yeast species such as Saccharomyces cerevisiae or Schizosaccharomyces pombe, indicating that these yeasts do not possess this DNA modification.[9]: abstract


Although brewers' yeast (Saccharomyces), fission yeast (Schizosaccharomyces), and Aspergillus flavus[90] have no detectable DNA methylation, the model filamentous fungus Neurospora crassa has a well-characterized methylation system.[91] Several genes control methylation in Neurospora, and mutation of the DNA methyltransferase gene dim-2 eliminates all DNA methylation but does not affect growth or sexual reproduction. While the Neurospora genome has very little repeated DNA, half of the methylation occurs in repeated DNA, including transposon relics and centromeric DNA. The ability to evaluate other important phenomena in a DNA methylase-deficient genetic background makes Neurospora an important system in which to study DNA methylation.

In other eukaryotes

DNA methylation is largely absent from Dictyostelium discoideum,[92] where it appears to occur at about 0.006% of cytosines.[6] In contrast, DNA methylation is widely distributed in Physarum polycephalum,[93] where 5-methylcytosine makes up as much as 8% of total cytosine.[4]

Detection

DNA methylation can be detected by the following assays currently used in scientific research:[99]

Mass spectrometry is a very sensitive and reliable analytical method to detect DNA methylation. In general, however, MS is not informative about the sequence context of the methylation and is thus limited in studying the function of this DNA modification.

Methylation-Specific PCR (MSP), which is based on a chemical reaction of sodium bisulfite with DNA that converts unmethylated cytosines of CpG dinucleotides to uracil (CpG to UpG), followed by traditional PCR.[100] Methylated cytosines are not converted in this process, and primers are designed to overlap the CpG site of interest, which allows the methylation status to be determined as methylated or unmethylated.

Whole genome bisulfite sequencing, also known as BS-Seq, which is a high-throughput genome-wide analysis of DNA methylation. It is based on the aforementioned sodium bisulfite conversion of genomic DNA, which is then sequenced on a next-generation sequencing platform. The sequences obtained are then re-aligned to the reference genome to determine the methylation status of CpG dinucleotides based on mismatches resulting from the conversion of unmethylated cytosines into uracil.

Enzymatic methyl-seq (EM-seq) works similarly to bisulfite sequencing, but uses enzymes, APOBEC and TET2, to deaminate unmethylated cytosine into uracil prior to sequencing. EM-seq libraries are less prone to DNA damage than bisulfite-treated libraries.[101]

Reduced representation bisulfite sequencing, also known as RRBS, for which several working protocols exist. The first RRBS protocol was called RRBS and aims for around 10% of the methylome; a reference genome is needed. Later protocols were able to sequence a smaller portion of the genome and allowed higher sample multiplexing. EpiGBS was the first protocol in which 96 samples could be multiplexed in one lane of Illumina sequencing and in which a reference genome was no longer needed. De novo reference construction from the Watson and Crick reads made it possible to screen populations for SNPs and SMPs simultaneously.

The HELP assay, which is based on restriction enzymes' differential ability to recognize and cleave methylated and unmethylated CpG DNA sites.

GLAD-PCR assay, which is based on a new type of enzyme, site-specific methyl-directed DNA endonucleases, which hydrolyze only methylated DNA.

ChIP-on-chip assays, which are based on the ability of commercially prepared antibodies to bind to DNA methylation-associated proteins like MeCP2.

Restriction landmark genomic scanning, a complicated and now rarely used assay based upon restriction enzymes' differential recognition of methylated and unmethylated CpG sites; the assay is similar in concept to the HELP assay.

Methylated DNA immunoprecipitation (MeDIP), analogous to chromatin immunoprecipitation: immunoprecipitation is used to isolate methylated DNA fragments for input into DNA detection methods such as DNA microarrays (MeDIP-chip) or DNA sequencing (MeDIP-seq).

Pyrosequencing of bisulfite-treated DNA. This is the sequencing of an amplicon made with a normal forward primer and a biotinylated reverse primer to PCR the gene of choice. The pyrosequencer then analyses the sample by denaturing the DNA and adding one nucleotide at a time to the mix according to a sequence given by the user. If there is a mismatch, it is recorded and the percentage of DNA for which the mismatch is present is noted. This gives the user a percentage of methylation at each CpG site analysed.

Molecular break light assay for DNA adenine methyltransferase activity, an assay that relies on the specificity of the restriction enzyme DpnI for fully methylated (adenine methylation) GATC sites in an oligonucleotide labeled with a fluorophore and quencher. The adenine methyltransferase methylates the oligonucleotide, making it a substrate for DpnI. Cutting of the oligonucleotide by DpnI gives rise to a fluorescence increase.[102][103]

Methyl Sensitive Southern Blotting is similar to the HELP assay, but uses Southern blotting techniques to probe gene-specific differences in methylation using restriction digests. This technique is used to evaluate local methylation near the binding site for the probe.

MethylCpG Binding Proteins (MBPs) and fusion proteins containing just the Methyl Binding Domain (MBD) are used to separate native DNA into methylated and unmethylated fractions. The percentage methylation of individual CpG islands can be determined by quantifying the amount of the target in each fraction. Extremely sensitive detection can be achieved in FFPE tissues with abscription-based detection.

High Resolution Melt Analysis (HRM or HRMA), a post-PCR analytical technique. The target DNA is treated with sodium bisulfite, which chemically converts unmethylated cytosines into uracils, while methylated cytosines are preserved. PCR amplification is then carried out with primers designed to amplify both methylated and unmethylated templates. After this amplification, amplicons derived from highly methylated DNA retain more cytosines than those derived from unmethylated templates, which results in a different melting temperature that can be used in quantitative methylation detection.[104][105]

Ancient DNA methylation reconstruction, a method to reconstruct high-resolution DNA methylation from ancient DNA samples. The method is based on the natural degradation processes that occur in ancient DNA: with time, methylated cytosines are degraded into thymines, whereas unmethylated cytosines are degraded into uracils. This asymmetry in degradation signals was used to reconstruct the full methylation maps of the Neanderthal and the Denisovan.[106] In September 2019, researchers published a novel method to infer morphological traits from DNA methylation data. The authors were able to show that linking down-regulated genes to phenotypes of monogenic diseases, where one or two copies of a gene are perturbed, allows for ~85% accuracy in reconstructing anatomical traits directly from DNA methylation maps.[107]

Methylation Sensitive Single Nucleotide Primer Extension Assay (msSNuPE), which uses internal primers annealing straight 5' of the nucleotide to be detected.[108]

Illumina Methylation Assay measures locus-specific DNA methylation using array hybridization. Bisulfite-treated DNA is hybridized to probes on "BeadChips." Single-base extension with labeled probes is used to determine methylation status of target sites.[109] In 2016, the Infinium MethylationEPIC BeadChip was released, which interrogates over 850,000 methylation sites across the human genome.[110]
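Several of the assays listed in this section rest on the same bisulfite logic: unmethylated cytosines are read as thymine after conversion and amplification, while methylated cytosines are still read as cytosine. The sketch below illustrates that logic on toy data. It assumes reads that are already aligned to the reference, have the same length as it and come from the top strand only; the function names and sequences are hypothetical and do not correspond to any particular tool.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate conversion: unmethylated C reads as T, methylated C stays C."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def methylation_levels(reference, reads):
    """Per-cytosine methylation level, computed as C / (C + T) over the reads."""
    levels = {}
    for i, ref_base in enumerate(reference):
        if ref_base != "C":
            continue
        c_count = sum(1 for read in reads if read[i] == "C")
        t_count = sum(1 for read in reads if read[i] == "T")
        if c_count + t_count > 0:
            levels[i] = c_count / (c_count + t_count)
    return levels


# Hypothetical reference with CpG cytosines at indices 1 and 4.
reference = "ACGTCGA"

# Three source molecules: all methylated at index 1, only one at index 4.
molecules = [{1, 4}, {1}, {1}]
reads = [bisulfite_convert(reference, marks) for marks in molecules]

print(reads)
print(methylation_levels(reference, reads))
```

A mismatch against the reference (T where the reference has C) is therefore the signal for an unmethylated site, which is why these methods need either a reference genome or a de novo reference.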

Differentially methylated regions (DMRs)

Differentially methylated regions, which are genomic regions with different methylation statuses among multiple samples (tissues, cells, individuals or others), are regarded as possible functional regions involved in gene transcriptional regulation. The identification of DMRs among multiple tissues (T-DMRs) provides a comprehensive survey of epigenetic differences among human tissues.[111] For example, methylated regions that are unique to a particular tissue make it possible to differentiate between tissue types, such as semen and vaginal fluid. Research by Lee et al. showed that T-DMRs in DACT1 and USP49 positively identified semen.[112] T-DMRs have proven useful in the identification of various body fluids found at crime scenes, and researchers in the forensic field are seeking novel T-DMRs in genes to use as markers in forensic DNA analysis. DMRs between cancer and normal samples (C-DMRs) demonstrate the aberrant methylation in cancers.[113] It is well known that DNA methylation is associated with cell differentiation and proliferation.[114] Many DMRs have been found across developmental stages (D-DMRs)[115] and during the reprogramming process (R-DMRs).[116] In addition, there are intra-individual DMRs (Intra-DMRs), which show longitudinal changes in global DNA methylation with increasing age in a given individual.[117] There are also inter-individual DMRs (Inter-DMRs), which show different methylation patterns among multiple individuals.[118]


QDMR (Quantitative Differentially Methylated Regions) is an approach that quantifies methylation differences and identifies DMRs from genome-wide methylation profiles by adapting Shannon entropy.[119] The platform-free and species-free nature of QDMR makes it potentially applicable to various methylation data, and it provides an effective tool for the high-throughput identification of functional regions involved in epigenetic regulation across multiple samples.[120]
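The intuition behind an entropy-based score can be shown with plain Shannon entropy. The sketch below is not the QDMR formula, whose published adaptation may differ in detail; it only assumes that a region's methylation levels across samples are normalized into a distribution, so that a region methylated in a single sample scores near zero and a region with equal levels in all samples scores the maximum, log2 of the number of samples. The function name and example values are hypothetical.

```python
import math


def methylation_entropy(levels):
    """Shannon entropy (in bits) of one region's methylation levels across samples."""
    n = len(levels)
    if n == 0:
        raise ValueError("at least one sample is required")
    total = sum(levels)
    if total == 0:
        # Assumption of this sketch: no methylation anywhere counts as uniform.
        return math.log2(n)
    entropy = 0.0
    for level in levels:
        p = level / total
        if p > 0:
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical regions measured in four samples (methylation level between 0 and 1).
uniform_region = [0.8, 0.8, 0.8, 0.8]
sample_specific_region = [0.9, 0.0, 0.0, 0.0]

print(methylation_entropy(uniform_region))
print(methylation_entropy(sample_specific_region))
```

Regions could then be ranked by ascending entropy to shortlist candidate DMRs. Note that plain normalized entropy cannot distinguish uniformly high from uniformly low methylation, which is one reason a real method needs further adjustments.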


Gene-set analysis (a.k.a. pathway analysis; usually performed with tools such as DAVID, GoSeq or GSEA) has been shown to be severely biased when applied to high-throughput methylation data (e.g. MeDIP-seq, MeDIP-chip, HELP-seq etc.), and a wide range of studies have thus mistakenly reported hyper-methylation of genes related to development and differentiation. It has been suggested that this bias can be corrected using sample label permutations or a statistical model that controls for differences in the number of CpG probes / CpG sites that target each gene.[121]

Computational prediction

DNA methylation can also be predicted by computational models. Such models can facilitate the global profiling of DNA methylation across chromosomes, and are often faster and cheaper to run than biological assays. Examples include those of Bhasin et al.,[130] Bock et al.,[131] and Zheng et al.[132][133] Together with biological assays, these methods greatly facilitate DNA methylation analysis.

See also

5-Hydroxymethylcytosine
5-Methylcytosine
7-Methylguanosine
Decrease in DNA Methylation I (DDM1), a plant methylation gene
Demethylating agent
Differentially methylated regions
DNA demethylation
DNA methylation reprogramming
Epigenetics, of which DNA methylation is a significant contributor
Epigenetic clock, a method to calculate age based on DNA methylation
Epigenome
Genome
Genomic imprinting, an inherited repression of an allele, relying on DNA methylation
MethBase, DNA Methylation database hosted on the UCSC Genome Browser
MethDB, DNA Methylation database
N6-Methyladenosine
Protein methylation

External links

DNA Methylation at the U.S. National Library of Medicine Medical Subject Headings (MeSH)
ENCODE threads explorer: Non-coding RNA characterization. Nature (journal)
PCMdb: Pancreatic Cancer Methylation Database
SMART: Specific Methylation Analysis and Report Tool
Human Methylation Mark Atlas
DiseaseMeth: Human disease methylation database
EWAS Atlas: A knowledgebase of epigenome-wide association studies