A synonymous change, p.Gly16Gly in MECP2 Exon 1, causes a cryptic splice event in a Rett syndrome patient

Background Mutations in MECP2 are the main cause of Rett Syndrome. To date, no pathogenic synonymous MECP2 mutation has yet been identified. Here, we investigated a de novo synonymous variant c.48C>T (p.Gly16Gly) identified in a girl presenting with a typical RTT phenotype. Methods In silico analyses to predict the effects of sequence variation on mRNA splicing were employed, followed by sequencing and quantification of lymphocyte mRNAs from the subject for splice variants MECP2_E1 and MECP2_E2. Results Analysis of mRNA confirmed predictions that this synonymous mutation activates a splice-donor site at an early position in exon 1, leading to a deletion (r.[=, 48_63del]), codon frameshift and premature stop codon (p.Glu17Lysfs*16) for MECP2_E1. For MECP2_E2, the same premature splice site is used, but as this is located in the 5′untranslated region, no effect on the amino acid sequence is predicted. Quantitative analysis that specifically measured this cryptic splice variant also revealed a significant decrease in the quantity of the correct MECP2_E1 transcript, which indicates that this is the etiologically significant mutation in this patient. Conclusion These findings suggest that synonymous variants of MECP2 as well as other known disease genes—and de novo variants in particular— should be re-evaluated for potential effects on splicing.


Introduction
Rett syndrome (RTT; MIM#312750) is an X-linked neurological disorder which leads to gradual slowing of neurodevelopment in females. Clinical features and symptoms may include microcephaly, repetitive hand movement, scoliosis, constipation, excessive saliva, intellectual disability (ID), and typically little or no verbal skills.
Methyl-CpG binding protein 2 (MeCP2; MIM 300005) [1] located at Xq28, was identified as the gene responsible for RTT [2]. Initially, MeCP2 was identified due to its selective binding (via a methyl-CpG binding domain (MBD)) to DNA sequences that are methylated at cytosine in the dinucleotide CpG [3]. Subsequently, other highly conserved and functionally relevant domains, namely the transcriptional repression domain (TRD) and nuclear localization signal (NLS) were also identified [4,5] (Figure 1). Other functionally less well characterized domains include a C-terminal domain (CTD), which putatively binds to histone [6]. The MBD of MeCP2 does not only recognize methylated DNA, but can also bind unmethylated DNA. Additional unmethylated DNA binding sites are present between the MBD and TRD, and are known as the intervening domain (ID) (Figure 1) [7].
The canonical version of MECP2 gene consists of 4 coding exons. The translation START site is in exon 2, and exon 1 and most of exon 2 are within the 5′untranslated region (UTR). However, a splice variant of MECP2 was discovered later, namely MECP2_E1, in which exon 2 is spliced out and translation is initiated from a START codon in exon 1, thus resulting in a slightly larger protein with a different N-terminal [9,10] (Figure 1). The two isoforms are identical for the remainder of the protein, and both contain the MBD and TRD (Figure 1). Several studies suggested that MeCP2_E1 is the predominant isoform in brain tissues, with 10-fold higher expression [9,10]. MECP2_E1 is not only transcribed at much higher levels in the brain than the canonical splice variant, MECP2_E2, but is also translated more efficiently. The presence of the upstream open reading frame appears to inhibit the efficient translation of MeCP2_E2 [10]. The N-terminal region encoded by MECP2_E1 is highly conserved across all vertebrate groups, and is identical among many mammalian species [11].
There are currently over 250 known MECP2 gene mutations that cause RTT [12]. Over 80% of patients with RTT have mutations in exons 3 or 4 of MECP2. The identification of disease-relevant mutations in exon 1 led to the likelihood that MeCP2_E1 is the etiologically relevant protein isoform for Rett [9,13,14], later confirmed by studies of mouse knockouts specific to isoform MeCP2_e2 [15], and has also led to the inclusion of exon 1 in diagnostic sequencing for Rett syndrome.
Here, we have identified a novel synonymous mutation in the coding portion of exon1 of MECP2_E1 at position c.48C>T. This mutation was initially characterized as a silent mutation as it changes the 16 th codon from GGC to GGU/T, both of which code for glycine, i.e. p.Gly16Gly. In silico analysis of this synonymous variant predicts potential activation of a cryptic mRNA splice donor site upstream of the exon 1 splice donor site, which could result in a frameshift in the mRNA during protein translation and hence may ultimately result in protein truncation ( Figure 2). Here, we present molecular evidence through analysis of mRNA sequence and mRNA quantization, confirming the activation of this cryptic splice donor.

Patient ascertainment/assessment
The proband was ascertained and assessed through the Clinical Genetics Department at the Naval Medical Center San Diego, and recruited into a study of MECP2 based at the Campbell Family Mental Health Research Institute, Centre for Addiction & Mental Health, Toronto. Institutional research ethics approval was obtained for this study, and written consent for the study was given by the proband's parents. Recruitment of control subjects for mRNA quantification analysis has been described previously [16].

Bioinformatic analysis
Potential splice sites were predicted using the online neural network tool at the Berkeley Drosophila Genome Project (BDGP) (http://www.fruitfly.org/seq_tools/splice. html), also using Human Splicing Finder (HSF; www. umd.be/HSF/) and MaxEntScan (genes.mit.edu/burgelab/ maxent/Xmaxentscan_scoreseq.html). ExPASy Swiss institute of bioinformatics (http://web.expasy.org/translate) online translate tool was used to identify the potential effect of this splice site mutation on the reading frame of the mutated version of protein.

Cell culture and RNA preparation
Total RNA from the patient was extracted from a) lymphocytes drawn into Tempus™ tubes (Invitrogen Life Technologies), following the manufacturer's protocol for PureLink® RNA Mini Kit (Invitrogen Life Technologies), and b) from Epstein-Barr Virus (EBV)-transformed blood cells (lymphoblasts), grown in RPMI medium supplemented with 15% Fetal Bovine Serum (FBS) and 1X penicillin streptomycin, with RNA extracted using Trizol method (all from Invitrogen Life Technologies).

Reverse transcriptase polymerase chain reaction (RT-PCR) & Sanger Sequencing
RT-PCR analysis targeting the N-terminal coding regions for MECP2_E1 and MECP2_E2 was performed using sets of oligonucleotides designed with Primer Express 3.0 software (Applied Biosystems, Foster City, CA USA) (See Table 1). First strand cDNA was synthesized using Superscript III (Invitrogen) from RNA treated with DNase I (Fermentas). Following RT-PCR amplification across MECP2 exons 1 to 3, sequencing analysis of the gel-eluted product (Qiagen) was performed at The Centre for Applied Genomics (www.tcag.ca) using gene specific primers ( Figure 2).

TA cloning, colony screening and high resolution gel electrophoresis
Gel-eluted PCR product obtained from the previous step was cloned into the pDRIVE vector according to the manufacturer's instruction (Qiagen). Selection of recombinant colonies was primarily done using α-complementation of the β-galactosidase gene with isopropyl β-D-1-thiogalactopyranoside (IPTG) supplemented LB agar. Colony-PCR of individually picked bacterial colonies was carried out using T7 and SP6 as forward and reverse primers. PCR-amplification with forward primer 5′-CCGAGCG GAGGAGGAGGAGG-3′ from exon 1 and reverse primer 5′-TGCTTGCCCTCTTTCTCTTC-3′ (exon 3) was followed by 2% high resolution agarose gel electrophoresis at 120V for 1 hour. PCR products were seen at the expected sizes, i.e. 140 bp for the normal transcript TA clone, and 124 bp for the mutated transcript TA clone, alongside 50 bp DNA ladder (Fermentas) (Figure 2). The MECP2 exon 1-3 boundary sequence for the WT mRNA is indicated with a blue arrow, whereas MECP2 exon 1-3 boundary sequence of the Mut mRNA is indicated with a red arrow. The 16bp sequence that is deleted in the Mut transcript between the cryptic splice junction and correct splice junction is indicated within the WT sequence using red brackets. D) Illustration of the predicted amino acid sequence of the MeCP2_E1 WT (green) and Mut (orange). The change of Glu at position 17 into Lys followed by a frameshift and introduction of a stop codon after 16 amino acids is indicated.
5′ nuclease assay for relative mRNA quantification (qRT-PCR) Quantification of mutant and wild type transcripts was carried out using custom designed TaqMan® qRT-PCR gene expression assays ( Figure 3B), using cDNA from the proband and three healthy female control individuals. Applied Biosystems® TaqMan® Universal PCR Master Mix, with optimal amplification conditions were used for all assays. All samples were analyzed in triplicate (three biological replicates) and normalized using phosphoglycerate kinase 1 gene (PGK1) as an endogenous control. All assays were designed according to the manufacturer's instructions and the Primer Express 3.0 software tool (Applied Biosystems, Foster City, CA USA).

Data analysis
Quantitative analyses of the data were carried out using ViiA™ 7 System SDS software (Applied Biosystems, Foster city USA). Greater than 36 Cycle threshold (Ct) value (from a total of 40 cycles of amplification) were considered as undetected [17]. Fold-differences in gene expression were calculated using the comparative cycle threshold method (ΔΔ Ct method) [17,18]. Statistical analysis was performed to calculate standard error of means (S.E.M) and standard deviations (SD) using Microsoft Excel tool for descriptive statistics.

Results
The clinical details of the patient are reported here:

Patient 1 information:
The patient is a five-year three-month-old female born of non-consanguineous parents. She has a younger brother who is developing normally. Her parents are both healthy, cognitively normal, without history of seizures. The  patient was born at full term after an uncomplicated pregnancy. Birth weight was 50th centile, and head circumference, 25th centile. Concerns about her development arose around nine months of age when her progress seemed to slow, although without suggestion of regression. She sat at six months, walked at 18 months, began babbling at nine months of age but speech never progressed beyond that, and babbling has mostly disappeared. At two years of age she began having seizures. Cranial computed tomography (CT) and magnetic resonance imaging (MRI) at that time were normal. Electroencephalography (EEG) showed multifocal sharp waves. She developed multiple seizure types over time; tonic-clonic, complex partial, absence, and atonic seizures. Brain MRI was repeated at 4.5 years to include spectroscopy and was also normal. In terms of the revised diagnostic criteria for RTT [19], she meets all four main criteria (loss of acquired purposeful hand skills, loss of acquired spoken language, gait abnormaility and sterotpical hand movelments) and at least seven out of 11 of the supportive criteria (breathing disturbance while awake, bruxism while awake, impaired sleep pattern, abnormal muscle tone, growth retardation, diminished response to pain and intense eye communication ("eye pointing"). Hands and feet were cold, although not small. Thus she meets criteria for typical or classic RTT in all except that regression is still ongoing, and a period of recovery or stabilization has not yet been established. Thus we have designated her with a diagnosis of typical Rett syndrome.

Molecular and bioinformatics analyses:
Metabolic evaluations and karyotyping were normal. Comparative genomic hybridization identified a 560 Kb (chr14: 37,627,931-38,187,142; hg19) duplication at 14q21.1 which contains two genes; mirror-image polydactyly gene 1 (MIPOL1) which is associated with mirror image polydactyly, and Forkhead box protein A1 (FOXA1) which is a transcription factor found to be overexpressed in some tumor types. This rare gain copy number variant was also present in the father's DNA, thus excluding non-paternity. Molecular evaluations have also included normal Angelman syndrome methylation analysis, ubiquitinprotein ligase E3A (UBE3A) sequence analysis, and screening using a mitochondrial DNA point mutation panel.
MECP2 sequence analysis identified a heterozygous single nucleotide substitution from C→T in MECP2 exon 1 at nucleotide position 48 of the MECP2_E1 coding sequence (c.48C>T). This substitution was not present in either parent, and is likely to be de novo. The variant has not been reported in large scale whole exome or genome sequencing projects (1000 Genomes (phase 1 integrated release) and the NHLBI Exome Sequencing Project (ESP6500 data release)). The substitution is at the 3 rd (wobble) position of a glycine codon, and does not alter the coded amino acid (i.e. a synonymous change). According to the BDGP splice site prediction, the donor score for nucleotide positions c.47-48 is altered from <0.01 (WT) to 0.85 (Mutant; Figure 2A). Analysis using other algorithms supports this (Additional file 1: Table S1). The agarose gel electrophoresis of PCR amplified cDNA shows two bands for the MECP2_E1 transcript for the affected individual. Upon separation of the mutated version from the normal transcript by cloning into a TA cloning vector, followed by sequencing analysis, it was revealed that the transcript resulting from c.48C>T lacks 16 nucleotides at the 3′end of exon 1 (r.[=, 48_63del]) ( Figure 2B, C), indicating that c.48C>T results in a premature splicing event, as predicted. Although it splices with the correct acceptor site at intron 2/exon 3, due to the 16 nt deletion in exon 1 it is predicted to result in a frameshift event, changing glutamic acid (Glu) at position 17 into lysine (Lys), and leading to a truncated protein with a stop codon after 16 amino acids (p.Glu17Lysfs*16) ( Figure 2D).
Analysis of the effect of c.48C>T on splicing for the MECP2_E2 splice variant showed similar results, with the last 16 nt of exon 1 missing, but the correct usage of the splice acceptor at the intron 1/exon 2 boundary. As this deletion lies within 5′ untranslated region (UTR), it is not predicted to affect the 486 amino acid open reading frame of MECP2_E2, however, bioinformatic analysis shows that it results in an upstream open reading frame (ORF) 27 amino acids in length. This may putatively reduce the efficiency of translation of the full length ORF.
Further, quantification of the mutated and wild type mRNA transcripts showed a significant difference in the transcription levels. Approximately two-fold average decrease in expression of the WT MECP2_E1 transcript (MECP2_E1 NL) was observed in Patient 1, whereas mutated transcripts (MECP2_E1 MT) were not detected (ND) in the normal samples. Relative expression of the MECP2_E2 isoform was also quantified. An average of~1.4-fold increase in the expression of the WT MECP2_E2 transcripts was observed in Patient 1 ( Figure 3A).
No mutant c.48C>T alleles from the patient showing the correct splicing at the canonical exon 1 splice donor site were identified in either MECP2_E1 or E2 transcripts, and all transcripts that did show splicing at the canonical exon 1 donor site were derived from the WT allele.

Discussion
Synonymous mutations are generally considered as nonpathogenic, and are not expected to change the protein's function. This paradigm has been challenged in recent years, with evidence that changes in codon usage may have consequences for the efficiency and speed of translation, that, in turn, may then affect protein folding and function [20,21]. In addition, the possible pathogenic mechanism of single nucleotide substitutions at codon wobble sites creating aberrant splice sites, is frequently overlooked, but while there are a number of reports with evidence for this at missense (non-synonymous) sites, there are fewer reports of such a mechanism in human disease involving synonymous changes [22]. To date, no synonymous mutation in MECP2 has been proven to be pathogenic.
In this report, we have described the functional consequences of a synonymous mutation in MECP2 exon 1, c.48C>T in a girl with a typical, albeit relatively mild form of RTT. This C>T transition changes the codon GGC into GGT, coding for the same amino acid i.e. glycine (p.Gly16Gly) (Figure 2A). At the transcriptional level this silent change clearly affects the gene expression by introducing a premature splice-donor site, resulting in the removal of 16nt of coding sequence from MECP2_E1 transcripts r. [=, 48_63del], This causes a frameshift, and premature truncation is predicted after 16 amino acid residues (KKSQKTSRTNPSSLKR*) (p.Glu17Lysfs*16) ( Figure 2D). The effect of the mutation is similar for splice variant MECP2_E2, in that the last 16nt of exon 1 are removed, however, as this lies within the 5′UTR, there is no predicted effect on the protein. However, as the frame in the 5′UTR changes with the mutation and 16nt deletion, this may have an effect on the translation efficiency of MECP2_E2. The number and position of stop codons upstream to the translation start site remains unchanged, but while WT MECP2_E2 has no apparent upstream ORF, the mutant form does have a 27 amino acid upstream ORF, which may compete for translational machinery with the correct ORF beginning in exon 2. Thus, while reduction in MeCP2_E1 protein levels is the most likely etiologically relevant consequence of the mutation in Patient 1, we cannot exclude that a reduction in MeCP2_E2 protein levels caused by translational interference is also contributing to the phenotype.
Reduced MeCP2 expression is likely one of the main pathogenic mechanisms in RTT [23]. Quantitative analysis of MECP2_E1 and E2 mRNA for Patient 1 suggest that this aberrant splice event is specific to the patient, and occurs in roughly half of the transcripts during post-transcriptional processing. A modest increase in MECP2_E2 transcripts were observed in Patient 1 (Figure 3). It is possible that removal of the 16nt at the end of exon 1 may remove inhibitory sequences, permitting increased transcription of the E2 mRNA, which could possibly be partially compensating for the loss of MECP2_E1, resulting in a milder RTT phenotype. Also, skewed X-inactivation may favor the WT allele in Patient 1, however information on X-inactivation was not available for this patient.
In summary, we have found a synonymous substitution in MECP2 exon 1 coding region, which results in a splicing defect, which is predicted to lead to a truncated protein. We cannot, however, exclude the possibility that other mechanisms are also involved, for instance the possible effect on translation timing and efficiency (and thus protein folding and function) of the switch in codon usage at Gly16 from a high to low frequency codon [20,21], or a contributory effect of translational competition for the MECP2_E2 splice variant from the mutated allele. We recommend the re-evaluation of all de novo synonymous substitutions in MECP2. In particular, through in silico analysis of all silent changes in MECP2 reported on the Rettbase website and in the NHLBI ESP6500 exome sequence database, we found that the change c.627G>A (p.Val209Val) increases the donor splice site prediction score (Splice Site Prediction by Neural Network [24]) from 0.36 to 0.91 (out of 1.0), c.948C>G (Val316Val) changes the score from 0.09 to 0.42., and c.999G>T (Gly333Gly) changes the score from 0 to 0.59. These three substitutions should clearly be re-evaluated for possible MECP2 mRNA splicing aberrations. Also, we recommend that algorithms used in the analysis of next generation sequence data be updated to predict the effect on splicing machinery at exonic sites away from the known splice donor and acceptor sites.