Mutation spectrum of MLL2 in a cohort of Kabuki syndrome patients

Background Kabuki syndrome (Niikawa-Kuroki syndrome) is a rare, multiple congenital anomalies/mental retardation syndrome characterized by a peculiar face, short stature, skeletal, visceral and dermatoglyphic abnormalities, cardiac anomalies, and immunological defects. Mutations in the histone methyltransferase gene MLL2 were recently identified as its underlying cause. Methods Genomic DNA was extracted from 62 index patients clinically diagnosed with Kabuki syndrome. Sanger sequencing was performed to analyze the whole coding region of the MLL2 gene, including intron-exon junctions. The putative causal role and possible functional effect of each nucleotide variant identified were estimated with in silico prediction tools. Results We identified 45 patients with MLL2 nucleotide variants. Thirty-eight of the 42 distinct variants had not been described before. Consistent with previous reports, the majority are nonsense or frameshift mutations predicted to generate a truncated polypeptide. We also identified 3 indel, 7 missense and 3 splice site variants. Conclusions This study emphasizes the relevance of mutational screening of the MLL2 gene among patients diagnosed with Kabuki syndrome. The identification of a large spectrum of MLL2 mutations offers the opportunity to improve current knowledge of the clinical basis of this multiple congenital anomalies/mental retardation syndrome, to design functional studies to understand the molecular mechanisms underlying the disease, to establish genotype-phenotype correlations, and to improve clinical management.


Background
Kabuki syndrome (KS, MIM #147920), also known as Niikawa-Kuroki syndrome, is a rare, multiple congenital anomalies/mental retardation syndrome characterized by a peculiar face, which is defined by long palpebral fissures with eversion of the lateral third of the lower eyelids, short columella with a broad and depressed nasal tip, prominent ears, and a cleft or high-arched palate. Additional features include short stature, skeletal, visceral and dermatoglyphic abnormalities, cardiac anomalies, and immunological defects [1,2]. Kabuki syndrome has a reported incidence of 1 in 32,000, which is likely a substantial underestimate [3]. The vast majority of reported cases are sporadic. After initial, controversial data that associated this condition with chromosomal rearrangements [4,5], mutations in the MLL2 gene were identified as the underlying cause of Kabuki syndrome in approximately 72% of affected individuals [6,7]. The encoded MLL2 protein is a member of the Mixed Lineage Leukemia (MLL) family of histone methyltransferases (HMTs). The MLL proteins (MLLs) are part of the SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) family of proteins [8]. The highly conserved SET domain of MLLs confers histone methyltransferase activity, which is the core function of HMTs. MLLs are important in the epigenetic control of active chromatin states [9]. They act as transcriptional co-activators and are involved in embryogenesis and development through, for example, regulation of the expression of the HOX genes and their interaction with nuclear receptors [10,11].
The MLL2 gene encodes a multi-domain protein of 5,537 amino acid residues that can methylate the Lys-4 position of histone H3 (H3K4), an epigenetic mark correlated with transcriptionally active chromatin [12,13]. MLL2 is involved in estrogen receptor α (ERα)-mediated signal transduction, acting as a coactivator within a complex that includes ASH2, RBQ3, and WDR5 [14].
In the present study, by direct sequencing of DNA samples from 62 Kabuki patients we identified 42 MLL2 variants, 38 of which are novel.

Subjects and Clinical Data
Our cohort comprised 62 index patients clinically diagnosed with Kabuki syndrome (Figure 1 and Table 1). Patients were enrolled after the physicians in charge had obtained appropriate informed consent and the respective local ethics committees had given their approval. Patients were included in this study if at least four of the following inclusion criteria were present: 1) long palpebral fissures with eversion of the lateral portion of the lower eyelid; 2) broad, arched eyebrows with sparseness; 3) short nasal columella with depressed nasal tip; 4) large, prominent or cupped ears; 5) developmental delay-mental retardation [15].
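The inclusion rule above (at least four of the five listed criteria) can be written as a simple check. This is only an illustrative sketch; the criterion labels and the function name are hypothetical and were not part of the study workflow.

```python
# Hypothetical short labels for the five inclusion criteria listed above.
CRITERIA = (
    "long_palpebral_fissures",
    "arched_sparse_eyebrows",
    "short_columella",
    "prominent_ears",
    "developmental_delay",
)


def meets_inclusion(findings, minimum=4):
    """Return True if at least `minimum` of the five criteria are present.

    `findings` maps a criterion label to True/False; missing labels
    are treated as absent.
    """
    present = sum(1 for criterion in CRITERIA if findings.get(criterion, False))
    return present >= minimum
```

A patient record with four positive findings would pass, while one with three would not.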

Sample preparation
Genomic DNA was extracted from fresh and/or frozen peripheral blood leukocytes of the probands and their available family members using an automated DNA extractor and commercial DNA extraction kits (EZ1, Qiagen, Hilden, Germany).

PCR-based sequencing of MLL2
Primers were designed using the Primer3 program (http://frodo.wi.mit.edu/primer3/) to amplify the 54 coding exons of the MLL2 gene (RefSeq NM_003482.3), including the flanking intronic sequences. Amplicons and primers were checked by both BLAST and BLAT against the human genome to ensure specificity. A complete list of primers is reported in Additional file 1, Table S1. The amplified products were subsequently purified and sequenced with a ready reaction kit (BigDye Terminator v1.1 Cycle Sequencing Kit, Applied Biosystems). The fragments obtained were purified using DyeEx plates (Qiagen) and resolved on an automated sequencer (ABI Prism 3130xl Genetic Analyzer). Sequences were analyzed using the Sequencher software (Gene Codes, Ann Arbor, Michigan). Whenever possible, the mutations identified were confirmed on a second independent blood sample. To assess whether the novel MLL2 missense alterations were causative mutations or neutral polymorphisms, we searched for them in dbSNP (http://www.ncbi.nlm.nih.gov/SNP); in addition, 100 alleles from healthy unrelated control subjects and the 1000 Genomes database [16] were screened to assess their presence or absence in the general population. All existing and new mutations were described following the recommendations of the Human Genome Variation Society (http://www.hgvs.org/mutnomen).
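The population screening described above amounts to keeping only those variants that are absent from every reference set. The following minimal sketch illustrates that logic; the variant identifiers and set contents are placeholders, not data from this study, and no database is actually queried.

```python
def absent_from_references(variant, reference_sets):
    """True if the variant is found in none of the reference sets."""
    return all(variant not in known for known in reference_sets.values())


def filter_candidates(variants, reference_sets):
    """Keep the variants that are absent from all reference sets."""
    return [v for v in variants if absent_from_references(v, reference_sets)]


# Placeholder identifiers, for illustration only.
example_references = {
    "dbSNP": {"var_B"},
    "controls": set(),
    "1000 Genomes": {"var_C"},
}
example_candidates = filter_candidates(["var_A", "var_B", "var_C"], example_references)
```

Here only `var_A` survives, because `var_B` and `var_C` each appear in one of the reference sets.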

Results and Discussion
Exome sequencing recently revealed that mutations in the histone methyltransferase MLL2 gene are a major cause of Kabuki syndrome [6]. In a collaborative effort involving institutes that, with one exception, are Italian, we enrolled 62 individuals with a clinical diagnosis of sporadic Kabuki syndrome (Table 1). By direct sequencing of all 54 exons of the MLL2 gene, we detected nucleotide variants in 73% of the patients (45/62); the vast majority of the variants are novel (90%, 38/42 different variants) (Figure 2, Additional file 1, Table S2, Additional file 2, Figure S1 and below) [6,7].
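The headline tallies can be cross-checked with a few lines of arithmetic. The sketch below uses only counts stated in this paper (45 of 62 patients with variants; 42 distinct variants, 38 of them novel; 29 truncating, 3 indel, 7 missense and 3 splice site variants); the variable and function names are ours.

```python
PATIENTS_TOTAL = 62
PATIENTS_WITH_VARIANT = 45
VARIANTS_TOTAL = 42
VARIANTS_NOVEL = 38

# Distinct variants by class, as reported in the text.
CLASS_COUNTS = {"truncating": 29, "indel": 3, "missense": 7, "splice_site": 3}


def percent(part, whole):
    """Percentage rounded to the nearest integer."""
    return round(100 * part / whole)


detection_rate = percent(PATIENTS_WITH_VARIANT, PATIENTS_TOTAL)
novel_rate = percent(VARIANTS_NOVEL, VARIANTS_TOTAL)
negative_patients = PATIENTS_TOTAL - PATIENTS_WITH_VARIANT
```

The class counts sum to the 42 distinct variants, and the 17 mutation-negative patients correspond to the remaining 27% of the cohort.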

Nonsense and frameshift mutations
In agreement with previous reports, we identified a majority of truncating mutations (69%, 29/42), three of  Table S2) [6,7]. Most of the variants are predicted, if translated, to encode shorter MLL2 proteins that lack all or part of the C-terminal region (Figure 2). This region harbors highly conserved domains that are found in a variety of chromatin-associated proteins [17-19]: (i) the helical LXXLL regions involved in the recruitment of the MLL2 complex to the promoters of ERα target genes (Figure 2); (ii) the FYRN and FYRC sequence motifs, two poorly characterized phenylalanine/tyrosine-rich regions of around 50 and 100 amino acids, respectively [20]; and (iii) a catalytic SET motif that confers histone methyltransferase activity.
Although this has not yet been experimentally verified for the MLL2 gene, the premature termination mutations that predominate in our cohort may result in partial degradation of the mutant transcripts through nonsense-mediated mRNA decay (NMD). NMD is an evolutionarily conserved process that typically degrades transcripts containing premature termination codons (PTCs) to prevent translation of unnecessary or aberrant transcripts [21]. NMD takes place when a PTC is located more than 50-55 nucleotides upstream of an exon-exon junction [22]. As 86% (25/29) of the truncating MLL2 mutations detected follow this rule, it is likely that the resulting MLL2 haploinsufficiency is the driving force for the onset of Kabuki syndrome.
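The positional rule quoted above can be expressed as a short predicate. This is a minimal sketch assuming that the PTC and the exon-exon junctions are given as positions on the spliced transcript; the function name, the default threshold of 55 nucleotides and the example coordinates are illustrative choices, not values taken from the MLL2 transcript.

```python
def predicted_nmd_target(ptc_position, junction_positions, min_distance=55):
    """Apply the 50-55 nucleotide rule to a premature termination codon.

    Returns True if at least one exon-exon junction lies more than
    `min_distance` nucleotides downstream of the PTC. Positions are
    coordinates on the spliced mRNA.
    """
    return any(
        junction - ptc_position > min_distance
        for junction in junction_positions
    )


# Illustrative coordinates only (not MLL2 positions).
example_junctions = [300, 900, 1500]
```

With these example junctions, a PTC at position 1000 would be predicted to trigger NMD, whereas a PTC at 1470 (30 nucleotides before the last junction) or at 1600 (in the last exon) would not.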

Indel variants
Our screen revealed three previously undescribed indel variants located in the C-terminal region of the protein (see samples KB71, KB77, and KB53 in Additional file 1, Table S2). They might have resulted from slipped mispairing between direct repeats or from the insertion or deletion of a single base within a mononucleotide tract (Additional file 1, Table S3), as already reported [23].
For the c.11819_11836dupTTCAACAACAGCAGCAGC (p.Lys3940_Gln3945dup) variant, the COILS algorithm predicted the expansion of one of the five putative coiled-coil domains (Figure 2 and data not shown), a domain type involved in protein-protein interaction. This variant was inherited from the apparently asymptomatic mother. It is thus impossible to conclusively determine the pathogenic nature of the resulting protein.
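As a consistency check on the duplication described above, the length of the duplicated segment can be compared with the protein-level description. The sketch below is illustrative only (the function name is ours); it checks the length and reading frame and does not attempt to translate the sequence.

```python
def describe_duplication(start, end, sequence):
    """Summarize a coding-sequence duplication given its c. coordinates.

    Returns the duplicated length, whether it preserves the reading
    frame, and the number of residues added when it does.
    """
    length = end - start + 1
    if length != len(sequence):
        raise ValueError("coordinates and sequence length disagree")
    in_frame = length % 3 == 0
    return {
        "length": length,
        "in_frame": in_frame,
        "residues_added": length // 3 if in_frame else None,
    }


summary = describe_duplication(11819, 11836, "TTCAACAACAGCAGCAGC")
```

The 18 duplicated nucleotides preserve the reading frame and add six residues, which matches the six-residue span Lys3940 to Gln3945 in the protein-level name.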

Splice site variants
We detected 3 variants located at splice site junctions, two of which are novel [7] (Additional file 1, Table S2); in silico modeling predicted complete or partial abrogation of junction formation, with a pathogenic impact. The c.400+1G>A variant, occurring within the invariant GT donor splice site in intron 3-4, disrupts the canonical splice site and is expected to produce an aberrant protein of only 135 residues. The c.401-3A>G variant, occurring 3 bp away from the next intron-exon junction, is predicted to create a new acceptor splice site at position -3 within intron 3-4 that could lead to a frameshift, encoding a mutant protein with a premature stop codon 84 codons downstream. Finally, c.13999+5G>A decreases the predicted donor site score, possibly resulting in less efficient intron splicing. Unfortunately, RNA from these patients was unavailable, preventing further investigation of the effect of these variants.
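The three variant names above encode their position relative to the nearest exon boundary, so the affected splice site can be read directly from the HGVS description. The following sketch parses simple intronic substitutions and reports whether the change falls in the canonical dinucleotide (positions +1/+2 or -1/-2). It is a simplified illustration with a hypothetical function name, handles only this one pattern, and is not a splice-prediction tool.

```python
import re

# Matches simple intronic substitutions such as "c.400+1G>A".
_INTRONIC_SUBSTITUTION = re.compile(r"^c\.(\d+)([+-])(\d+)([ACGT])>([ACGT])$")


def classify_splice_variant(hgvs):
    """Return (site, is_canonical) for an intronic substitution.

    A "+" offset lies at the 5' end of the intron (donor side); a "-"
    offset lies at its 3' end (acceptor side). Offsets 1 and 2 are the
    canonical GT/AG dinucleotide positions.
    """
    match = _INTRONIC_SUBSTITUTION.match(hgvs)
    if match is None:
        raise ValueError(f"unsupported variant description: {hgvs}")
    sign = match.group(2)
    offset = int(match.group(3))
    site = "donor" if sign == "+" else "acceptor"
    return site, offset <= 2


results = {v: classify_splice_variant(v)
           for v in ("c.400+1G>A", "c.401-3A>G", "c.13999+5G>A")}
```

Only c.400+1G>A falls within a canonical dinucleotide, which is consistent with it being described as disrupting the invariant GT donor site, while the other two lie in the less strictly conserved flanking positions.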

Missense variants
De novo missense variants have already been found in Kabuki patients. Ng and colleagues reported 8 pathogenic missense variants, two of which were recurrent in affected patients. As these all mapped within the last exon of MLL2, which encodes the different conserved C-terminal domains of the protein (see above), the authors suggested that such mutations are tolerated, while mutations elsewhere are lethal. By in silico analysis, Paulussen et al. proposed the pathogenicity of two missense variants located within that C-terminal region [6,7]. We detected seven patients carrying either a single missense variant or two missense variants (the latter in patients KB28 and KB38; Additional file 1, Table S2). PCR amplification, cloning and sequencing showed that, in both KB28 and KB38, the two sequence changes are located on the same allele. From a phenotypic point of view, the two patients with pairs of missense variants do not appear to be more severely affected than individuals with single variants. Sequencing of the corresponding exons in the parents of KB38 demonstrated that both variants arose de novo, while the KB28 patient inherited the variant from the apparently asymptomatic mother. We also had access to parental DNA for other carriers of missense variants (Additional file 1, Table S2). In both of these cases, however, the MLL2 variant was inherited from the apparently asymptomatic father. The missense variants are distributed across the entire length of the MLL2 gene (Figure 2). They were not found in 50 healthy unrelated control samples and were absent from the 1000 Genomes database [16]. The putative functional relevance and pathogenicity of these MLL2 missense variants were predicted with in silico software. The PolyPhen program, which predicts the possible impact of an amino acid substitution on the structure and function of a human protein, identified only the p.Pro2841Thr variant as possibly damaging.
According to the criteria of Align-GVGD, all missense variants were predicted to be deleterious (Table 2). We then used the computational model MutPred, designed to classify an amino acid substitution as disease-associated or neutral in humans. MutPred predicted that four of the identified missense variants have a high probability (≥0.5) of being deleterious and generated in silico hypotheses for the possible pathological mechanism for three of them (Table 2). Analysis of the mutated residues in 7 MLL2 protein orthologs showed that all the missense variants occurred at evolutionarily conserved amino acid residues (Figure 2).
Finally, we employed the RESCUE-ESE and Fas-ESS tools on missense variants and frameshift mutations to predict associated splicing phenotypes by identifying sequence changes that disrupt or alter predicted Exonic Splicing Enhancers (ESEs) and Exonic Splicing Silencers (ESSs). ESEs and ESSs are short oligonucleotides that can enhance or inhibit pre-mRNA splicing when present in exons, playing important roles in constitutive and alternative splicing. A variation that disrupts an ESE, for instance, could cause exon skipping, which would result in the exclusion of an entire exon from the mRNA transcript. Conversely, a substitution within an ESS sequence can promote the use of adjacent splice sites, often contributing to alternative splicing. As reported in Table 2 and Additional file 1, Table S3, we found that some of the MLL2 mutations lead to the creation of new ESEs and/or to the disruption of predicted wild-type ESEs/ESSs. As secondary structures or adjacent negative elements also participate in the modulation of splicing events mediated by ESEs and ESSs, we believe that functional studies will be needed to better understand the role of the reported ESE/ESS disruptions or alterations in the complex phenotypic spectrum observed in Kabuki patients.
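The idea behind the ESE/ESS analysis above, namely comparing the motifs that overlap a variant position before and after the substitution, can be sketched as follows. This is not the RESCUE-ESE or Fas-ESS implementation: the hexamer length, the function names, the example sequence and the motif set are assumptions made purely for illustration.

```python
def hexamers_covering(seq, pos, k=6):
    """Set of k-mers in `seq` that overlap the 0-based position `pos`."""
    first = max(0, pos - k + 1)
    last = min(pos, len(seq) - k)
    return {seq[i:i + k] for i in range(first, last + 1)}


def motif_changes(wild_type, pos, alt_base, motifs):
    """Report motifs lost or gained by a single-base substitution.

    `motifs` is a set of hexamers treated as predicted elements; the
    set used in the example below is a placeholder, not a curated list.
    """
    mutant = wild_type[:pos] + alt_base + wild_type[pos + 1:]
    before = hexamers_covering(wild_type, pos) & motifs
    after = hexamers_covering(mutant, pos) & motifs
    return {"lost": before - after, "gained": after - before}


# Placeholder exon fragment and motif set, for illustration only.
example = motif_changes("TTGAAGAACTT", 5, "C", {"GAAGAA"})
```

In this toy example the G-to-C substitution removes the only placeholder motif overlapping the position, which is the kind of change the tools report as a disrupted wild-type element.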

Conclusions
Our study increases the number of identified MLL2 mutations and variants, and highlights some characteristics of the spectrum of MLL2 mutations associated with this pathology, providing further insight into its etiology. The in silico analysis predicts that the identified MLL2 missense, splice-site and indel variants might be pathogenic. Other studies have reported MLL2 variants of these types that were predicted to be associated with the disease. However, their biological significance and pathogenicity were not unambiguously demonstrated; further, more functionally oriented studies are therefore needed to understand the nature of these variants and their possible role in the disease. Resolving these issues is important to avoid incorrect interpretation and diagnosis of other Kabuki cases carrying such MLL2 variants.
We were unable to find any detectable point mutation and/or small deletion/duplication in the coding region of the MLL2 gene in 27% (17/62) of the Kabuki syndrome patients. Mutations in MLL2 regulatory regions, exon microduplications and/or microdeletions, as well as genetic heterogeneity of the syndrome, may account for these negative results. Alternatively, some of these patients might have been misdiagnosed as a result of the complex clinical spectrum covered by this pathology, which would highlight the need to select Kabuki cases more accurately before conducting the analysis.
In summary, this study underlines the relevance of mutational screening of the MLL2 gene among patients with Kabuki syndrome. The identification of a large spectrum of MLL2 mutations will offer the opportunity to improve current knowledge of the clinical basis of this multiple congenital anomalies/mental retardation syndrome, to design functional studies to understand the molecular mechanisms underlying the disease, to establish genotype-phenotype correlations, to improve clinical management, and to identify potential targets for therapy.

Additional material
Additional file 1: Table S1. Oligos used in this study. Table S2. MLL2 mutations identified in our cohort of KS patients and as reported in the literature. Table S3. Repeats (underlined and highlighted in red) that might mediate micro-deletions, micro-insertion/deletions (indel), and micro-duplications in the MLL2 gene.