Skip to main content

Molecular mechanics and dynamic simulations of well-known Kabuki syndrome-associated KDM6A variants reveal putative mechanisms of dysfunction

Abstract

Background

Kabuki syndrome is a genetic disorder that affects several body systems and presents with variations in symptoms and severity. The syndrome is named for a common phenotype of faces resembling stage makeup used in a Japanese traditional theatrical art named kabuki. The most frequent cause of this syndrome is mutations in the H3K4 family of histone methyltransferases while a smaller percentage results from genetic alterations affecting the histone demethylase, KDM6A. Because of the rare presentation of the latter form of the disease, little is known about how missense changes in the KDM6A protein sequence impact protein function.

Results

In this study, we use molecular mechanic and molecular dynamic simulations to enhance the annotation and mechanistic interpretation of the potential impact of eleven KDM6A missense variants found in Kabuki syndrome patients. These variants (N910S, D980V, S1025G, C1153R, C1153Y, P1195L, L1200F, Q1212R, Q1248R, R1255W, and R1351Q) are predicted to be pathogenic, likely pathogenic or of uncertain significance by sequence-based analysis. Here, we demonstrate, for the first time, that although Kabuki syndrome missense variants are found outside the functionally critical regions, they could affect overall function by significantly disrupting global and local conformation (C1153R, C1153Y, P1195L, L1200F, Q1212R, Q1248R, R1255W and R1351Q), chemical environment (C1153R, C1153Y, P1195L, L1200F, Q1212R, Q1248R, R1255W and R1351Q), and/or molecular dynamics of the catalytic domain (all variants). In addition, our approaches predict that many mutations, in particular C1153R, could allosterically disrupt the key enzymatic interactions of KDM6A.

Conclusions

Our study demonstrates that the KDM6A Kabuki syndrome variants may impair histone demethylase function through various mechanisms that include altered protein integrity, local environment, molecular interactions and protein dynamics. Molecular dynamics simulations of the wild type and the variants are critical to gain a better understanding of molecular dysfunction. This type of comprehensive structure- and MD-based analyses should help develop improved impact scoring systems to interpret the damaging effects of variants in this protein and other related proteins as well as provide detailed mechanistic insight that is not currently predictable from sequence alone.

Background

Our studies are focused on the use and optimization of structural bioinformatics methods for improving the interpretation of data derived from next generation sequencing in cases presenting with rare diseases [1,2,3,4]. Extensive studies in our laboratories have previously demonstrated that these approaches yield not only information on the damaging effects of likely pathogenic variants but also yield useful mechanistic information on their pathophysiological impact on human health. The current study focuses on the analyses of Kabuki-associated histone lysine-specific demethylase 6A, KDM6A, a well-known epigenomic regulator that functions as an eraser of methylated histone marks. KDM6A is encoded by an X-chromosome-linked gene and the protein selectively catalyzes demethylation of tri/di-methylated histone H3K27 residues [5, 6]. There are two main classes of histone demethylases: flavin adenine dinucleotide (FAD)-dependent amine oxidases (e.g. lysine-specific demethylase 1 [LSD1]), and Fe(II) and 2-oxoglutarate (2OG)-dependent hydroxylases, both of which operate by enzymatic oxidation/hydroxylation of a methyl group, followed by non-enzymatic loss of formaldehyde resulting in histone lysine demethylation [7]. The jumonji domain-containing histone demethylase KDM6A (also known as UTX), along with two other related members, KDM6B (JMJD3) and KDM6C (Y-chromosome-linked UTY), belong to the 2OG-dependent dioxygenase superfamily [7]. KDM6A regulates developmentally important genes as demonstrated both in cultured cells using KDM6A knockdown and in vivo using KDM6A-deficient mice [8,9,10,11]. Consequently, all pathogenic KDM6A variants, known to result in loss of function mutations, can disrupt cell, organ, and systems homeostasis, thus impacting human health. Indeed, pathogenic KDM6A variants are one cause of Kabuki syndrome, a rare multi-systemic disorder which presents in approximately one in every 32,000 newborns [12,13,14] and causes developmental defects, disturbed growth, multiple congenital organ malformations, as well as hematological and immunological anomalies [15]. Although the variants here studied associate with the development of Kabuki syndrome, they have never been evaluated for their mechanism of dysfunction. Thus, the well-documented genotype-to-phenotype correlation for these variants justifies the purpose of these studies, namely extending this knowledge toward better understanding their mechanisms of dysfunction at the molecular level with atomic resolution.

A wealth of computational tools has been developed for categorizing potentially damaging effects [16]. For example, the popular genomics-based predictors, such as SIFT [17] and PolyPhen2 [18], suggest which DNA/protein sequence alterations may be damaging to the function. Recently, the use of 3D methods for structural bioinformatics has elicited significant attention due to its potential for additional predictive values and mechanistic inference. This field, while in its infancy, is represented by predictors such as Missense3D [19] and Rhapsody [20], which use protein structures and coarse-grained Elastic Network Model (ENM)-based predictions for more specific mechanistic alterations. In the current study, we report a more comprehensive analytical approach that uses a wider set of scores derived from molecular mechanic calculations and molecular dynamics (MD) simulations using the published 3D structure of the catalytic domain of KDM6A [21]. To this end, we have developed an analytical framework that shows how these 3D parameters can identify dysfunction in a subset of KDM6A Kabuki-associated variants (within its catalytic domain). These multi-parametric analyses include protein folding/stability, structural perturbation, primary motions at the residue and tertiary structural level, time-dependent binding energy calculations, and pKa shift estimations. We demonstrate that these parameters can identify alterations in KDM6A-specific molecular function. Collectively, our data demonstrate that the Kabuki syndrome variants studied here are damaging to this histone demethylase function by various means, thereby extending our understanding of pathobiological mechanisms underlying this disease at the molecular level. Thus, this new knowledge bears both mechanistic and biomedical relevance.

Results

Description of KDM6A enzyme and the dynamic behavior of its catalytic domain

Lysine demethylation involves the iron-dependent oxidation of a methyl group followed by non-enzymatic elimination of formaldehyde (Fig. 1a). KDM6A is made of 1401 amino acids, which separates a linker region that joins two distinctive modular domains; the tetratricopeptide repeat (TPR) domain at the N-terminus and the catalytic domain at the C-terminus (Fig. 1a). The catalytic domain is comprised of a jumonji domain (residues 933–1047 and 1078–1268, slate blue), a helical domain (residues 886–902, 1269–1314, and 1379–1395, magenta), a proline-rich linker (residues 910–932, yellow), and a Zn-binding domain (residues 1315–1378, green) which plays a critical role in selective substrate recognition (Fig. 1b). The jumonji domain contains a jelly-roll fold flanked by α-helices and represents the most evolutionarily conserved region of the protein (Additional file 1: Fig. S1). The 1.8 Å crystal structure of the catalytic domain in complex with a histone H3K27me3 substrate, a nonreactive cofactor analog N-oxalylglycine, a Ni(II) ion at the active site, and the Zn(II) ion coordinated in the Zn-binding domain has been determined and shed light on the molecular basis of its catalysis [21]. Briefly, in its active conformation (H3 peptide-bound), the methylated lysine residue is optimally presented to the active site where the reactive groups are strategically located at specific distances and orientations with respect to the removable methyl groups (Fig. 1c). KDM6A displays maximal activity with the H3K27(me3) which substrate gains a broad ‘W’-shaped conformation during binding (Fig. 1b, d). This bent peptide conformation is stabilized by intra-substrate hydrogen bonds and is required for sequence specificity of KDM6A including sequence-specific ionic interactions such as Glu999-Arg26, Asp1089-Arg26, Glu1326-Arg17, and Glu1335-Lys23 (Fig. 1d). The substrate binding encompasses the buried surface area of 1135 Å2, with a polar surface of 294 Å2 and 836 Å2 nonpolar. In addition, the Zn-binding domain plays a critical role in substrate binding as it goes through a substantial local conformational change when the histone substrate is bound (Additional file 1: Fig. S2). Therefore, any alterations (either direct or indirect) of this critical Zn-coordination site (Fig. 1e) would result in disruption of protein structure and function by negatively affecting the selective substrate recognition and the catalytic activities.

Fig. 1
figure1

Catalytic mechanism, architecture, key functional sites, and disease-causing missense mutations of KMD6A. a Domain structure and the schematic of its catalytic mechanism. Relative positions of KDM6A Kabuki variants identified from the patients are highlighted in red on top of the domain diagram. b The catalytic domain structure of KDM6A in complex with the H3K27me3 peptide, metal ions and the cofactor 2OG (PDB access code 3AVR). The catalytic domain is composed of the jumonji domain flanked by two additional sub-domains and a long flexible linker. The bound substrate is shown as ball-and-sticks while the catalytic domain is shown as ribbons. The color codes are identical to the ones used in Fig. 1A. (ce) Zoomed views of the active site, substrate binding interface, and the zinc ion binding site. two damaging control residues are labeled in red. H3 histone residues are labeled in orange. f Mapping of Kabuki syndrome missense variants onto its molecular structure. None of the Kabuki variants are found right at the critical molecular interaction sites. Figures were made using PyMOL (The PyMOL Molecular Graphics System Version 2.3.0., Schrödinger, LLC)

KDM6A, like other proteins, displays an intrinsic dynamic behavior that is uniquely programmed and optimized for its function. Thus, prior to analyzing the mutational impact of Kabuki variants, we probed the dynamic behaviors of the wild type KDM6A catalytic domain through molecular simulations. The initial 10 ns of MD simulation revealed the functionally mobile regions and the essential dynamic motions (Fig. 2 and Supplementary movie M1). The secondary structural features that compose the overall enzyme–substrate complex displayed a coordinated movement while maintaining the compact arrangement of the subdomains. The structural ensemble during the simulation revealed that the core jumonji domain containing the active site remains relatively still (less than 0.7 Å displacement) while the surrounding regions display more active motions. This dynamic behavior complements and extends previous observations that the catalytic mechanism of KDM6A does not require additional mobile regulatory elements near the active site [7]. The substrate recognition dynamics serve as a rate-limiting step since once the substrate is presented to the active site, the active site catalysis of KDM6A readily takes place. The highly mobile regions (> 1.0 Å displacement) include regions near the surface of the jumonji domain, the substrate recognition, the zinc binding domain, and the helical domain with a potential allosteric regulatory function (Fig. 2a–c). The atomic displacements (fluctuations) of individual residues during MD simulations are in good agreement with the temperature factors of the crystal structure (Fig. 2b, c). This suggests that the simulation captures the most common low energy conformations that represent the native molecular ensemble that links structure-to-function in the analyses of KDM6A. Furthermore, the principal component analysis (PCA) of the structural ensemble reveals high-amplitude concerted motions wherein the substrate binding involves the dominant motions of the first three PCs while the movement of the catalytic core jumonji domain involves limited PC1 and PC2 motions (Fig. 2d). Metal ions and cofactors retain their bound state and ideal geometry with the ligating residues throughout the MD simulation. Together, this data describes for the first time a structure-to-dynamic coupling that characterizes the time-dependent molecular behavior of this Kabuki-associated histone demethylase.

Fig. 2
figure2

Intrinsic mobility of the KDM6A catalytic domain and the essential dynamic motions. a Linear RMSF plot of the KDM6A catalytic domain during MD simulation. The protein regions of greater than 1.0 Å displacements (fluctuations) are indicated by a red dotted line. The secondary structure elements and the domain structures are shown above. b Dynamic fluctuations (RMSF) of each residue during MD simulation are also plotted on the molecular structure surface, which is in good agreement of the temperature factor plot of its crystal structure (c). They are color-coded according the ranges indicated by the horizontal bar at the bottom of each panel. (d) Porcupine plot of trajectories representing the essential dynamic in each principal component during MD simulation. The length of cone represents the motion magnitude and the pointing of the arrow indicates the direction. The catalytic core jumonji domain movements mostly involve PC1 and PC2 while the substrate binding dynamics involve all three main PCs. This plot shows only the enzyme movement

Selection of well-documented, yet rare, Kabuki-associated KDM6A variants for molecular mechanics and MD simulation studies

To test the utility of our biophysically parametrized structural bioinformatic methods [1,2,3,4], we selected KDM6A variants within the catalytic domain found in patients with a Kabuki syndrome diagnosis. These have previously predicted to be damaging in most cases (with a few exceptions) by available impact predictors but for which more studies remain necessary to determine if their classification is valid and to identify the mechanisms underlying their dysfunction (Table 1). Nine missense variants were identified through Human Gene Mutation Database (HGMD) [22] and ClinVar [23] (D980V, S1025G, C1153R, C1153Y, P1195L, L1200F, Q1212R, R1255W, and R1351Q). In addition, two other missense variants, N910S and Q1248R, reported in the literature [13, 24], but not deposited in the databases were added to our study. Figure 1f displays the locations of these well-documented eleven Kabuki syndrome missense variants on the molecular structure. We find that no Kabuki variants directly affect molecular interactions with the key functional/structural elements to its enzymatic activity, such as with the cofactors, substrate, and zinc ligation (Fig. 1c–e). However, these interactions could also be allosterically affected by distant mutations in the 3D structure. We used a non-damaging variant (H1060L) as the main control, because it is annotated as ‘benign’ in ClinVar, annotated as SNP in the Single Nucleotide Polymorphism Database (dbSNP), and displays a higher allele frequency (3.07 × 10–4) in the general population (gnomAD). As additional benign controls, we used two healthy individual variants from the gnomAD database (S912N and T1323A) which have a higher allele frequency (> 2.0 × 10–5). For damaging control variants, we used two non-naturally-occurring variants (H1146A and E1148A) whose deleterious effects and loss-of-function activities are well-documented [25, 26]. We probed the disruptive effect of each mutation by calculating a series of 3D structure-based scores for molecular fitness. These include static structure-based analyses (protein folding energy, protein stability, local conformation, pKa values of the ionizable residues), and molecular dynamic-based analyses (root mean square deviation (RMSD), root mean square fluctuation (RMSF), and time-dependent interaction energy calculations). We also compared the 3D methodology with benchmark variant calling methods, commonly used in the practice, such as PolyPhen2 [18] and Rhapsody [27], chosen here because of their improved algorithms and machine-learning capabilities. The results of the analyses by these methods are shown in Table 2. Notably, we find that the N910S and R1351Q Kabuki variants are predicted to be non-damaging by the sequence-based programs, yet potentially damaging by the static structure- and dynamic-based analyses as described below. Thus, our results demonstrate that, although sequence-based tools are useful and convenient, they are not highly dependable as they lack functional mechanism-based interpretations.

Table 1 Missense Kabuki syndrome and control variants in KDM6A used on our experiments
Table 2 Numerical scores for the Kabuki variants from each sequence, structural and dynamics analysis.

Kabuki-associated KDM6A variants encode proteins that show local structure and stereochemical perturbation and chemical environment alterations

Protein folding and stability are foundations of protein function, and their respective energy calculations have been used to assess the potential impact of mutations [28, 29]. For our study, the stability of the mutated protein was assessed by the variant-induced changes in folding energy (\(\Delta \Delta {\text{G}}\)fold) using FoldX [30] and free energy changes using the Discovery Studio suite (Dassault Systèmes BIOVIA). A high degree of agreement between these two independent estimations is observed (Table 2, columns 4–5). The benign controls (H1060L, S912N, and T1323A) are not affected, and one of the damaging controls (H1146A) is only minimally affected, yet its weakening of Fe(II) coordination and the overall structural drift were detected during MD simulation (Table 2). Overall, our data suggest that all Kabuki variants disrupt the wild type-like structure of the KDM6A protein and likely impair the protein function. To function properly, a protein must maintain its overall structural, local structural–functional, and dynamic features. Any amino substitutions that alter local stereochemical environments such as non-bonding interaction patterns, polarity, and steric configurations can be damaging to protein function. To address the global (entire domain) and local (within a 10 Å radius of the mutated residue) perturbation resulting from a missense variant, we used Missense3D [19] for initial impact analysis and subsequently measured atomic displacements between the wild-type and variants from their energy minimized structures using PyMOL (Molecular Graphics System, Schrödinger, LLC) and Coot [31]. The results of these analyses are shown in Table 2 (columns 6–7). Most Kabuki variants are predicted to cause local and/or global perturbations, much greater than those of the benign controls. However, note that, as exemplified by the N910S and R1351Q, some variants are predicted to be non-damaging by sequence-based programs but show significant local structural perturbations by our analyses. Thus, although global perturbations can be informative, local perturbations seem to be more reliable as damaging effect predictors.

Histone lysine demethylases such as KDM6A are oxidative enzymes whose catalytic activities are very sensitive to local pH and oxygen concentration [32, 33]. In particular, an Fe(II) oxidation sate in the 2OG-dependent dioxygenase superfamily members is critical for substrate methyl oxidation process. Any mutations that disrupt the pKa of protein residues, including the Fe(II) chelating residues, are expected to affect the overall protein function. To investigate how much titratable residues undergo protonation change upon each mutation, we calculated their pKa values using DelPhiPKa [34]. Then, the overall pKa shifts were calculated by summing the pKa differences of each residue between the controls and Kabuki-associated variants in both directions. We calculated pKa shift amounts for the entire catalytic domains to survey the global effects. As shown in Fig. 3a and Table 2 (column 8), the damaging controls (first two lanes of the heatmap and the corresponding values in Table 2) and the majority of Kabuki variants (except the first three, N910S, D980V, and S1025G) caused greater shifts while the benign controls (third to fifth lanes) exhibited minimal shifts in their overall pKa values. Our data also indicate that pKa shifts are not random, but rather strongly coordinated throughout the protein and certain ‘weak-spots’ and more sensitive residues primarily affected by the disease variants (Fig. 3A). These ‘weak-spot’ residues include key functional residues such as the metal Fe(II) ion and cofactor interaction residues (indicated on left in Fig. 3A) as well as several residues within the helical domain that could affect potential allosteric regulations (indicated on right). The benign controls did not cause any notable changes in the pKa values of these ‘weak-spot’ residues. This finding is important, since it indicates that potentially damaging variants can be recognized by this simple parameter. This type of pKa-based analysis can be applied to other relevant proteins such as catalytic enzymes and pH-sensing proteins.

Fig. 3
figure3

Representative impact analyses. a A heatmap representation of the global pKa shifts due to each mutation. pKa shifts of the residues that are directly affected by the substitution (e.g., H1060 of the H1060L mutant or H1146A, E1148A, R1255W, and R1351Q original residues and newly introduced charged residues such as C1135R, Q1212R, and Q1248R) are not shown and only indirect effects are represented. Shift amounts in pH unit are color-coded (indicator bar on right). The vertical axis shows the list of titratable residues in a descending order from the top. Noticeably affected residues are indicated on both sides; key functional residues on left and potential allosteric residues on right. b, c MD-based evaluation of global dynamic alterations, such as b the comparison of RMSD distribution and median values and c the absolute average RMSF differences. Wild type (blue), benign (green), damaging controls (red) and Kabuki variants (orange) are shown. Various comparative values have been tested and ‘Median Difference’ for RMSD and Avg. (|WT-variant|)’ per residue were chosen as the best metrics and plotted for variant impact assessment. All Kabuki variants displayed elevated global motions that deviate from the wild type

Kabuki-Associated KDM6A variants also display altered dynamic behaviors and disrupted functional molecular interactions

We next focused on how variants may affect molecular dynamic-based features of the protein. Consequently, we performed MD simulations to investigate their overall motion and time-dependent molecular interactions as indicators of active site dynamics and substrate binding dynamics. For overall impact scoring, MD trajectory files were analyzed for differences in RMSD and RMSF values. Structural drift (global changes in structure) was measured by monitoring RMSD values from the initial structures as a function of time. We find that the overall structures across variants and replicates were not significantly altered, and the compact conformations were largely preserved in all variants. The majority of variants displayed a median RMSD values around 1.5 Å for all atoms (Additional file 1: Fig. S3). The median difference between each variant and the benign controls were used to assess variant impact (Table 2, column 9). Our results reveal that Kabuki variants show considerable changes in their global dynamics to varying degrees with respect to the wild type, ranging from 8.1 to 16.4% for variants and 2.7 to 6.7% for benign controls; some (such as C1153Y and R1351Q) show even bigger changes than the damaging controls (Fig. 3b). Thus, we believe all Kabuki variants are damaging to protein function as molecular dynamics disruptors.

Additional information was derived at the individual residues level by RMSF calculations, which approximates molecular flexibility. We tested various comparative measurements for developing a RMSF score, such as Spearman correlation coefficient, Pearson correlation coefficient, average difference, average absolute difference, and absolute residual sum, among which average absolute difference was chosen as the best metric for variant impact assessment (Additional file 1: Fig. S4) and shown in Table 2 (column 10). Data distribution is also graphically represented in Fig. 3c. All Kabuki variants also show elevated RMSF values (12 to 19.6% increase compared to 5.7 to 6.0% increase for the benign controls); many (such as N910S, D980V, P1195L, Q1248R, R1255W, and Q1351Q) show even bigger elevations than the damaging controls (Fig. 3c). We also performed principal component analysis (PCA) to quantify differences in essential molecular motions associated with each variant. This was done by displaying the MD structural clusters of each variant onto the bidimensional free-energy landscape (FEL) described by the principal components and measuring the distributional shifts in each direction with respect to the wild type (Additional file 1: Fig. S5). However, the results show that these measurements are not very congruent with other scores, indicating that this analysis might represent a novel approach that needs to be interpreted independently or more effective PC-based measurement schemes need to be devised.

The activity and selectivity of KDM6A is related to the binding affinities of the enzyme to the reactive groups in the active site and to its substrates. Thus, we measured the magnitude of changes in time-dependent interaction energies for key molecular interactions at these sites for each variant using the protocol implemented in the Discovery Studio. For substrate interactions, we calculated the energies for each the peptide and the Lys27me3 residue alone. As shown in Table 2, the metal ion binding disrupting controls (H1146A and E1148A) show destabilization of Fe(II) interactions as predicted, while other variants have minor or even stabilizing effects. Note that the disturbance by the metal-binding controls have even further rippling effects on neighboring key reactive group interactions with cofactors indicate that multiple interactions at the active site could be coupled. More importantly, Table 2 shows that many Kabuki variants (such as D980V, S1025G, C1153R, C1153Y, P1195L, Q1212R, Q1248R, R1255W and R1351Q) have negative effects on specific interactions. The C1153R variant is predicted to affect dynamic interactions with all three key components, namely cofactors, substrate, and Zn(II) ion. These data indicate that although mutations occur on non-interacting residues, the disease variants can have remote effects on the key functional and structural interactions, hinting potential allosteric regulations.

Underlying mutation-specific mechanisms of dysfunction derived from molecular mechanics calculations and dynamics simulations

To provide variant-specific comprehensive interpretations, potential damaging effects of each Kabuki-associated variant and its disease-associations are elaborated herein. The overall impact predictions based on our analyses are summarized in Table 3.

  1. 1.

    KDM6A N910S: The N910 residue is located at the beginning of the flexible linker connecting the helical to the jumonji domain (Fig. 4a). This residue is fully exposed to solvent and a N to S substitution is predicted to have very little damaging effect to the protein by PolyPhen2 and Rhapsody (Table 2). However, our data indicate that this variant can cause local perturbation (0.2238 Å on average within 10 Å radius) and a drift in molecular dynamics. This variant is unlikely to impact protein folding/stability nor molecular interactions used for catalysis (Table 2). This variant is inherited and the patient exhibits classical Kabuki syndrome facial features and intellectual disability.

  2. 2.

    KDM6A D980V: The D980 residue is located at the flaking region of jelly-roll fold of the jumonji domain (Fig. 4b) and leads to the formation of hydrogen bonds with the backbone amide nitrogen atoms of L981 and G982. Our data predicts that a D to V substitution causes a loss of these hydrogen bonds thereby disturbing the local protein-solvent interface as indicated by considerable differences in RMSD and RMSF values, thus acting as a molecular dynamics disruptor (Table 2 and Fig. 3b, c). The variant shows little changes in folding energy and stability (0.655 and − 0.62 kcal/mol, respectively), but is expected to remotely affect the specific interaction between KDM6A and the Lys27Me group of the substrate (Table 2).

  3. 3.

    KDM6A S1025G: This amino acid substitution is located near the jelly-roll fold of the jumonji domain and the H3 substrate binding interface (Fig. 4c). In the variant, glycine forms a single hydrogen bond with the neighboring Q1003 but, besides this small change, other structure-based parameters predict low damaging effects. However, MD simulation analysis indicates that protein dynamics can be disrupted with a weakened substrate recognition (Table 2).

  4. 4.

    KDM6A C1153R: This residue is located within the core jelly-roll fold of the jumonji domain near the active site (Fig. 4d). The surrounding local structure is tightly packed and the C to R substitution introduces a buried charged residue that significantly disrupts local conformation, including at the active site (Table 2 for protein stability, local perturbation, and molecular dynamics). As a result, key functional interactions such as 2OG and substrate interactions are disrupted, and the zinc-ligation is also affected (Table 2).

  5. 5.

    KDM6A C1153Y: This variant can perturb the local structure leading to a large shift in folding energy/stability and pKa as well as affect global dynamics parameters with the disruption of protein-substrate interactions (Table 2). This variant has also been found as a somatic mutation in colorectal carcinoma (COSMIC ID: COSM6692688) [35, 36].

  6. 6.

    KDM6A P1195L: This variant, which has also been found as a somatic mutation in esophageal adenocarcinoma (COSMIC ID: COSM1255502) [37], is located within the hydrophobic core of the jumonji domain at the beginning of an α-helical segment (Fig. 4e). Although PolyPhen2 predicted a low damaging effect (Table 2), a P to L substitution at this site is predicted to alter the backbone geometry (disallowed phi/psi) and cause significant steric hindrance. In addition, our data support predicts the structure-to-functional disruption consistent with various biochemical parameters (folding/stability, local perturbation, pKa shift, and substrate recognition) as well as alterations in its molecular dynamics (Table 2 and Fig. 3b, c).

  7. 7.

    KDM6A L1200F: The affected residue is located within the hydrophobic core of the jumonji domain which provides a proper microenvironment but the increased size of this substitution (Fig. 4f) likely causes local steric hindrance and thereby disturbing local structure, folding/stability energies, and pKa shift (Table 2).

  8. 8.

    KDM6A Q1212R: This variant locates within the core jelly-roll fold of the jumonji domain near the active site and forms a hydrogen bond network with neighboring residues, N1156 and W1166 (Fig. 4g). Although a Q to R substitution retains some of the wild type interactions, the positive charge and increased size of arginine are predicted to perturb local structure and protein folding/stability energies (Table 2). Notably, this variant shows the largest shifts in folding energy, global perturbation, and pKa values (Table 2 and Fig. 3a). Additional evidence suggests that this amino acid substitution can also impact substrate recognition (Table 2).

  9. 9.

    KDM6A Q1248R: This amino acid change occurs at a conserved region near the jumonji domain and leads to a gain in a pair of hydrogen bonds with the backbone carbonyl oxygen and amino nitrogen (Fig. 4H). Like Q1212R, this substitution could perturb local structure and greatly reduce protein stability (Table 2). This variant is also expected to cause a considerable structural drift and weakened substrate interactions (Table 2 and Fig. 3b).

  10. 10.

    KDM6A R1255W: This variant has also been found as a somatic mutation in a number of different cancer types (COSMIC ID: COSM212434) [38, 39]. This amino acid change falls within a critical interface between the jelly-roll fold and the adjacent helix (highly conserved and less mobile element) within the jumonji domain (Fig. 4i). A drastic R to W substitution is predicted to cause multiple consequences including local/global perturbation, pKa shift, and structural drift by altering the chemical/spatial environment as well as disruption of a hydrogen bond with Q1147 (Table 2). Furthermore, this variant may perturb the 2OG cofactor interaction (Table 2).

  11. 11.

    KDM6A R1351Q: This variant has also been found as a somatic mutation in neurofibromatosis and caecum carcinoma (COSMIC ID: COSM6952414) [39, 40]. This residue is located within a zinc-binding domain (highly mobile region) and the main chain and side chain are fully exposed to solvent (Fig. 4j). Although this variant is predicted to have minimal impact by both PolyPhen2 and Rhapsody (Table 2). However, this result is confounded by the fact that the R1351 residue is not strictly conserved and the same glutamine residue is found at this position in KDM6C (Additional file 1: Fig. S6), rendering this algorithm of low accuracy. Notably, however, we find that this variant displays the largest amount of local perturbation and structural drift from the starting model during MD simulation (Table 2). Thus, this substitution likely has a damaging effect on these proteins and may be in part responsible for low enzymatic activity of KDM6C [41].

Table 3 Overall impact prediction of Kabuki variants derived from the assessment at each layer. Impact assignment of the variants at each layer are tentatively given by referring to the control values. These data illustrate the multiple pathways these variants may be affecting KDM6A function
Fig. 4
figure4

Close-up views of Kabuki variants under study and their category-wise assessment. Further descriptions on the alterations of local environment and chemical interactions by these mutations are provided in the main text. At the bottom of each panel, colored boxes represent overall assessment of each category for each Kabuki syndrome variant; sequence-, structure-, and MD-based, respectively from left to right. The intensity of redness indicates the level of damaging impacts at each layer light to dark. Although N910S and R1351Q are predicted to be non-damaging by sequence-based analysis, they are predicted to be damaging by static structure- and dynamics-based mechanistic analyses. There is a set of indicators for C1153 representing C1153R (top) and C1153Y Kabuki variants

Discussion

In this study, we applied comprehensive sequence-, structure-, dynamic-based approaches to study the potential mechanisms by which disease-associated missense variants may affect KDM6A enzymatic activities. Eleven Kabuki syndrome missense variants within the catalytic domain, along with five selected controls, were chosen and characterized for structure- and dynamics-based impact prediction to gain molecular insight into how these genetic variations may affect KDM6A protein function. Among the set of variants, we have shown many types of mechanistic disruptions in folding, global and local structural perturbation, pKa shifts and interactions with the histone H3, cofactors and metal ions that would contribute to a damaging effect on the protein.

Although all Kabuki syndrome missense variants are found in non-critical functional regions, our data indicate that they all significantly disrupt the global and local conformation, chemical environment, and/or molecular dynamics, and some remotely affect the critical molecular interactions. For the benign control variants (H1060L, S912N, and T1323A), all the measures consistently support that these substitutions can be tolerated while evident damaging effects for H1146A and E1148A were clearly demonstrated. Three variants identified in Kabuki Syndrome patients, N910S, P1195L and R1351Q, were not predicted to be damaging by at least one sequence-based score. However, by considering the structural and dynamics-based scores, we have identified many mechanisms by which these Kabuki syndrome variants may damage the function of KDM6A. For example, structural score FoldX shows that P1195L is destabilized relative to the wild-type protein, while P1195L and R1351Q have considerable pKa shifts, and all three variants have elevated global and local perturbations. When using dynamics-based scores, all three variants exhibited potentially damaging changes through increased atomic displacement and destabilization of the protein. Although all three show only subtle disruption of active site key interactions, P1195L and R1351Q show significantly destabilized substrate interactions. These data indicate that structural and dynamics-based analyses are essential for more accurate impact assessment. All Kabuki syndrome variants are damaging to protein function as they all prove to be molecular dynamics disruptors (MD variants) and the majority also affect the structural aspects of the protein (structural variants) that are directly related to protein’s integrity and critical function. Although all Kabuki variants are located at lesser mobile regions (< 0.7 Å displacement), except R1351Q (1.21 Å displacement), their mutations are expected to greatly affect protein’s functional dynamics and overall activities.

Kabuki syndrome is a rare disease which occurs about once in every 32,000 births and KDM6A mutations account for only ~ 5% of all cases. Because case reports with KDM6A missense mutations are very limited, we cannot make any conclusive and scientifically sound statements on the prevalence of the mutations. Moreover, most mutations are truncation/deletion/nonsense/frame shift or regulatory mutations such as splice site and non-coding DNAs. D980V was the only one reported more than once. On the other hand, there are plenty of KDM6A cancer somatic missense variants that are found in the database and four of the Kabuki-associated variants (C1153Y, L1200F, R1255W, and R1351Q) have been also identified from cancer patients. Although C1153Y, L1200F, and R1351Q have been reported only once, the most ‘damaging’ variant R1255W (Tables 2, 3) has been reported in an exceptionally high number of 12 times from various cancer types such as stomach, prostate, and uterus cancers. This might suggest that more damaging variants can be more frequently found in human diseases. However, the prevalence of Kabuki germline variants cannot be currently addressed with the limited information.

Various metrics used to analyze MD trajectories can elucidate different properties of protein macromolecules, such as protein stability, energy distribution, and functional flexibilities. For the current study, we used RMSD, RMSF, and time-dependent molecular interactions as reliable MD-based metrics that are effective in quantifying alterations of dynamic properties associated with each variant. Additional structure-related metrics such as protein surface features [20], four-body contact potential [42] and local energetic frustration [43] can be further used for more in-depth analysis in a protein-specific manner. We also found that, among the structure-based metrics, pKa shift analysis was particularly useful for KDM6A and should be applicable to other related proteins.

Some KDM6A variants may additionally affect other aspects of protein life cycle and protein function such as gene expression, RNA dynamics, protein translocation, protein–protein interactions, and post-translational modification. In particular, surface residues such as N901 and R1351 could affect protein–protein or protein-DNA interactions as this protein is present in a complex that contains additional histone modifying enzymes such as CBP/p300, chromatin remodeling SWI/SNF proteins, and KMT2C/D and may contact DNAs that are associated with nucleosome [36, 44]. KDM6A also interacts with mixed-lineage leukemia complex 3 (MLL3) and MLL4 [45]. In fact, impaired H3K4 methyltransferase activity of KMT2D variants, another Kabuki syndrome associated gene product, is in part due to disruption of WRAD complex formation, which further associates and stimulates the COMPASS catalytic activity [24]. Thus, we cannot rule out the possibility that KDM6A Kabuki syndrome variants have multifarious damaging effects beyond what we present in the current study.

Conclusions

Overall, our study provided a molecular insight into how KDM6A disease variants might disrupt the function. These data reinforced the fact that Kabuki variants are loss-of-function mutations, and protein structure and dynamics are essential elements for protein’s optimal function and normal physiology. Our data clearly indicate that the comprehensive assessment, including molecular dynamics, is superior to the present annotation tools used in human genomics databases. Our comprehensive approaches add another layer to the overall damaging impact interpretation that is more directly related to protein’s molecular function. We believe that MD simulations can become an integral part of meta-prediction approaches which can improve the accuracy and sensitivity of damaging effect prediction for interpretation of genomic data derived from next generation sequencing. The time-dependent, three-dimensional dynamic behaviors should add value to sequence- and static structure-based analyses and allows more detailed inference and mechanistic predictions to be made. The widespread adoption of these methods can provide better diagnosis, risk assessment, and clinical guidelines for the observed variants within the context of individualized medicine.

Methods

Preparation of the initial structure

High resolution (1.8 Å) crystal structure of a histone H3K27me3 peptide [17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33]-bound form of human KDM6A catalytic domains (PDB access code 3AVR) was used in our study. This structure contains a Ni(II) ion (instead of the enzymatic Fe(II) ion) and the cofactor analog N-oxalylglycine at the active site as well as a structural zinc ion in the zinc binding domain. However, for our studies, the cofactor analog and the Ni(II) ion were replaced with the natural cofactor 2-oxoglutarate (2OG, also known as α-ketoglutarate) and the Fe(II) ion, respectively, to reconstruct the native-like active conformation. In addition, this structure is missing two loop regions (amino acids 902–910 and 1047–1078) due to high mobility, and they were filled in for our analysis using the Modeller program [46]. For missense variant analysis, substitutions were made within the Discovery Studio suite version 19.1 (Dassault Systèmes BIOVIA) by mutating the corresponding residue and selecting the side chain rotamer causing the least steric hindrance with the surrounding residues.

Protein folding energy and stability calculation

Mutagenesis and structure-based predictions of changes in folding energy (\(\Delta \Delta {\text{G}}\)fold) were computed using FoldX [30]. In addition, shifted amounts of protein stability (free-energy difference between folded and unfolded states) due to mutations were calculated at pH 7.4 within the Discovery Studio using the energy-minimized wild type structure (H3-unbound form) and introducing each substitution for calculation. After the preparation phase, the initial structures of the wild type and the generated mutants were subjected to a two-stage minimization process before subjected for energy calculation. Proteins vary in stability, but generally a ΔΔG in the range of 2 kcal/mol is considered to result in a mutational ‘hot-spot’ of sufficient effect [47].

Local structure perturbation measurement

From the energy-minimized structures, any residues that reside within 10 Å radius from the mutation site were selected and calculated for the RMSD of the backbone atoms between the wild type and the mutant using PyMOL (Molecular Graphics System, Schrödinger, LLC) and Coot [31]. For global structure perturbation, entire backbone atoms were used for RMSD calculation between the structures.

pK a shift analysis

To investigate the possibility that some titratable residues may undergo protonation change upon single amino acid substitution of KDM6A, we performed the pKa calculations at pH 7.0 with DelPhiPKa [34] which is a surface‐free Poisson‐Boltzmann based approach to calculate the pKa values of protein ionizable residues. We first calculated the pKa values of titratable residues for the wild type and each mutant using energy-minimized structures. The pKa shifts then were calculated by subtracting the pKa values of the wild type and the mutant residues in both directions and summing up the differences. Loss or gain of titratable residues at the position where Kabuki mutations occur and could dominate the cumulative pKa shift amount were not considered in calculations. The R studio code used to generate the pKa shift heatmap is provided as a supplementary material (Additional file 1: Text S1).

Molecular simulations

MD simulations were performed using the CHARMm36 all-atom-force-field [36] implemented in the Discovery Studio with a 2 fs time step. A simplified distance-dependent implicit solvent environment was used with a dielectric constant of 80 and a pH of 7.4, and no further parameterization of a non-standard residue (K27me3), cofactor and metal centers. All MD simulations were carried out using periodic boundary conditions. Models were energy minimized for 5,000 steps using steepest decent followed by 5,000 steps of conjugate gradient to relax the protein structure that was obtained under the stressed crystal environment. Each system of 10 replicates of wild type and each variant was independently heated to 300 K over 200 ps and equilibrated for 500 ps followed by 10 ns production simulation under NPT ensemble by changing the initial seed (100 ns total). Structures during unconstrained dynamics simulation were recorded every 10 ps to give a total of 1000 frames for analyses. For final data analysis, one or two outliers (in some cases none) from each data set of 10 replicates that clearly deviate from the rest in RMSD plots and might represent the minor and rarer form of conformations (altogether 12% of the entire data) were excluded from averaging, and only the last 500 frames that have reached the near minimum total energy state were used. For RMSD/RSMF analysis, all MD trajectories were processed using the tools available within the Discovery Studio and the algorithms implemented in Microsoft Excel program. Different comparative values between the plots were tested and ‘Median Difference’ for RMSD and Avg. (|WT-variant|)’ per residue were chosen as the best metrics for variant impact assessment. The R studio code used to generate the RMSD violin plot is provided as a supplementary material (Additional file 1: Text S1). For PCA, the Bio3D program [48] and the in-house workflow were harnessed. For a porcupine plot representation (Fig. 2D), we derived a matrix Λ by diagonalizing Ci,j with a transformation matrix T, so that Λ = TTCT. The columns of T are then the eigenvectors vi of the motion, with the first column being the most significant motion (PC1) and the second (PC2) and the third (PC3), and the diagonal elements of Λ are the eigenvalues of the decomposition. A cone drawn from the atomic coordinate of an atom (later combined and averaged for the entire amino acid), with height and direction derived from the components of v1 that relate to that atom, gives a graphical representation of the motion held in v1, the first eigenvector, v2, and v3.

Time-dependent interaction energy calculation

Molecular interaction free energies were measured using the protocol implemented in the Discovery Studio. This was done using the MD simulation trajectories. Non-bonded interactions were monitored and dynamic interaction energies (van der Waals and electrostatic energies) were calculated from using the CHARMm36 force field and the implicit distance-dependent dielectric solvent model.

Overall impact classification of the variant

In Table 3, we tentatively label ‘benign’, ‘VUS’ (variant of uncertain significance), or ‘damaging’ simply by referring to the control values. Anything below the highest value of the benign controls would be considered as ‘benign’ in each category. Anything under the twice value would be considered as ‘uncertain’ and anything higher than that would be ‘damaging’. Also, for each layer (structure, pKa shift, dynamics, or time-dependent molecular interactions), there are more than one metrics (except pKa shift) and they all need to be considered for collective labeling. Any variants that are predicted to be ‘damaging’ at any layers would be ultimately classified as ‘damaging’. There have been pathogenicity thresholds or mutation significance cutoffs suggested by each predictor or the combination of several predictors; however, they still have a high false positive/negative rate. More quantitative overall scoring schemes or a machine learning model will be considered when we have enough training sets for KDM6A and when we are ready to validate the results.

Availability of data and materials

MD Simulation data will be available upon request.

Abbreviations

2OG:

2-Oxoglutarate

COSMIC:

Catalog of somatic mutations in cancer

dbSNP:

Nucleotide Polymorphism Database

ENM:

Elastic Network Model

FAD:

Flavin adenine dinucleotide

FEL:

Free-energy landscape

HGMD:

Human Gene Mutation Database

HMT:

Histone methyltransferase

IBEM:

Inborn error of metabolism

KDM6A:

Histone lysine(K)-specific demethylase 6A

MD:

Molecular dynamics

PCA:

Principal component analysis

PDB:

Protein data bank

RMSD:

Root mean square deviation

RMSF:

Root mean square fluctuation

SNP:

Single-nucleotide polymorphism

TPR:

Tetratricopeptide repeat

VUS:

Variant of uncertain (unknown) significance

References

  1. 1.

    Zimmermann MT, Urrutia R, Cousin MA, Oliver GR, Klee EW. Assessing human genetic variations in glucose transporter SLC2A10 and THEIR ROLE IN ALTERING STRUCTURAL AND FUNCTIONAL PROPERTIES. Front Genet. 2018;9:276.

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  2. 2.

    Gupta A, Dsouza NR, Zarate YA, Lombardo R, Hopkin R, Linehan AR, et al. Genetic variants in DGAT1 cause diverse clinical presentations of malnutrition through a specific molecular mechanism. Eur J Med Genet. 2020;63(4):103817.

    PubMed  Article  PubMed Central  Google Scholar 

  3. 3.

    Kaiwar C, Zimmermann MT, Ferber MJ, Niu Z, Urrutia RA, Klee EW, et al. Novel NR2F1 variants likely disrupt DNA binding: molecular modeling in two cases, review of published cases, genotype-phenotype correlation, and phenotypic expansion of the Bosch-Boonstra-Schaaf optic atrophy syndrome. Cold Spring Harb Mol Case Stud. 2017;3(6):a002162.

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  4. 4.

    Cousin MA, Zimmermann MT, Mathison AJ, Blackburn PR, Boczek NJ, Oliver GR, et al. Functional validation reveals the novel missense V419L variant in TGFBR2 associated with Loeys-Dietz syndrome (LDS) impairs canonical TGF-beta signaling. Cold Spring Harb Mol Case Stud. 2017;3(4):a001727.

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  5. 5.

    Swigut T, Wysocka J. H3K27 demethylases, at long last. Cell. 2007;131(1):29–32.

    CAS  PubMed  Article  PubMed Central  Google Scholar 

  6. 6.

    Gazova I, Lengeling A, Summers KM. Lysine demethylases KDM6A and UTY: The X and Y of histone demethylation. Mol Genet Metab. 2019;127(1):31–44.

    CAS  PubMed  Article  PubMed Central  Google Scholar 

  7. 7.

    Islam MS, Leissing TM, Chowdhury R, Hopkinson RJ, Schofield CJ. 2-oxoglutarate-dependent oxygenases. Annu Rev Biochem. 2018;87:585–620.

    CAS  PubMed  Article  PubMed Central  Google Scholar 

  8. 8.

    Lee S, Lee JW, Lee SK. UTX, a histone H3-lysine 27 demethylase, acts as a critical switch to activate the cardiac developmental program. Dev Cell. 2012;22(1):25–37.

    CAS  PubMed  Article  PubMed Central  Google Scholar 

  9. 9.

    Hemming S, Cakouros D, Isenmann S, Cooper L, Menicanin D, Zannettino A, et al. EZH2 and KDM6A act as an epigenetic switch to regulate mesenchymal stem cell lineage specification. Stem Cells. 2014;32(3):802–15.

    CAS  PubMed  Article  PubMed Central  Google Scholar 

  10. 10.

    Welstead GG, Creyghton MP, Bilodeau S, Cheng AW, Markoulaki S, Young RA, et al. X-linked H3K27me3 demethylase Utx is required for embryonic development in a sex-specific manner. Proc Natl Acad Sci U S A. 2012;109(32):13004–9.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  11. 11.

    Thieme S, Gyarfas T, Richter C, Ozhan G, Fu J, Alexopoulou D, et al. The histone demethylase UTX regulates stem cell migration and hematopoiesis. Blood. 2013;121(13):2462–73.

    CAS  PubMed  Article  Google Scholar 

  12. 12.

    Banka S, Lederer D, Benoit V, Jenkins E, Howard E, Bunstone S, et al. Novel KDM6A (UTX) mutations and a clinical and molecular review of the X-linked Kabuki syndrome (KS2). Clin Genet. 2015;87(3):252–8.

    CAS  PubMed  Article  Google Scholar 

  13. 13.

    Bogershausen N, Gatinois V, Riehmer V, Kayserili H, Becker J, Thoenes M, et al. Mutation Update for Kabuki Syndrome Genes KMT2D and KDM6A and further delineation of X-linked kabuki syndrome subtype 2. Hum Mutat. 2016;37(9):847–64.

    PubMed  Article  CAS  Google Scholar 

  14. 14.

    Miyake N, Koshimizu E, Okamoto N, Mizuno S, Ogata T, Nagai T, et al. MLL2 and KDM6A mutations in patients with Kabuki syndrome. Am J Med Genet A. 2013;161A(9):2234–43.

    PubMed  Article  CAS  Google Scholar 

  15. 15.

    Stagi S, Gulino AV, Lapi E, Rigante D. Epigenetic control of the immune system: a lesson from Kabuki syndrome. Immunol Res. 2016;64(2):345–59.

    CAS  PubMed  Article  Google Scholar 

  16. 16.

    Hu Z, Yu C, Furutsuki M, Andreoletti G, Ly M, Hoskins R, et al. VIPdb, a genetic variant impact predictor database. Hum Mutat. 2019;40(9):1202–14.

    PubMed  PubMed Central  Article  Google Scholar 

  17. 17.

    Vaser R, Adusumalli S, Leng SN, Sikic M, Ng PC. SIFT missense predictions for genomes. Nat Protoc. 2016;11(1):1–9.

    CAS  PubMed  Article  Google Scholar 

  18. 18.

    Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, et al. A method and server for predicting damaging missense mutations. Nat Methods. 2010;7(4):248–9.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  19. 19.

    Ittisoponpisan S, Islam SA, Khanna T, Alhuzimi E, David A, Sternberg MJE. Can predicted protein 3D structures provide reliable insights into whether missense variants are disease associated? J Mol Biol. 2019;431(11):2197–212.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  20. 20.

    Ponzoni L, Bahar I. Structural dynamics is a determinant of the functional significance of missense variants. Proc Natl Acad Sci U S A. 2018;115(16):4164–9.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  21. 21.

    Sengoku T, Yokoyama S. Structural basis for histone H3 Lys 27 demethylation by UTX/KDM6A. Genes Dev. 2011;25(21):2266–77.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  22. 22.

    Stenson PD, Ball EV, Mort M, Phillips AD, Shiel JA, Thomas NS, et al. Human Gene Mutation Database (HGMD): 2003 update. Hum Mutat. 2003;21(6):577–81.

    CAS  PubMed  Article  Google Scholar 

  23. 23.

    Landrum MJ, Lee JM, Riley GR, Jang W, Rubinstein WS, Church DM, et al. ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res. 2014;42(Database issue):D980–5.

  24. 24.

    Cocciadiferro D, Augello B, De Nittis P, Zhang J, Mandriani B, Malerba N, et al. Dissecting KMT2D missense mutations in Kabuki syndrome patients. Hum Mol Genet. 2018;27(21):3651–68.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  25. 25.

    Lee MG, Villa R, Trojer P, Norman J, Yan KP, Reinberg D, et al. Demethylation of H3K27 regulates polycomb recruitment and H2A ubiquitination. Science. 2007;318(5849):447–50.

    CAS  PubMed  Article  PubMed Central  Google Scholar 

  26. 26.

    Hong S, Cho YW, Yu LR, Yu H, Veenstra TD, Ge K. Identification of JmjC domain-containing UTX and JMJD3 as histone H3 lysine 27 demethylases. Proc Natl Acad Sci U S A. 2007;104(47):18439–44.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  27. 27.

    Ponzoni L, Penaherrera DA, Oltvai ZN, Bahar I. Rhapsody: Predicting the pathogenicity of human missense variants. Bioinformatics. 2020.

  28. 28.

    Ancien F, Pucci F, Godfroid M, Rooman M. Prediction and interpretation of deleterious coding variants in terms of protein structural stability. Sci Rep. 2018;8(1):4480.

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  29. 29.

    Strokach A, Corbi-Verge C, Teyra J, Kim PM. Predicting the effect of mutations on protein folding and protein-protein interactions. Methods Mol Biol. 2019;1851:1–17.

    CAS  PubMed  Article  PubMed Central  Google Scholar 

  30. 30.

    Guerois R, Nielsen JE, Serrano L. Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J Mol Biol. 2002;320(2):369–87.

    CAS  PubMed  Article  PubMed Central  Google Scholar 

  31. 31.

    Emsley P, Cowtan K. Coot: model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr. 2004;60(Pt 12 Pt 1):2126–32.

    PubMed  Article  CAS  PubMed Central  Google Scholar 

  32. 32.

    Gaweska H, Henderson Pozzi M, Schmidt DM, McCafferty DG, Fitzpatrick PF. Use of pH and kinetic isotope effects to establish chemistry as rate-limiting in oxidation of a peptide substrate by LSD1. Biochemistry. 2009;48(23):5440–5.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  33. 33.

    Chakraborty AA, Laukka T, Myllykoski M, Ringel AE, Booker MA, Tolstorukov MY, et al. Histone demethylase KDM6A directly senses oxygen to control chromatin and cell fate. Science. 2019;363(6432):1217–22.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  34. 34.

    Wang L, Zhang M, Alexov E. DelPhiPKa web server: predicting pKa of proteins. RNAs and DNAs Bioinf. 2016;32(4):614–5.

    Article  CAS  Google Scholar 

  35. 35.

    Wang L, Shilatifard A. UTX mutations in human cancer. Cancer Cell. 2019;35(2):168–76.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  36. 36.

    Schulz WA, Lang A, Koch J, Greife A. The histone demethylase UTX/KDM6A in cancer: Progress and puzzles. Int J Cancer. 2019;145(3):614–20.

    CAS  PubMed  Article  Google Scholar 

  37. 37.

    Dulak AM, Stojanov P, Peng S, Lawrence MS, Fox C, Stewart C, et al. Exome and whole-genome sequencing of esophageal adenocarcinoma identifies recurrent driver events and mutational complexity. Nat Genet. 2013;45(5):478–86.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  38. 38.

    Shah SP, Roth A, Goya R, Oloumi A, Ha G, Zhao Y, et al. The clonal and mutational evolution spectrum of primary triple-negative breast cancers. Nature. 2012;486(7403):395–9.

    CAS  PubMed  Article  Google Scholar 

  39. 39.

    Zehir A, Benayed R, Shah RH, Syed A, Middha S, Kim HR, et al. Mutational landscape of metastatic cancer revealed from prospective clinical sequencing of 10,000 patients. Nat Med. 2017;23(6):703–13.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  40. 40.

    Palsgrove DN, Brosnan-Cashman JA, Giannini C, Raghunathan A, Jentoft M, Bettegowda C, et al. Subependymal giant cell astrocytoma-like astrocytoma: a neoplasm with a distinct phenotype and frequent neurofibromatosis type-1-association. Mod Pathol. 2018;31(12):1787–800.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  41. 41.

    Walport LJ, Hopkinson RJ, Vollmar M, Madden SK, Gileadi C, Oppermann U, et al. Human UTY(KDM6C) is a male-specific N-methyl lysyl demethylase. J Biol Chem. 2014;289(26):18302–13.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  42. 42.

    Masso M. All-atom four-body knowledge-based statistical potentials to distinguish native protein structures from nonnative folds. Biomed Res Int. 2017;2017:5760612.

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  43. 43.

    Jenik M, Parra RG, Radusky LG, Turjanski A, Wolynes PG, Ferreiro DU. Protein frustratometer: a tool to localize energetic frustration in protein molecules. Nucleic Acids Res. 2012;40(Web Server issue):W348–51.

  44. 44.

    Seton-Rogers S. Pancreatic cancer: the COMPASS shows the way. Nat Rev Cancer. 2018;18(5):373.

    CAS  PubMed  Article  Google Scholar 

  45. 45.

    Cho YW, Hong T, Hong S, Guo H, Yu H, Kim D, et al. PTIP associates with MLL3- and MLL4-containing histone H3 lysine 4 methyltransferase complex. J Biol Chem. 2007;282(28):20395–406.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  46. 46.

    Webb B, Sali A. Protein structure modeling with MODELLER. Methods Mol Biol. 2017;1654:39–54.

    CAS  PubMed  Article  PubMed Central  Google Scholar 

  47. 47.

    Tang H, Thomas PD. Tools for predicting the functional impact of nonsynonymous genetic variation. Genetics. 2016;203(2):635–47.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  48. 48.

    Grant BJ, Rodrigues AP, ElSawy KM, McCammon JA, Caves LS. Bio3d: an R package for the comparative analysis of protein structures. Bioinformatics. 2006;22(21):2695–6.

    CAS  PubMed  Article  PubMed Central  Google Scholar 

Download references

Acknowledgements

This work has been made possible through the generous support of Harmony 4 Hope, a 501c3 charitable organization. The authors and members of Harmony 4 Hope are grateful to all patient heroes who inspire our personal and scientific lives and our continued commitment to further the research of rare diseases.

Funding

Additional support was provided by the Advancing a Healthier Wisconsin Endowment to the Precision Medicine Simulation Unit of the Genomic Sciences and Precision Medicine Center at the Medical College of Wisconsin (to RU). This work was also in part supported by the NIH National Intitute of Diabetes and Digestive and Kidney Diseases Grant R01DK052913 (to RU and GL) and National Institute of General Medical Sciences Grant R35GM128840 (to BCS).

Author information

Affiliations

Authors

Contributions

RU devised the project, the main conceptual ideas, and proof outline. RU, YC, and TJS designed the computational framework and analyzed the data. MZ developed the in-house workflow for MD simulation data analysis including PCA. TMD, ENL, ST, NRD, AJM, BFV, BCS, DGB, GL, and MZ discussed the results and provided critical feedback and helped shape the research and analysis. YC, TJS, and RU were major contributors in writing the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Raul Urrutia.

Ethics declarations

Ethics approval and consent to participate

No human participants or human tissues involved.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1

. Supplementary information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Chi, YI., Stodola, T.J., De Assuncao, T.M. et al. Molecular mechanics and dynamic simulations of well-known Kabuki syndrome-associated KDM6A variants reveal putative mechanisms of dysfunction. Orphanet J Rare Dis 16, 66 (2021). https://doi.org/10.1186/s13023-021-01692-w

Download citation

Keywords

  • KDM6A
  • Epigenetic regulators
  • Histone demethylase
  • Genomic variation
  • Kabuki syndrome
  • Mutational impact analysis
  • Protein structure
  • Molecular dynamics