Novel GANAB variants associated with polycystic liver disease

Background
Polycystic liver disease (PLD) is an inherited disorder characterized by numerous cysts in the liver. Autosomal dominant polycystic kidney disease (ADPKD) and autosomal dominant polycystic liver disease (ADPLD) have both been linked to pathogenic GANAB variants. GANAB encodes the α-subunit of glucosidase II (GIIα). Here, we report the identification, by molecular inversion probe analysis, of novel GANAB variants in an international cohort of patients with PLD as the primary phenotype.

Results
Five novel GANAB variants were identified in a cohort of 625 patients with ADPKD or ADPLD. In silico analysis indicated that these variants are likely to affect functionally important domains of GIIα. Missense variant c.1835G>C p.(Arg612Pro) was predicted to disrupt the structure of the active site of the protein, likely reducing its activity. Frameshift variant c.687delT p.(Asp229Glufs*60) introduces a premature termination codon, and the resulting truncated protein is predicted to have no activity. Two nonsense variants (c.2509C>T p.(Arg837*) and c.2656C>T p.(Arg886*)) and splice variant c.2002+1G>C, which causes aberrant pre-mRNA splicing, result in truncated proteins and are predicted to cause abnormal binding of the α- and β-subunits of glucosidase II, thus affecting its enzymatic activity. Analysis of glucosidase II subunits in cell lines showed expression of a truncated GIIα protein in cells with the c.687delT, c.2509C>T, c.2656C>T, and c.2002+1G>C variants. Incomplete colocalization of the subunits was present in cells with the c.687delT or c.2002+1G>C variant. The other variants showed normal distribution of GIIα protein.

Conclusions
We identified five novel GANAB variants associated with PLD in both ADPKD and ADPLD patients, supporting a common pathway in cystogenesis. These variants may lead to decreased or complete loss of enzymatic activity of glucosidase II, which makes GANAB a candidate gene to be screened in patients with an unknown genetic background.

disorders, they share PLD as a major phenotypic feature. Recent evidence suggests that the involved proteins interact, with decreased levels of functional polycystin-1 (PC1), encoded by PKD1, as the central element for cyst development [6,7].
With the exception of LRP5 (a transmembrane protein that is part of the LRP5/LRP6/Frizzled co-receptor complex [8,9]), all genes involved in ADPLD encode proteins involved in endoplasmic reticulum trafficking and quality control of glycoproteins. These genes include PRKCSH, SEC63, GANAB, ALG8, SEC61B, DNAJB11, and ALG9 [10]. Central to the pathomechanism of ADPLD is glucosidase II, an endoplasmic reticulum (ER) resident N-linked glycan-processing enzyme. Glucosidase II (GII) is a complex of a catalytic α-subunit (GIIα), encoded by GANAB, and a regulatory β-subunit (GIIβ), encoded by PRKCSH. The association with GIIβ contributes to the localization and enzymatic activity of GIIα [11]. Glucosidase II catalyzes the two-step hydrolysis of α1,3-linked glucose-glucose and glucose-mannose residues of high-mannose-type glycans to generate a quality control protein tag on glycoproteins that is recognized by ER chaperones [12]. As such, it acts as a major partner in glycoprotein processing and quality control in the ER [13].
The concept that genetic defects disrupting different subunits of a protein complex result in a very similar phenotype ("guilt by association") is well accepted [14]. While mutations in GIIβ have been associated with ADPLD, the phenotypic picture for GIIα is less clear. Most studies have linked GANAB to ADPKD, and renal cysts were present in most patients [15]. We set out to explore the incidence of GANAB mutations in a large, independent cohort of patients with PLD as the dominant phenotype, and to evaluate the effect of GANAB mutations on protein structure and glucosidase II subunit binding.

Patient characteristics
For our molecular inversion probe (MIP) analysis we included a total of 625 patients with ADPKD or ADPLD. Patients were referred to our center for genetic analysis or treatment of PLD by hospitals from the Netherlands, Belgium, Spain, and Denmark. All patients had the primary phenotype of polycystic liver disease (PLD). Patients were diagnosed with ADPKD or ADPLD based on clinical presentation: 90 patients were diagnosed with ADPKD and 535 patients with ADPLD. The 625 analysed patients included 62 members of 17 families. Patients had not been genetically screened previously.
Of the eight patients with likely pathogenic heterozygous GANAB variants, seven patients presented with ADPLD and three patients had 1-3 kidney cysts, while in the remaining patients no kidney cysts could be detected. One patient was diagnosed with ADPKD. The hepatic phenotype did not differ between patients affected by ADPLD or ADPKD. Six out of eight patients were female and the mean age was 56 (range 31-79) years. Clinical presentation of patients with GANAB variants can be found in Table 1.

Identification of GANAB variants in PLD-affected individuals and families
MIP analysis identified 38 variants in 32 patients, of which 13 heterozygous GANAB variants are possibly pathogenic. We identified one frameshift, one in-frame deletion, one splice site, two nonsense, and eight missense variants that were all validated by Sanger sequencing (Fig. 1, Additional File 1). Notably, none of these variants have been reported before to cause ADPLD or ADPKD [15]. Moreover, registry data show that in-frame deletion c.11_16delTAGCGG (rs750723025) has a global frequency of 9/249,084 exomes (MAF = 0.000036; GnomAD) and is seen in

Segregation analysis of GANAB variants
To bolster the causal relationship between GANAB variants identified by MIP analysis and clinical presentation of PLD, we tested DNA of family members for GANAB variants. We confirmed that frameshift variant c.687delT co-segregated with the disease in family 11-0741. This ADPLD family presented with multiple liver cysts but without renal cysts. The co-segregation suggests that the variant is likely pathogenic.
For the patients with splice variant (c.2002+1G>C), and nonsense variants (c.2509C>T and c.2656C>T) no DNA of family members was available for segregation analysis. Due to the predicted severe nature of these variants we included them in our final group of likely pathogenic variants.
We analyzed two missense variants predicted to be deleterious (c.1852C>T and c.2006A>G) for segregation in affected family members. Missense variant c.1852C>T was present in the affected individual and her unaffected daughter. We did not identify a pathogenic GANAB variant in DNA derived from the affected brother of the affected individual, nor were mutations present in other known ADPLD genes. Similarly, for the individual with the c.2006A>G mutation, both affected and unaffected family members carried the GANAB variant. This suggests that neither c.1852C>T nor c.2006A>G is causative for PLD in these families.
For the remaining two missense variants predicted to be deleterious (c.38G>A, c.1835G>C), no family members were available for further analysis. However, variant c.1835G>C p.(Arg612Pro) was reported in two unrelated individuals and is therefore included in the group of likely pathogenic variants.
Three missense variants (c.323C>T, c.1607A>C, c.1883C>G) are predicted to be probably deleterious, whereas a fourth variant (c.2702C>T) is predicted not to be deleterious by in silico prediction tools. These four variants were discarded as bona fide mutations because no family members were available and because of the results of the prediction tools.
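The segregation reasoning applied in this section can be illustrated with a minimal Python sketch. It is not the authors' procedure: the class `FamilyMember`, the function `segregation_status`, and the member labels are hypothetical, and the rule is deliberately strict, treating any unaffected carrier or affected non-carrier as evidence against segregation (it ignores, for example, age-dependent penetrance).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FamilyMember:
    label: str      # hypothetical identifier, not taken from the study
    affected: bool  # clinical PLD status
    carrier: bool   # variant confirmed by Sanger sequencing


def segregation_status(members: list[FamilyMember]) -> str:
    """Return 'segregating', 'non-segregating', or 'uninformative'."""
    if len(members) < 2:
        # A single tested individual cannot show segregation either way.
        return "uninformative"
    affected_noncarrier = any(m.affected and not m.carrier for m in members)
    unaffected_carrier = any(m.carrier and not m.affected for m in members)
    if affected_noncarrier or unaffected_carrier:
        return "non-segregating"
    return "segregating"


# Pattern modelled on the description of c.1852C>T in the text:
# affected proband (carrier), unaffected daughter (carrier),
# affected brother (non-carrier).
family = [
    FamilyMember("proband", affected=True, carrier=True),
    FamilyMember("daughter", affected=False, carrier=True),
    FamilyMember("brother", affected=True, carrier=False),
]
print(segregation_status(family))
```

Under this rule the c.1852C>T pattern is classified as non-segregating, while a proband without tested relatives, as for c.38G>A, is uninformative.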

MIP analysis of PRKCSH, SEC63 and PKD2
We analyzed all patients with a validated likely pathogenic GANAB variant for a pathogenic sequence variation in PRKCSH, SEC63 and PKD2. None of the individuals who carried a bona fide pathogenic GANAB variant was trans-heterozygous for any of the other three PLD-associated genes. In the family with the nonsegregating GANAB variant c.2006A>G, we identified the PRKCSH variant c.841C>T p.(Arg281Trp), which has previously been described to cause ADPLD [16].

[Figure caption fragment, likely Fig. 1: 13 of these variants were validated using Sanger sequencing. The non-validated variants were excluded from further analysis. Of the validated variants, 7 were excluded because these were non-segregating, not predicted to be deleterious by in silico prediction tools, or no family members were available for further co-segregation analysis. 6 variants were included as definitive pathogenic GANAB variants.]

van de Laarschot et al. Orphanet J Rare Dis (2020) 15:302

Pathogenic GANAB variants influence glucosidase II structure in silico
In order to predict the functional consequences of the identified GANAB variants, we evaluated their effects on the glucosidase II protein complex structure in silico. The relationship between the GANAB variants, GII function, and the interaction of the GII subunits was studied using 3D homology modelling of the GII protein complex. Structural changes caused by the GANAB variants were predicted and visualized. The complete protein structure and interaction sites of the two subunits of human GII have only recently been unraveled [12,17]. The GII protein complex consists of the 110 kDa catalytic α-subunit and the 60 kDa regulatory β-subunit. GIIα is composed of four major domains and three subdomains. The active site of GIIα is located in the β8α8 barrel domain [12]. The distal C-terminal domain of GIIα is primarily involved in binding GIIβ through the N-terminal GIIα-binding domain of GIIβ [17]. The frameshift variant (c.687delT) is located in the N-terminal domain. Modelling predicts that this variant introduces a premature termination codon (PTC) 60 codons downstream, counting from the first altered residue (p.Asp229Glufs*60), leading to a truncated protein and thus limiting functional protein expression. The in-frame deletion p.Val4_Ala5del (c.11_16delTAGCGG), located in the signal sequence, is predicted to result in degradation of the protein, as the protein could not be transported and could not mature. The amino acid substitution p.Arg612Pro in the β8α8 barrel domain is the consequence of missense variant c.1835G>C. This affects the structure of the domain carrying the active site of GIIα and thus is predicted to reduce its activity. The two nonsense variants p.Arg837* (c.2509C>T) and p.Arg886* (c.2656C>T) result in a truncation of the distal C-terminal domain (Fig. 2). The splice variant (c.2002+1G>C) is located in a splice site region in the distal part of the gene and is predicted to result in inaccurate pre-mRNA splicing. Most likely this leads to a PTC, resulting in a truncated protein.
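The correspondence between the coding-DNA positions and the protein positions quoted above can be checked with simple arithmetic. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes standard coding-DNA numbering (the A of the ATG start codon is position 1) and the HGVS convention that, in a description such as fs*60, the first changed residue is counted as position 1.

```python
def codon_number(cdna_pos: int) -> int:
    """Codon containing a coding-DNA position (c. numbering, A of ATG = 1)."""
    if cdna_pos < 1:
        raise ValueError("coding positions start at 1")
    return (cdna_pos + 2) // 3  # integer form of ceil(cdna_pos / 3)


def frameshift_stop_codon(first_changed: int, fs_length: int) -> int:
    """Codon number of the new stop for a p.(...fs*K) description.

    Assumes the first changed residue counts as position 1 of the new frame.
    """
    return first_changed + fs_length - 1


# Coding positions taken from the exonic variants described in the text.
for hgvs_c, pos in [("c.687delT", 687), ("c.1835G>C", 1835),
                    ("c.2509C>T", 2509), ("c.2656C>T", 2656)]:
    print(hgvs_c, "-> codon", codon_number(pos))

print("new stop codon:", frameshift_stop_codon(229, 60))
```

The four positions map to codons 229, 612, 837, and 886, matching p.Asp229, p.Arg612, p.Arg837, and p.Arg886. Under the stated convention, the frameshift places the new stop at codon 288.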

Discussion
Here we describe five novel GANAB variants that can be linked to the presence of PLD [15]. All patients presented with severe PLD. Three ADPLD patients had 1-3 nonpathogenic kidney cysts, which is in concordance with the prevalence of kidney cysts in the general population [18]. These variants can cause PLD in patients diagnosed with ADPLD or ADPKD. Seven out of eight patients with a confirmed pathogenic GANAB variant in our cohort were diagnosed with ADPLD; the remaining patient had ADPKD.
Our in silico analysis suggests that the amino acid substitution in the β8α8 barrel domain, p.(Arg612Pro), probably results in reduced GIIα enzymatic activity because the active site domain of the subunit is predicted to be disrupted. The variants c.2656C>T p.(Arg886*) and c.2509C>T p.(Arg837*) introduce a PTC that would lead to truncation of the C-terminal domain of GIIα. This could impair interaction with GIIβ and consequently decrease enzymatic activity of GIIα [17,19]. This is in line with an interaction study showing that the Arg834Ala/Arg835Ala mutation impairs interaction between the GIIα and GIIβ subunits [17]. Furthermore, a genetic study identified the missense mutation p.Arg839Trp in the distal C-terminal domain as a cause of ADPKD in one family [15]. ER localization of GIIα is mediated by GIIβ [20]. Our immunofluorescence assay did not show altered subcellular localization of GIIα for these variants. This would imply that variants c.2656C>T and c.2509C>T do not impair GIIα and GIIβ co-localization but may still affect enzymatic activity of GIIα.
The frameshift variant c.687delT p.(Asp229Glufs*60) and the splice variant c.2002+1G>C are likely to result in truncated proteins lacking the C-terminal domain of GIIα and may trigger nonsense-mediated mRNA decay, which would reduce the level of translated protein in the ER. However, levels of truncated protein in transiently transfected cells were not reduced compared to the wild type, suggesting that the PTCs did not trigger nonsense-mediated mRNA decay. Because transient transfection can influence splicing and normal cellular processes through protein overexpression, we also stably transfected the variants. In stably transfected cells a reduced level of truncated GANAB-INT protein was observed, which possibly indicates partial, but not complete, nonsense-mediated mRNA decay. The immunofluorescence assay shows that neither variant fully co-localizes with GIIβ, indicating incomplete interaction, which would reduce the enzymatic activity of the GII complex.
Our study has some limitations. It yielded a relatively high number of non-validated GANAB variants, because all variants with a quality of depth of 500 or higher were included. This lower quality setting was chosen in order not to overlook any bona fide pathogenic variants in GANAB or the other three PLD genes. All cell studies were performed in HEK293 and HeLa cell lines, which are not specifically used as PLD models. Since the aim of our study was not to model PLD but to study protein expression and localization of the GII protein complex, these cell lines were chosen for their expression of both PRKCSH and GANAB. Cholangiocyte cell lines such as H69 or SkCHA-1 may be more closely related to the affected cells in PLD patients, but have a low transfection rate and were thus deemed less suitable for our studies.
Overall, most identified variants will likely lead to decreased or complete loss of functional glucosidase II. Trimming of N-glycan glucose residues by glucosidase II is necessary for recognition by ER-resident chaperone proteins calnexin and calreticulin, and subsequent protein folding and secretion [21]. Therefore, functionality of glucosidase II is essential for correct protein maturation. We can hypothesize that defective glucosidase II directly affects polycystin-1 folding and maturation, and in this way causes cyst formation in the liver and/or kidney. However, as glucosidase II-driven protein maturation is not limited to polycystin-1, the absence of disease symptoms due to other disrupted cellular functions or organs remains unexplained. Also, it remains unclear what factors contribute to the differential expression of cysts in the liver and/or kidney. This should be investigated in future studies.

Conclusions
Our findings have two important implications. First, GANAB is the only gene that has been shown to cause polycystic liver disease in patients with ADPLD or ADPKD. This finding supports the standing hypothesis of a common pathway of hepatic cystogenesis in patients with ADPLD and ADPKD [22]. In vitro studies showed that mutations in genes associated with ADPLD result in defective maturation and trafficking of polycystin-1 [7,15]. The level of functional polycystin-1 at the cilium is postulated to be central in the development of both hepatic and kidney cysts [10]. If the level of functional polycystin-1 drops below a critical threshold, cyst development initiates. Furthermore, a more severe reduction of polycystin-1 function causes more severe cystic disease [6,23]. Maturation of polycystin-1 is dependent on polycystin-2 chaperone function [24]. Moreover, the abundance of polycystin-2 is determined by N-glycosylation mediated by glucosidase II [25].
Second, our findings have some implications for clinical practice. Our cohort contains a large number of unrelated patients from different countries in Europe, which indicates that our results are a good reflection of the prevalence of GANAB variants in the European PLD patient population. In our study population the detection rate of bona fide pathogenic GANAB variants is ~1%. Although this detection rate in our total group of PLD-affected patients is low, it is similar to the detection rate of the ADPLD-associated gene LRP5 [9]. We suggest that genetic screening should include a screen for GANAB variants in ADPLD and ADPKD patients with unknown genetic background.
To conclude, we identified five novel GANAB variants associated with PLD in ADPKD and ADPLD patients. These variants may affect functionally important domains of GIIα and lead to decreased or complete loss of enzymatic activity of glucosidase II.

Sample and data collection
We collected clinical data and biomaterials from PLD patients in our PLD registry [26]. All participants of the PLD cohort provided informed consent for DNA analysis related to PLD studies. Blood samples were collected from patients and all available family members. Genomic DNA was isolated from the blood samples by standard methods with High Pure PCR Template Kit (Roche, Basel, Switzerland). Clinical and imaging data were obtained through review of clinical records.

Editorial policies and ethical considerations
This study was performed in line with the principles of the Declaration of Helsinki. Approval was granted by the Ethics Committee of the Radboud University Medical Center (CMO regio Arnhem-Nijmegen; 2001-218) and all participants gave informed consent.

Molecular inversion probe analysis
We used molecular inversion probe (MIP) analysis for DNA analysis of GANAB (RefSeq NM_198335), PRKCSH (RefSeq NM_001289104), SEC63 (RefSeq NM_007214), and PKD2 (RefSeq NM_000297). PRKCSH, SEC63, and PKD2 were included in this analysis to exclude that PLD was caused by any of the known genes in these individuals. Due to technical difficulties caused by genomic duplicates in the first 34 exons, PKD1 was not included in the final MIP library. The final pooled MIP library comprised 174 probes (GANAB n = 54, PRKCSH n = 33, SEC63 n = 45, PKD2 n = 42). The average coverage per probe was GANAB 1277, PRKCSH 263, SEC63 1369, PKD2 759. Capture regions were sequenced in two runs using 112-bp reads on a NextSeq500 sequencer (Illumina). MIP data variants were identified as described before [27]. Briefly, reads were aligned by Burrows-Wheeler Aligner to the reference genome. After MIP extension and ligation arms were removed from all the alignment files, reads were trimmed to remove overlap between the paired-ends. The identified variants were filtered on gene component (exon, acceptor site canonical, splice donor site canonical), synonymous (false), and zygosity (heterozygote) to exclude all false positive and non-changing variants. Variants that passed all filtering steps were validated using Sanger sequencing.
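The three filtering criteria described above (gene component, synonymous status, zygosity) can be sketched as a simple predicate. This Python example is an illustration under stated assumptions, not the pipeline used in the study: the `VariantCall` record, its field names, the component labels, and the two placeholder calls marked as hypothetical are all invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantCall:
    gene: str
    hgvs_c: str
    component: str    # hypothetical labels, e.g. "exon", "intron"
    synonymous: bool
    zygosity: str     # "heterozygote" or "homozygote"


# Gene components retained by the filter described in the text.
KEPT_COMPONENTS = {"exon", "splice_acceptor_canonical", "splice_donor_canonical"}


def passes_filters(v: VariantCall) -> bool:
    """Keep exonic or canonical splice-site, non-synonymous, heterozygous calls."""
    return (
        v.component in KEPT_COMPONENTS
        and not v.synonymous
        and v.zygosity == "heterozygote"
    )


calls = [
    VariantCall("GANAB", "c.1835G>C", "exon", False, "heterozygote"),
    VariantCall("GANAB", "c.2002+1G>C", "splice_donor_canonical", False, "heterozygote"),
    # The two calls below are hypothetical placeholders, not study data.
    VariantCall("GANAB", "hypothetical-synonymous", "exon", True, "heterozygote"),
    VariantCall("GANAB", "hypothetical-deep-intronic", "intron", False, "heterozygote"),
]

kept = [v.hgvs_c for v in calls if passes_filters(v)]
print(kept)
```

Only the missense and canonical splice-donor calls remain in `kept`. In the workflow described here, calls passing such filters would then proceed to Sanger validation.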
Variants found with MIP analysis were compared to the obtained Sanger sequences with Alamut Visual (v2.7). The functional significance of novel mutations was assessed with bioinformatics prediction tools (PolyPhen-2, Align GVGD, SIFT, MutationTaster). Sanger sequencing was performed in all available family members for those patients with validated mutations. Validated variants were submitted to the ClinVar database.

In Silico analysis and homology modeling
Homology models of the human GIIα were created using the experimentally solved 3D-structure of the murine GIIα as a template (Protein Data Bank file 5F0E) [28]. The human and mouse sequences show 92% sequence identity over 851 amino acids. The models were created using the automatic modelling script with standard parameters of the YASARA & WHAT IF Twinset [29,30]. The resulting model is complete, except for the N-terminal signal peptide.
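A sequence identity figure such as the one quoted above is typically derived from a pairwise alignment. The sketch below shows one common convention in Python, counting identical residues over aligned, non-gap columns. It is an assumption that this convention matches the one behind the reported value, since the text does not state how identity was computed; the function name and the toy peptide strings are hypothetical and are not GIIα sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gap columns are excluded under this convention
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


# Toy, hypothetical aligned peptides with one mismatch in ten columns.
print(percent_identity("MKTAYIAKQR", "MKTAYLAKQR"))
```

For scale, 92% identity over 851 aligned residues corresponds to roughly 783 identical positions.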

GANAB constructs
GANAB expression clones were generated using the Gateway Cloning system (Thermo Scientific, Carlsbad, CA). An entry clone encoding full-length human GANAB (2901 bp) was generated from cDNA of liver tissue. Total RNA was isolated from liver tissue using TRIzol Reagent (Thermo Scientific), and cDNA was generated with the RT Transcriptor First Strand cDNA synthesis kit (Roche Applied Sciences, Mannheim, Germany) using a GANAB-specific reverse primer. Full-length GANAB-WT fragments were obtained with Platinum™ SuperFi™ DNA Polymerase (Thermo Scientific) using forward and reverse primers including the attB1 and attB2 sequences. The GANAB-WT fragment was then cloned into the Gateway entry vector pDONR201. The entry clones of GANAB including wild type and mutant intron 17 were generated by replacing a BglI/KpnI fragment of pDONR201GANAB-WT with a fragment produced by PCR on genomic DNA of a patient with a heterozygous intronic mutation, using an exon 14 forward and an exon 18 reverse primer, followed by digestion with BglI and KpnI. All GANAB entry clones were subsequently cloned into the GW331 N-terminal FLAG expression vector. GANAB constructs harboring exonic variants were generated using the QuikChange II XL Site-Directed Mutagenesis Kit (Agilent Technologies, Santa Clara, USA) with GW331GANAB-WT as a template. Sequences of all constructs were confirmed by Sanger sequencing.

Transfection of cells
HEK293 and HeLa cells were cultured in DMEM supplemented with 10% FCS, 1% NEAA, 50 µg/mL Gentamicin, and 10 mM HEPES (all Thermo Scientific) at 37 °C and 5% CO2 in a humidified incubator. For DNA transfections, HEK293 and HeLa cells were seeded in a 6-well plate (500,000 cells per well) for Western blotting, and HeLa cells were also seeded in a 24-well plate containing poly-L-lysine-coated glass coverslips (100,000 cells per well) for immunofluorescent staining. The following day cells were transfected with GW331GANAB constructs using FuGene HD (Promega, Madison, WI). For stable expression of GIIα, the normal medium was replaced after 2 days by medium containing 500 µg/mL G418 (Thermo Scientific). After 2 weeks resistant clones were transferred to a new well and tested for stable expression of GIIα by Western blotting.

Western blot analysis
Cells were lysed on ice 72 h after transfection using lysis buffer (50 mM Tris-HCl (pH 7.5), 150 mM NaCl, 1% NP-40, protease inhibitor tablet (Roche)). After adding sample buffer to the lysed cells, samples were heated for 5 min at 95 °C. Samples were run on a 10% SDS-PAGE gel and transferred semi-dry onto a nitrocellulose membrane. Immunostaining was performed with anti-Flag as primary antibody and swine anti-rabbit HRP as a secondary antibody (DAKO, Glostrup, Denmark). Proteins were visualized using Clarity Western ECL Substrate (Biorad) and the Proxima C18 imaging system (Isogen, de Maarn, Netherlands). Western blots were quantified using Totallab quant (Totallab, Newcastle-Upon-Tyne, United Kingdom).

RNA isolation and RT-qPCR
Total RNA was isolated using TRIzol according to the manufacturer's protocol. We reverse transcribed RNA into complementary DNA (cDNA) using the iScript cDNA synthesis kit according to the protocol (Biorad). 1 mL of resulting cDNA was used for RT-qPCR. Briefly, the RT-qPCR was carried out on a thermal cycler (CFX96, Biorad), using the 2^-ΔΔCt SYBR green protocol. We amplified GANAB and beta actin (reference) in 40 cycles. RT-qPCR data were analyzed using the CFX-Manager software, which validates primer quality by analyzing melting curves. All RT-qPCRs were performed in triplicate for each sample.
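The relative quantification named above can be illustrated with a short Python sketch of the 2^-ΔΔCt calculation. It assumes approximately 100% amplification efficiency for both the target (GANAB) and the reference (beta actin) and averages the technical triplicates before taking differences. The function name and all Ct values are hypothetical, chosen only to make the arithmetic easy to follow, and this is not the CFX-Manager implementation.

```python
from statistics import mean


def relative_expression(target_ct_sample, ref_ct_sample,
                        target_ct_control, ref_ct_control) -> float:
    """Fold change of the target in sample versus control by 2^-ddCt."""
    # dCt: target Ct normalised to the reference gene within each condition.
    dct_sample = mean(target_ct_sample) - mean(ref_ct_sample)
    dct_control = mean(target_ct_control) - mean(ref_ct_control)
    # ddCt: normalised sample value relative to the control condition.
    ddct = dct_sample - dct_control
    return 2.0 ** (-ddct)


# Hypothetical triplicate Ct values (not measured data).
fold = relative_expression(
    target_ct_sample=[26.0, 26.5, 25.5],   # GANAB, variant-expressing cells
    ref_ct_sample=[18.0, 18.25, 17.75],    # beta actin, same cells
    target_ct_control=[25.0, 25.5, 24.5],  # GANAB, wild-type cells
    ref_ct_control=[18.0, 18.0, 18.0],     # beta actin, wild-type cells
)
print(fold)
```

With these toy values ΔΔCt is 1, so the target is present at half the control level (fold change 0.5). A one-cycle delay corresponds to a twofold reduction only under the efficiency assumption stated above.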