Galactosialidosis: review and analysis of CTSA gene mutations

Background: Mutations in the CTSA gene, which encodes the protective protein/cathepsin A (PPCA), lead to the secondary deficiency of β-galactosidase (GLB1) and neuraminidase 1 (NEU1), causing the lysosomal storage disorder galactosialidosis (GS). Few clinical cases of GS have been reported in the literature, the majority of them belonging to the juvenile/adult group of patients.
Methods: The correct nomenclature of mutations for this gene is discussed through the analysis of the three PPCA/CTSA isoforms available in the GenBank database. Phenotype-genotype correlation has been assessed by computational analysis and review of previously reported single amino acid substitutions.
Results: We report the clinical and mutational analyses of four cases with the rare infantile form of GS. We identified three novel nucleotide changes, two of them resulting in the missense mutations c.347A>G (p.His116Arg) and c.775T>C (p.Cys259Arg), and the third, c.1216C>T, resulting in the p.Gln406* stop codon, a type of mutation identified for the first time in GS. An Italian founder effect for the c.114delG mutation is suggested by the origin of the only three patients carrying this mutation, reported here and in the literature.
Conclusions: In early reports, mutation nomenclature was assigned according to any of the three CTSA isoforms, which generated considerable confusion. In order to assist physicians in the interpretation of detected mutations, we indicate the correct nomenclature for CTSA mutations. The complexity of the pathology caused by the multiple functions of CTSA, and the very low number of mutations (only 23 overall) in relation to the length of the CTSA gene, are discussed. In addition, the in silico functional predictions of all reported missense mutations allowed us to closely predict the early infantile, late infantile and juvenile phenotypes, also disclosing different degrees of severity in the juvenile phenotype.


Background
The protective protein/cathepsin A (PPCA or CTSA) is a multifunctional lysosomal enzyme with distinct protective and catalytic functions [1].
Association with PPCA/CTSA in an early biosynthetic compartment ensures the correct lysosomal transport, activation and stability of GLB1 and NEU1 [6]. This defines the enzyme's protective function. On the other hand, studies of the physiological role of PPCA/CTSA as a serine carboxypeptidase/deamidase/esterase demonstrated a role of the enzyme in the inactivation of selected neuropeptides, such as substance P, oxytocin and endothelin I [7]. PPCA/CTSA is also responsible for the proteolytic inactivation of lysosome-associated membrane protein 2a (LAMP2a), a lysosomal integral membrane protein involved in chaperone-mediated autophagy, thus regulating this lysosomal pathway of protein degradation [8].
Mutations in the CTSA gene are the cause of the lysosomal storage disease galactosialidosis (GS). Loss of function of PPCA/CTSA results in the secondary combined deficiency of GLB1 and NEU1, which is the biochemical hallmark of the disease. Patients with GS present with a broad spectrum of clinical manifestations, but are usually classified as early infantile, late infantile or juvenile/adult type based on the age of onset and severity of their symptoms.
The early infantile phenotype, the most severe form of the disease, usually presents with hydrops fetalis, cherry red spots, visceromegaly, psychomotor delay, coarse facies, skeletal dysplasia, and early death. Late infantile forms are characterized by corneal clouding, cardiac involvement, visceromegaly and, rarely, psychomotor retardation. Most patients with the milder juvenile/adult form exhibit myoclonus, ataxia, neurological deterioration, angiokeratoma, and absence of visceromegaly [9,10].
It is still unclear whether the catalytic function of PPCA/CTSA contributes to particular clinical signs of GS. In this respect, impaired LAMP2a degradation due to PPCA/CTSA deficiency may be linked to the low weight of affected individuals [8] and the lack of inactivation of specific bioactive peptides may play a role in the regulation of the blood pressure [7].
In addition, it has been previously suggested that PPCA/CTSA plays a role in elastic fiber assembly, through its association with the enzymatically inactive, spliced variant of β-galactosidase known as the elastin binding protein (EBP). Because EBP acts as an intracellular chaperone for tropoelastin, facilitating the trafficking and deposition of elastic fibers [11], lack of PPCA/CTSA in GS can be accompanied by alterations in elastogenesis, affecting the cardiovascular and respiratory systems [7,12,13]. This is especially relevant for GS patients with a longer survival, who need periodic assessment of their pulmonary function for emphysema, which is linked to a defect in elastic fiber assembly [10].
Because PPCA/CTSA is present in the lysosomal multienzyme complex (LMC), mutations altering the folding of one protein in the complex can influence the other components as well [14,15]. Multiple sequence alignments may predict functional sites or domains that favor intra- or inter-molecular interactions within the LMC. Analogously, structural web applications may predict whether an amino acid substitution is disease-causing or neutral in humans and may suggest the molecular cause of a disease, for instance gain of helical propensity, loss of a phosphorylation site, or disorder-to-order transitions involving Molecular Recognition Features (MoRFs), specific regions of proteins that exhibit molecular binding functions [16].
A total of 23 CTSA gene mutations have been reported (HGMD professional https://portal.biobase-international.com/cgi-bin/portal/login.cgi). These include deletions, missense and splicing mutations, but no nonsense mutations. Alternatively spliced transcripts of the CTSA gene have been described, and at least three CTSA mRNA sequences have been deposited in the GenBank database (RefSeq, http://www.ncbi.nlm.nih.gov/gene/5476). In early reports, mutation nomenclature was assigned according to any of these CTSA isoforms, which generated confusion in the way CTSA mutations were reported. Some mutations were numbered based on a PPCA sequence that was either 46 (missing all the amino acids of the signal peptide) [17-20] or 18 [13,21] amino acids shorter than that reported in the HGMD professional database.
In order to assist physicians in the interpretation of detected mutations, we review and discuss previously reported CTSA transcripts, highlighting erroneous and current nomenclature. We also present the molecular and clinical assessment of four new cases of the rare infantile form of GS. Computational analyses to predict the role of the new and/or previously reported missense mutations are discussed in order to address the clinical and molecular implications of the CTSA defects in GS patients.

Cell culture
Skin fibroblasts from patient 3 and normal controls were cultured in Ham's F-10 medium supplemented with 10% fetal bovine serum and antibiotics. Cell lines from the remaining patients we report are no longer available.

Enzyme assays
The Micro BCA Protein Assay kit (Pierce, Rockford, USA) was used to determine the amount of starting protein used in each enzyme assay. GLB1, NEU1, PPCA and GALNS activities were measured in cell lysates of fibroblasts and/or leukocytes using commercially available fluorogenic substrates (Moscerdam substrates, Netherlands) according to the manufacturer's instructions (www.moscerdam.com). For NEU1 assays, cell lysis, BCA measurements and enzyme assays were performed on fresh samples, lysed by pipetting.

PCR amplification of genomic DNA and informed consents
Genomic DNA was extracted from the patients' fibroblasts and/or lymphocytes using standard methods. The genomic fragments covering all 15 exons and the exon/intron boundaries of the CTSA gene were amplified with a set of primers located in flanking intronic sequences. PCR amplifications were performed under the previously reported conditions [13]. Informed consent for genetic testing was obtained for all analysed patients included in the study.

DNA sequencing
PCR fragments were separated on a 2% agarose gel and the bands were visualized using a UV transilluminator. DNA products were purified by Nucleospin Extract kit (Macherey-Nagel, Düren, Germany), following the manufacturer's protocol. The double-stranded purified products were used for direct sequencing with the same PCR amplification primers. Sequencing reactions were performed using the ABI PRISM 3130 Genetic Analyzer (Applied Biosystems, Foster City, U.S.A.) as recommended by the manufacturer.

Screening of new mutations and in silico analyses
The 1000 Genomes project database (http://www.1000genomes.org/), which includes all human genetic variations from the dbSNP short genetic variations database (http://www.ncbi.nlm.nih.gov/omim), did not report the frequency of the newly presented point mutations. In order to analyse the actual frequency of such variants in the Italian population, the CTSA gene of 60 normal control DNA samples was sequenced. The PCR fragments were amplified with the genomic primers reported earlier [13]. In addition, the single amino acid substitutions were analysed by SIFT (http://sift.jcvi.org/www/SIFT_aligned_seqs_submit.html) and PolyPhen (http://genetics.bwh.harvard.edu/pph/), using multiple sequence alignments of CTSA-related proteins. The functional effect of novel missense mutations on the resulting CTSA enzymes was predicted with the MutPred web site (http://mutpred.mutdb.org/).
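As a rough illustration of what a control panel of this size can resolve, the sketch below computes the one-sided upper confidence bound on the allele frequency of a variant that is hypothetically absent from every control allele. It assumes a diploid autosomal locus (two alleles per individual) and independent sampling; the function name is illustrative, and the calculation is not part of the methods described above.

```python
def allele_frequency_upper_bound(n_individuals: int, confidence: float = 0.95) -> float:
    """Upper bound on allele frequency when a variant is seen in 0 of 2*n alleles.

    Solves (1 - p) ** n_alleles = 1 - confidence for p, i.e. the largest
    frequency still compatible with observing no carriers at the given
    one-sided confidence level.
    """
    if n_individuals <= 0:
        raise ValueError("n_individuals must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    n_alleles = 2 * n_individuals  # assumption: diploid, autosomal gene
    return 1.0 - (1.0 - confidence) ** (1.0 / n_alleles)


if __name__ == "__main__":
    # 60 control individuals, as in the screening described above
    print(f"{allele_frequency_upper_bound(60):.4f}")
```

With 60 controls (120 alleles) the bound is close to 2.5%, so absence from a panel of this size would argue against a common polymorphism but could not exclude a rare one.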

Results
Clinical and biochemical data of the four GS patients reported here
The clinical data and mutation analysis of the new GS patients reported here are summarised in Table 1. All patients showed vacuolated lymphocytes in peripheral blood smears. A marked reduction of GLB1 and absence of NEU1 activity confirmed the diagnosis of GS on fibroblasts and/or lymphocytes (Table 1). Biochemical analyses were performed in different accredited diagnostic laboratories (thus control values are different), but NEU1 activity was nearly absent in all patients, while patients' GLB1 activity ranged from 2 to 19% of normal values. CTSA activity, measured in Pt3 fibroblasts, was also completely absent (normal value 191-482 nmol/mg/min). In contrast, GALNS activity, assayed in both lymphocytes and fibroblasts of Pt3, was within the normal range. Indeed, functional protein association networks such as GeneMania (www.genemania.org/) and String (http://string-db.org/) showed that GALNS interacts at least with NEU1 but not directly with PPCA/CTSA inside the LMC (Figure 1).
Common symptoms in all four patients included facial dysmorphisms, hypotonia, developmental delay, failure to thrive, hepatosplenomegaly and ascites (Table 1). Respiratory distress and cardiac involvement, when investigated, were also observed. Cardiomyopathy was detected in Pt1. Pt3 was intubated for respiratory failure at birth. He also exhibited reduced cardiac contractility and cardiac dilatation with a stiff and dilated inferior vena cava on the second day of life. Pt4 presented with a hypertrophic interventricular septum, which evolved into dilated cardiomyopathy. Skeletal involvement was also noticed in this patient from one week of age. He exhibited rarefaction of the femoral, tibial and heel metaphyses together with horizontalized acetabula. Brain magnetic resonance imaging performed in Pt3 and Pt4 identified alterations in brain structure and blood flow (Table 1).

In silico evaluation of CTSA isoforms and mutation nomenclature
In order to avoid confusion in mutation nomenclature, we critically examine reported CTSA variants (RefSeq, http://www.ncbi.nlm.nih.gov/gene/5476). Transcript variants and their differences are outlined in Figure 2.
Since Variant 1 has been chosen as the reference sequence for the mutation nomenclature in HGMD professional (https://portal.biobase-international.com/cgi-bin/portal/login.cgi), reported mutations should be listed according to the NM_000308.2 CTSA variant (variant 1, isoform a), as in the current report.

Sequencing analysis of the CTSA gene
The results of mutation analyses of the GS patients reported here are summarised in Table 1.
Pt1 and Pt3 are heterozygous for a nonsense mutation and a deletion, respectively; thus their second allelic mutations, p.Val150Met and p.His116Arg respectively, can be linked to the severe early infantile phenotype (Tables 1 and 2).

In silico CTSA functional/structure predictions of missense mutations
In Table 2 we have reviewed the clinical phenotypes and genotypes of previously reported GS patients and included the functional/structural predictions of newly identified as well as previously described CTSA missense mutations.
The clinical features of previously reported homozygous patients allowed for a close genotype-phenotype correlation (Table 2). The p.Tyr413Cys mutation was identified in both early infantile and adult patients. However, in the adult patient it was detected in combination with the c.746+3A>G change, reported as a mild mutation in homozygous adult patients [21,24]. Thus, the p.Tyr413Cys mutation underlies the severe phenotype. The p.Tyr267Asn mutation was detected in a "variant" form of an early infantile patient without neurological involvement [21]. The same mutation was later identified in a patient with the late infantile form of the disease [22] and marginal neurological involvement, if any [9,20]. The p.Gln67Arg mutation was first reported in a juvenile patient [21], in combination with a mild mutation. The early infantile phenotype was linked to this mutation by structure analysis [19], which was confirmed by a subsequent clinical report [25].
Prediction tools indicated that the p.Ser69Val, p.Val83Arg and p.Gly103Val mutations, detected in juvenile patients in combination with mild mutations or reported in non-assigned clinical phenotypes, can be linked to severe outcomes. Coarse facies, hepatosplenomegaly, growth retardation and an unusual renal symptomatology were described in a 9-year-old patient who was compound heterozygous for the p.Gly103Val and p.Arg442Trp mutations [18]. Since this patient had normal neurological development, a diagnosis of late infantile GS can be presumed and is probably linked to the p.Arg442Trp mutation. It was reported earlier that the p.Tyr267Asn mutation correlated with a more severe phenotype than that associated with the p.Phe458Val mutation [22], although both mutations were found in the juvenile/adult form of the disease. This assumption seems to be correct based on previously reported structure prediction [19] (Table 2).

Discussion
GS is a rare lysosomal storage disease with most of the described patients having the juvenile/adult form [9,10]. The prevalence of GS is unknown; more than 100 observations have been reported but only 23 mutations have been identified so far, including point mutations and rearrangements, but no nonsense mutations [9,10]. Here we report the first case of GS carrying a CTSA gene nucleotide change leading to a stop codon. The rarity of this mutation type in the GS patient population is likely attributable to the low incidence of the early infantile form of the disease.
We also describe two new missense mutations: c.347A>G (p.His116Arg) and c.775T>C (p.Cys259Arg). The p.His116Arg mutation was detected in combination with c.114delG, a deletion also found at the homozygous level in Pt4 and previously reported in an unrelated patient originating from the same area of central Italy [13]. Since this mutation has not been reported in the literature in patients of other origins, and given the small number of previously reported mutations, a founder effect for this one-base deletion can be hypothesized.
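The correspondence between the cDNA positions and the affected codons of the three novel changes can be checked with simple arithmetic. The sketch below assumes the standard coding-DNA convention in which c.1 is the A of the ATG start codon; the function name and the regular expression are illustrative and handle only simple substitutions.

```python
import re

# Matches simple coding-DNA substitutions such as "c.347A>G"
CDNA_SUBSTITUTION = re.compile(r"^c\.(\d+)([ACGT])>([ACGT])$")


def codon_of(cdna_change: str) -> tuple[int, int]:
    """Return (codon number, position within codon) for a cDNA substitution.

    Both values are 1-based; position 1 is the first base of the codon.
    """
    match = CDNA_SUBSTITUTION.match(cdna_change)
    if match is None:
        raise ValueError(f"not a simple substitution: {cdna_change!r}")
    position = int(match.group(1))
    codon = (position - 1) // 3 + 1
    base_in_codon = (position - 1) % 3 + 1
    return codon, base_in_codon


if __name__ == "__main__":
    for change in ("c.347A>G", "c.775T>C", "c.1216C>T"):
        print(change, codon_of(change))
```

Under this convention the three changes fall in codons 116, 259 and 406, in agreement with the protein-level names p.His116Arg, p.Cys259Arg and p.Gln406*.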
As mentioned earlier, three different CTSA cDNA transcripts are deposited in the GenBank database, and the past literature on CTSA mutations referred inconsistently to the different isoforms, which has been a source of confusion. For example, the p.Val150Met mutation was first reported as p.Val104Met [22] and then as p.Val132Met [23]. We would like to stress the importance of harmonizing CTSA mutation nomenclature with the HGMD professional guidelines, which use CTSA Variant 1 as the reference sequence for the nomenclature of mutations in the resulting CTSA protein.
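The renumbering in this example amounts to adding a fixed offset to the legacy residue number: 46 residues for the numbering that omits the signal peptide, or 18 residues for the shorter sequence. A minimal sketch of that conversion follows; the scheme labels and function name are hypothetical, and it assumes that a constant offset applies to the residue in question, which should be checked against the reference sequence before use.

```python
import re

# Residues to add to reach variant 1 / isoform a numbering.
# The dictionary keys are hypothetical labels for the two legacy schemes.
LEGACY_OFFSETS = {
    "without_signal_peptide": 46,  # legacy numbering 46 residues shorter
    "shorter_sequence": 18,        # legacy numbering 18 residues shorter
}

# Matches three-letter protein substitutions or stops, e.g. "p.Val104Met"
PROTEIN_CHANGE = re.compile(r"^p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2}|\*)$")


def to_reference_numbering(change: str, scheme: str) -> str:
    """Shift a legacy protein-level change to the reference numbering."""
    match = PROTEIN_CHANGE.match(change)
    if match is None:
        raise ValueError(f"unsupported protein change: {change!r}")
    if scheme not in LEGACY_OFFSETS:
        raise ValueError(f"unknown legacy scheme: {scheme!r}")
    ref_residue, position, new_residue = match.groups()
    shifted = int(position) + LEGACY_OFFSETS[scheme]
    return f"p.{ref_residue}{shifted}{new_residue}"


if __name__ == "__main__":
    print(to_reference_numbering("p.Val104Met", "without_signal_peptide"))
    print(to_reference_numbering("p.Val132Met", "shorter_sequence"))
```

Both legacy names map to p.Val150Met, matching the example given in the text.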
To deepen the correlation between mutations and their effects on the CTSA protein structure we performed computational analyses using in silico tools based on phylogenetic alignments and functional/structural predictions. Multiple alignments of related sequences among organisms and structural web applications help to identify regions or domains that are conserved and may indicate functional constraints.
We found computational analyses to be helpful in improving the determination of the pathogenic effects of newly identified nucleotide variants and of genotype-phenotype correlations. This method has been particularly useful in the case of compound heterozygous mutations reported in mildly affected patients or when clinical data are insufficient, i.e. if expression studies and/or structural analysis of compound heterozygous mutations are not available. The reverse process, from a described phenotype to computational analysis, showed a good correlation between in silico predictions and mutation severity.
We want to emphasize that predictive functional/structural analyses of mutant proteins and phenotypes are closely related, often providing clear-cut indications of the specific form of the disease (early infantile, late infantile, juvenile). We found such analyses useful for predicting the extent of disease severity related to two mutations, p.Tyr267Asn and p.Phe458Val, found in juvenile GS patients. In contrast, phylogenetic comparisons in most cases provide indications that are limited to the pathogenic/non-pathogenic effects of mutations without further details. Thus, the combination of multiple computational analyses is an effective strategy.
CTSA binds and regulates GLB1 and NEU1 inside lysosomes [6]. Structure predictions, identifying mutations that alter Molecular Recognition Features (MoRFs) or result in gain or loss of function in CTSA, could accordingly provide information on the three-dimensional structure of the complex proteins. These findings might provide indications on the pathogenetic effects of mutations and on the interactions between the proteins within the LMC.
Known and predicted protein-protein interactions evaluated by the String and GeneMania prediction software showed that PPCA/CTSA does not directly interact with GALNS, thus corroborating the molecular data showing that the PPCA/CTSA deficit does not affect GALNS activity in Pt3. However, our findings cannot exclude the involvement of GALNS in the LMC after binding to NEU1.
The protective function of PPCA dramatically affects both GLB1 and NEU1, leading to a broad range of clinical manifestations, worsening with age [2]. PPCA defects cause both glycosidases to malfunction; indeed, early infantile forms of GS share clinical signs observed in the infantile forms of both GM1 gangliosidosis and type II sialidosis [19]. In addition, altered catalytic activity of PPCA could contribute to the variability of symptoms due to the potential esterase/deamidase activity of PPCA in platelets, endothelial cells, heart and kidney [3]. A cardiovascular role of PPCA could also be linked to the altered function of EBP due to structural mutations of PPCA as its binding partner at the plasma membrane [11]. Our prediction tools identified only one CTSA amino acid change (p.Val150Met) that could affect its catalytic site. This mutation, detected in combination with the p.Gln406* change, is linked to the early infantile phenotype with fetal hydrops and cardiac involvement.
The CTSA gene, spanning about 43,000 bp and containing 15 exons, is about twice the average length of human genes. However, the number of reported CTSA mutations is very low. The reason why a region or site has a higher or lower mutation rate is poorly understood, except in the case of CpG islands, where cytosine can become methylated and unstable, leading to a higher rate of mutation [26].
The majority of CTSA genetic lesions occur at positions that are evolutionarily conserved [27], and nonvariable sites may indicate protein sequences under stronger selective constraints [28]. Thus, the PPCA/CTSA paradigm could be useful both at the genetic level, for identifying base composition bias around CpG dinucleotides and the distribution and rate of single nucleotide polymorphisms, and at the functional level, for revealing the structural consequences of mutated amino acids and regions.

Conclusions
In early reports, mutation nomenclature was assigned according to any of the three CTSA isoforms, which generated considerable confusion. In order to assist physicians in the interpretation of detected mutations, we underline the correct nomenclature for CTSA mutations.
Four cases with the rare infantile form of galactosialidosis are detailed here and three novel nucleotide changes were identified, one of them resulting in a stop codon, a type of mutation identified for the first time in galactosialidosis.
We also present some data on brain magnetic resonance, never detailed so far in galactosialidosis. Likewise, predictive functional/structural analyses of mutant proteins and phenotypes have been shown to be closely related, often giving clear-cut indications of the specific form of the disease (early infantile, late infantile, juvenile).
The complexity of the clinical phenotypes in GS reflects the dual functions of PPCA/CTSA (catalytic and regulating/protective) and thus its functional role in both lysosomal and cell membranes. Further three-dimensional studies can provide additional information on functional domains, on protein-protein interactions within the lysosomal and the non-lysosomal complexes and on the onset and progression of symptoms.