Mutational screening of the USH2A gene in Spanish USH patients reveals 23 novel pathogenic mutations

Background Usher syndrome type II (USH2) is an autosomal recessive disorder characterized by moderate to severe hearing impairment and retinitis pigmentosa (RP). Among the three genes implicated, mutations in the USH2A gene account for 74-90% of USH2 cases. Methods To identify the genetic cause of the disease and determine the frequency of USH2A mutations in a cohort of 88 unrelated Spanish USH patients, we carried out a mutation screening of the 72 coding exons of this gene by direct sequencing. Moreover, we performed functional minigene studies for those changes that were predicted to affect splicing. Results A total of 144 DNA sequence variants were identified. Based upon previous studies, allele frequencies, segregation analysis, bioinformatic predictions and in vitro experiments, 37 variants (23 of them novel) were classified as pathogenic mutations. Conclusions This report provides a wide spectrum of USH2A mutations and clinical features, including atypical Usher syndrome phenotypes resembling Usher syndrome type I. Considering only the patients clearly diagnosed with Usher syndrome type II, and the results obtained in this and previous studies, we can state that mutations in USH2A are responsible for 76.1% of USH2 disease in patients of Spanish origin.


Usher syndrome
Usher syndrome (USH) is an autosomal recessive disease characterized by hearing loss, retinitis pigmentosa (RP), and, in some cases, vestibular dysfunction. It is clinically and genetically heterogeneous and is the most common cause underlying deafness and blindness of genetic origin. Clinically, USH is divided into three types. Usher type I (USH1) is the most severe form and is characterized by severe to profound congenital deafness, vestibular areflexia, and prepubertal onset of progressive RP. Type II (USH2) displays moderate to severe hearing loss, absence of vestibular dysfunction, and later onset of retinal degeneration. Type III (USH3) shows progressive postlingual hearing loss, variable onset of RP, and variable vestibular response. To date, five USH1 genes have been identified: MYO7A (USH1B), CDH23 (USH1D), PCDH15 (USH1F), USH1C(USH1C), and USH1G(USH1G). Three genes are involved in USH2, namely, USH2A (USH2A), GPR98 (USH2C), and DFNB31 (USH2D). USH3 is rare except in certain populations, and the gene responsible for this type is USH3A.

Background
Usher syndrome (USH) is an autosomal recessive disease characterized by the association of hearing loss and visual impairment due to retinitis pigmentosa (RP), with or without vestibular dysfunction. It is the most frequent cause of concurrent deafness and blindness of genetic origin and its general prevalence ranges from 3.3 to 6.4 per 100,000 live births [1]. In Spain, the estimated prevalence is 4.2/100,000 [2].
USH is clinically and genetically heterogeneous. Three clinical forms are distinguished (USH1, USH2 and USH3) and nine responsible genes have been identified so far. Five causative genes have been reported for USH1: MYO7A, USH1C, CDH23, PCDH15 and USH1G. Three genes have been reported for USH2: USH2A, GPR98 and DFNB31. Only one gene has been described for USH3: USH3A [3,4].
USH2 appears to be the most common clinical form of the disorder, accounting for more than 50% of all Usher cases [5,6]. Among the three genes described for USH2, USH2A is the most commonly mutated. It is responsible for approximately 74-90% of USH2 cases [2,7]. Mutations in USH2A are also responsible for atypical Usher syndrome and recessive non-syndromic RP [8,9]. The USH2A gene, located on chromosome 1q41 [10], was initially described as comprising 21 exons, encoding a protein of 1546 amino acids [11,12]. However, van Wijk et al. (2004) identified 51 additional exons at the 3' end of USH2A [13]. The longest transcript consists of 72 exons, encoding a protein of 5202 amino acids. In addition, Adato et al. (2005) identified an alternatively spliced exon 71 in mouse transcripts, expressed in the inner ear and well conserved in vertebrates [14]. The long isoform b contains a transmembrane region followed by an intracellular domain with a PDZ-binding motif, which interacts with the PDZ domains of harmonin and whirlin, integrating USH2A into the USH protein network [14,15].
In the present study we have performed an exhaustive mutational screening of the long isoform b of USH2A to identify new patients with mutations in this gene, and to detect the second mutation in patients with one previously detected USH2A mutation. Some cases had previously been studied for exon 13 or for the first 21 USH2A exons [19,9], or analyzed using the genotyping microarray for Usher syndrome (Asper Biotech, Tartu, Estonia; [32]). Furthermore, we have used in silico and in vitro analyses to evaluate the functional consequences of several nucleotide changes on gene expression and protein function.

Subjects
Eighty-eight (88) unrelated Spanish patients diagnosed with Usher syndrome were included in this study. They were recruited from the Federación de Afectados de Retinosis Pigmentaria de España (FARPE) and from the Ophthalmology and ENT Services of several Spanish hospitals as part of a large-scale study on the genetics of Usher syndrome in the Spanish population.
On the basis of their clinical history and ophthalmologic, audiological, neurophysiological and vestibular tests, 58 of these families were clinically classified as USH2 while 11 displayed atypical Usher syndrome. Detailed clinical data could not be obtained for 19 patients, who remained non-classified (USHNC).
Previously, 40 of these 88 patients were studied for exon 13, while 24 were analysed for the first 21 exons of USH2A and 42 were analyzed with the genotyping microarray for Usher syndrome (Asper Biotech, Tartu, Estonia). At that time, the version of the array detected 429 previously described mutations in eight of the nine genes reported for the disease. As a result of these previous analyses, eighteen of them were found to carry one mutated allele, but the second mutation could not be detected. These mutations have been included in the statistical summaries presented herein. These patients were subjected to mutation screening of the exons that had not been analyzed. In the remaining patients, we carried out the study of exons 2-72 (including the alternatively spliced exon 71).
When DNA samples from patients' relatives were available, we carried out a segregation analysis.
One hundred unrelated individuals of Spanish origin without hearing loss or RP family history were screened as controls to evaluate the frequency of the mutations found in the patient sample.

Mutation analysis
Genomic DNA from patients and controls was extracted from peripheral blood samples following standard protocols. The coding exons and flanking intronic sequences of USH2A were amplified by PCR using primers and conditions described by Aller et al. (2004) [9,23]. The amplified DNA fragments were analysed by direct sequencing using the Big Dye Terminator v.3.1 kit (Applied Biosystems, Carlsbad, CA), and purified sequencing reactions were analysed in an ABI PRISM 3730 DNA analyzer (Applied Biosystems, Carlsbad, CA).
The obtained sequences were compared with the consensus sequence NM_206933.2. The +1 position corresponds to A in the ATG translation initiation codon.

Predictions of the pathogenic effect of missense variations
To predict whether a rare missense variant is deleterious, we used the combined results of three different computer algorithms:
-Sort Intolerant From Tolerant (SIFT) (available at http://sift.jcvi.org) uses sequence homology to predict whether a change is tolerated or deleterious.
-The polymorphism phenotyping program, PolyPhen (available at http://genetics.bwh.harvard.edu/pph/) uses sequence conservation, structure and SWISS-PROT annotation to characterize an amino acid substitution as benign, possibly deleterious or probably deleterious.
-PMUT uses a neural network whose output classifies a substitution as pathogenic if it is greater than 0.5 and as neutral otherwise.

Minigene constructions and expression
Minigene constructs were generated using the exon trapping expression vector pSPL3. For each mutation, the exon and intronic flanking sequences were amplified from the patient's DNA using the High Fidelity Phusion polymerase (Finnzymes, Espoo, Finland). Amplicons were inserted between the XhoI/NheI and XhoI/BamHI restriction sites for the variants p.E2496E and p.V382M, respectively, using T4 DNA ligase (Invitrogen Corporation, Carlsbad, CA). The p.V382M mutation was generated by site-directed mutagenesis. All vectors were confirmed by direct sequencing. The minigene constructs were transfected into COS-7 cells as described before [33]. RNA extraction and RT-PCR analysis were performed as previously described [34,31]. Missplicing percentages were measured using the Alpha Imager 2200 (version 3.1.2) software (AlphaInnotech Corporation, San Francisco, CA, USA).

Results
The molecular analysis of the USH2A gene in 88 unrelated Spanish USH patients revealed 37 different pathogenic mutations, 23 of which were novel (see Tables 1, 2 and 3). At least one pathogenic mutation was found in 43 out of 88 unrelated patients (48.9%). Of these, thirty-three patients were classified as USH2, five as USHA (atypical Usher syndrome) and five as USHNC (Usher syndrome non-classified). In 25 of these 43 cases (58.1%) both causative mutations were detected: five patients were homozygous and 20 were compound heterozygous. Detailed clinical manifestations of these 25 patients and of 3 additional patients with one pathogenic and one probably pathogenic mutation (UV3; likely to be pathogenic but cannot formally be proven) are summarized in Table 4.
In this study, a total of 144 variants were detected: 25 were truncating mutations and five were splice-site mutations (located at the conserved AG/GT dinucleotides of the splice site). The pathogenic effect of these variants is clear. In addition, 48 missense, 20 silent and 46 intronic variants were identified. According to previous studies, allele frequencies, segregation analysis, bioinformatic predictions and in vitro experiments, the missense, silent and intronic changes were classified into 4 different categories: pathogenic, possibly pathogenic (UV3), possibly non-pathogenic (possibly neutral, UV2) and non-pathogenic (neutral) (see Tables 1, 2, 3 and 5).
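As a quick consistency check on the tally above, the following minimal Python sketch sums the variant classes quoted in the text. The dictionary keys and function names are illustrative labels chosen here, not identifiers from the study.

```python
# Variant counts per class, as quoted in the Results text.
# Key names are illustrative labels, not terms defined by the study.
variant_counts = {
    "truncating": 25,
    "splice_site": 5,
    "missense": 48,
    "silent": 20,
    "intronic": 46,
}


def total_variants(counts):
    """Return the total number of variants across all classes."""
    return sum(counts.values())


def variants_needing_classification(counts):
    """Return the number of variants whose effect is not evident from the
    change itself (missense, silent and intronic), i.e. those that were
    sorted into the four pathogenicity categories."""
    return sum(counts[key] for key in ("missense", "silent", "intronic"))


if __name__ == "__main__":
    print("total:", total_variants(variant_counts))
    print("to classify:", variants_needing_classification(variant_counts))
```

The five classes add up to the 144 variants reported, of which 114 required classification by the combined criteria.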
Four missense variants were classified as possibly pathogenic (UV3). The variants p.R303H and p.Y1992C were described previously [https://grenada.lumc.nl/LOVD2/Usher_montpellier, [35]]. The novel change p.N3894D was not found in 200 control alleles and the segregation analysis showed that it co-segregates with the disease. However, only one program considered it as clearly pathogenic (see Table 6). The new p.V382M change, which affects the first base of exon 7, was not found in control samples and was predicted to slightly affect splicing. The minigene assays revealed only a mild increase of the transcript excluding exon 7 (Figure 1, band d) when the variant was present, in comparison to the wild-type sequence.
Finally, six missense mutations were considered as pathogenic. p.C759F and p.C3267R were already described by other authors as damaging [https://grenada.lumc.nl/LOVD2/Usher_montpellier, [35]]. p.C3358Y and p.P4818L were classified by McGee et al. (2010) [30] and the LOVD-USH Database as likely pathogenic (UV3). However, in the present study, p.C3358Y was detected in a patient together with another nucleotide change (p.C3267R) and the segregation analysis confirmed that the mutations were not in the same allele. p.P4818L was detected in a patient together with two other mutations that directly or indirectly cause a truncated protein (p.Q3368X and c.5278delG). The segregation analysis confirmed that the deletion and the nonsense mutation were in cis and that the missense variant was in the other allele and co-segregated with the disease. The segregation analyses support the damaging effect of p.C3358Y and p.P4818L, so we have considered them as pathogenic (see Table 6).

Silent variants
We also identified 20 silent variants: 17 were previously described as neutral [[36], https://grenada.lumc.nl/LOVD2/Usher_montpellier, [35]] and three were novel (see Table 3). Only the variant p.E2496E was categorized as pathogenic. It was not found in 200 control alleles and the segregation analysis confirmed that this mutation co-segregates with the disease. We detected this variant in trans in a patient who also had a premature stop codon. According to in silico analyses, it was predicted to create a de novo donor splice site (data not shown). The splicing alteration was confirmed using hybrid minigenes. The mutant construct generated a transcript that lacked the last 106 nucleotides of exon 40 (Figure 1, band b). Because 106 is not a multiple of three, this loss of nucleotides shifts the reading frame, leading to a premature stop codon five amino acids downstream.
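The reading-frame consequence of losing 106 nucleotides follows from simple arithmetic, sketched below in Python. The function name is illustrative; the sketch only checks divisibility by the codon length and does not model the actual exon 40 sequence or the position of the resulting stop codon.

```python
CODON_LENGTH = 3  # nucleotides per codon


def is_frameshift(nucleotides_lost):
    """Return True if removing this many nucleotides from a coding
    sequence changes the reading frame of the downstream sequence."""
    return nucleotides_lost % CODON_LENGTH != 0


if __name__ == "__main__":
    lost = 106  # nucleotides missing from the aberrant transcript
    whole_codons, remainder = divmod(lost, CODON_LENGTH)
    print(f"{lost} nt = {whole_codons} codons + {remainder} nt")
    print("frameshift:", is_frameshift(lost))
```

A loss of 106 nucleotides removes 35 complete codons plus one extra nucleotide, so the downstream sequence is read out of frame.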

Intronic variants
Forty-six intronic variants located at non-canonical positions of splice sites, of which 20 are novel, were detected in the USH2A gene sequence. According to computational analysis, most of these novel variants were classified as possibly non-pathogenic (UV2) (see Table 4).

Discussion
In the present study, we have performed a wide mutational screening of the USH2A gene in 88 unrelated Spanish patients diagnosed with Usher syndrome. This analysis has led us to identify a total of 37 different pathogenic mutations, 23 of which had not been previously described: six nonsense, eleven deletions/insertions, two missense, three splice-site mutations and one isocoding variant. At least one mutation was identified in 43 cases and the two responsible mutations were detected in 25 patients (five homozygous and 20 compound heterozygous cases).
The genotype-phenotype correlation for those patients bearing two mutations is illustrated in Table 5. Most cases presented with classical USH2 clinical features. Interestingly, in one patient (RP-259), the sensorineural hearing loss was profound, RP started at the age of 6 years and he also had vestibular dysfunction (clinical findings typical of USH1). In another intriguing case (RP-1703), phenotypic manifestations started at the age of 50 years. We cannot rule out the possibility that additional changes in USH2A or in other USH genes, present in these patients, have some modifying effect on the phenotype [37,38].
It is difficult to predict the consequences of missense, silent and intronic changes and thus to discriminate neutral variants from those with a pathogenic effect. We have used a number of bioinformatic tools to predict the damaging effect of these variants. However, we must bear in mind that these results are only computational predictions and additional studies are necessary to confirm the effect of those changes not clearly classified. In this sense, in vitro analyses of two variants located at non-canonical splice-site positions that were predicted to affect splicing (p.E2496E and p.V382M) showed that p.E2496E creates a de novo donor splice site, stronger than the wild-type site, that leads to the loss of the last 106 nucleotides of exon 40. Thus, we have considered it as pathogenic. On the other hand, p.V382M produced only a mild increase of the transcript excluding exon 7 (Figure 1, band d), and the normal transcript still showed stronger expression. For this reason, this change has been classified as UV3.
SIFT: the SIFT score ranges from 0 to 1. The amino acid substitution is predicted to be damaging if the score is < 0.05, and tolerated if the score is ≥ 0.05. PolyPhen: "Probably damaging" (it is believed most likely to affect protein function or structure), "Possibly damaging" (it is believed to affect protein function or structure), "Benign" (most likely lacking any phenotypic effect). PMUT: NN output is the original output from the neural network for this mutation and its parameters. If that output is bigger than 0.5 it is predicted as pathogenic, otherwise as neutral.
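The score thresholds quoted for the three predictors can be expressed as a small Python sketch. It is only an illustration: the text does not state how the three outputs were weighed against each other, so counting "damaging" votes, and treating PolyPhen's "Possibly damaging" as a damaging vote, are assumptions made here. The function names and the example scores are hypothetical and do not correspond to any variant in the study.

```python
# Thresholds taken from the predictor descriptions in the text.
SIFT_DAMAGING_BELOW = 0.05
PMUT_PATHOGENIC_ABOVE = 0.5
# Assumption: both of these PolyPhen labels count as a damaging vote.
POLYPHEN_DAMAGING_LABELS = {"probably damaging", "possibly damaging"}


def sift_call(score):
    """Damaging if the SIFT score is below 0.05, tolerated otherwise."""
    return "damaging" if score < SIFT_DAMAGING_BELOW else "tolerated"


def pmut_call(nn_output):
    """Pathogenic if the PMUT NN output is above 0.5, neutral otherwise."""
    return "pathogenic" if nn_output > PMUT_PATHOGENIC_ABOVE else "neutral"


def count_damaging_calls(sift_score, polyphen_label, pmut_output):
    """Count how many of the three predictors flag the substitution."""
    votes = [
        sift_call(sift_score) == "damaging",
        polyphen_label.strip().lower() in POLYPHEN_DAMAGING_LABELS,
        pmut_call(pmut_output) == "pathogenic",
    ]
    return sum(votes)


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    print(count_damaging_calls(0.01, "Probably damaging", 0.80))
    print(count_damaging_calls(0.30, "Benign", 0.20))
```

A variant flagged by only one of the three programs, as reported for p.N3894D, would return 1 under this counting scheme, which is why segregation and control data were also taken into account.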
The majority of mutations were found once or twice. Only the c.2299delG mutation was identified in more than 5 alleles. However, the cohort in our study is biased, because those patients in whom two mutations were detected in previous analyses (study of exon 13, exons 2-21 or microarray analyses) were not included in this work. In fact, the allele frequency of the c.2299delG mutation in the Spanish population is 15%, which is in any case lower than in other populations [39]. Figure 2 shows the distribution of all the pathogenic mutations detected in the present study along the different domains of the USH2A protein. Mutations are located evenly throughout the protein and no "hot spots" were observed. Interestingly, there are two domains in which no mutations were detected: the transmembrane and intracellular domains. None of the studies performed in USH2A have detected mutations in the intracytoplasmic region, which is involved in the interaction of the USH2A protein with harmonin and whirlin.
There are more than 160 pathogenic variants described in previous studies. Noteworthy [28]. However, we can also find similarities with other populations, like non-Ashkenazi Jews. The splice-site variant c.12062-2A>G was detected in three patients, in the homozygous state in one of them. This mutation was initially described by Auslender et al. (2008) [26] as one of the most prevalent USH2A mutations in non-Ashkenazi Jews. Later, it was also detected in the American population [30]. We do not know the origin of our three patients, but it is tempting to speculate that they are descendants of those Sephardic Jews that were expelled from Spain in 1492 [40].
We did not find any mutation in 45 families while in 18 the second mutation remained unidentified. The number of detected pathogenic variants is probably underestimated, because there may be mutations in regions which have not been analyzed (introns, 3' and 5' untranslated regions (UTRs), promoter region, distant enhancers...) or large insertions, deletions and rearrangements that cannot be detected with the conventional PCR techniques. Moreover, some of these patients may have mutations in other genes like GPR98, which seems to be responsible for approximately 3-6% of USH2 cases [41,42] or DFNB31, although the studies indicate a minor role of DFNB31 in USH2 [43,44]. Furthermore, USH1 genes may be responsible for phenotypically USH2 patients. Jaijo et al. (2010) [32] found two mutations in CDH23 in two patients diagnosed as USH2 and a high phenotypic heterogeneity due to CDH23 variants has been reported [45,46].
In this report, we have detected at least one mutation in 48.9% (43/88) of all patients. Considering only the patients clearly diagnosed with Usher syndrome type II, the mutation detection rate rises to 56.9% (33/58). This detection rate is lower than expected because, as mentioned before, the patient sample included in this study is biased. Thus, if we take into account all our USH2 patients studied so far (including results from previous studies [19,9,23,32], unpublished data and the present work), our database includes 102 typical USH2 patients with at least one mutation detected in the USH2A gene and 32 typical USH2 patients who have been studied for all exons of this gene and in whom no mutation was found (Table 7). Our mutation detection rate therefore rises considerably to 76.1% (102/134), making our percentage similar to those obtained by Baux et al. (2007) [25] and Dreyer et al. (2008) [28].
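The detection rates quoted in the Results and Discussion can be reproduced from the patient counts given in the text, as in the minimal Python sketch below. The function and variable names are illustrative; only the counts come from the report.

```python
def detection_rate(patients_with_mutation, patients_screened):
    """Percentage of screened patients carrying at least one detected
    mutation, rounded to one decimal place."""
    return round(100 * patients_with_mutation / patients_screened, 1)


# Counts quoted in the text.
all_patients = detection_rate(43, 88)        # whole cohort of this study
two_mutations = detection_rate(25, 43)       # both mutations found, among carriers
ush2_this_study = detection_rate(33, 58)     # clearly diagnosed USH2, this study
ush2_all_studies = detection_rate(102, 102 + 32)  # all typical USH2 patients to date

if __name__ == "__main__":
    print(all_patients, two_mutations, ush2_this_study, ush2_all_studies)
```

The pooled denominator of 134 is the sum of the 102 typical USH2 patients with at least one USH2A mutation and the 32 fully screened patients without any.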