Causative variant profile of collagen VI-related dystrophy in Japan

Background Collagen VI-related dystrophy spans a clinical continuum from severe Ullrich congenital muscular dystrophy to milder Bethlem myopathy. The disease is caused by variants in COL6A1, COL6A2, or COL6A3. Most reported causative variants are de novo; therefore, comprehensive large cohort studies in different ethnicities are required to identify the possible causative variants.
Methods We retrospectively reviewed clinical information, muscle histology, and genetic analyses from 147 Japanese patients representing 130 families, whose samples were sent for diagnosis to the National Center of Neurology and Psychiatry between July 1979 and January 2020. Genetic analyses were conducted by gene-based resequencing, targeted panel resequencing, and whole exome sequencing, in combination with cDNA analysis.
Results Of a total of 130 families with 1–5 members with collagen VI-related dystrophy, 120 had mono-allelic and 10 had bi-allelic variants in COL6A1, COL6A2, or COL6A3. Among them, 60 variants were in COL6A1, 57 in COL6A2, and 23 in COL6A3, including 37 novel variants. Mono-allelic variants were classified into four groups: missense (69, 58%), splicing (40, 33%), small in-frame deletion (7, 6%), and large genomic deletion (4, 3%). Variants in the triple helical domains accounted for 88% (105/120) of all mono-allelic variants.
Conclusions We report the causative variant profile of a large set of Japanese cases of collagen VI-related dystrophy. This dataset can be used as a reference to support genetic diagnosis and variant-specific treatment.

respiratory failure [1,10,11]. Muscle pathology encompasses variable histological changes including fiber size variation, an increased number of internal nuclei, and disproportionately prominent endomysial connective tissue considering the relative scarceness of necrotic and regenerating fibers [4,12]. We have previously reported two patterns of collagen VI distribution in muscles among patients: completely deficient (CD) or deficient on the sarcolemma but with deposits in the interstitium (sarcolemma-specific collagen VI deficiency: SSCD) [7,13].
The final diagnosis of this disease is made by genetic analysis. Studies conducted both before and during the era of next-generation sequencing (NGS) have described the genetic spectrum of collagen VI-related dystrophy and shown that the distribution of variants is similar across several ethnic backgrounds [7,11,14,15,16]: glycine substitutions in the triple helical domain (THD), which are the most common, other missense variants, nonsense variants, splicing variants causing exon skipping, small in-frame deletions/insertions, and small deletions/insertions causing a premature stop codon. Large genomic deletions spanning multiple exons are rare [10,17,18,19]. Recently, a highly recurrent intronic variant in COL6A1 has been identified [20].
The aim of the present study was to elucidate the causative variant profile of collagen VI-related dystrophy in Japan by comprehensive genetic analysis including cDNA analysis, and to correlate the findings with immunostaining for collagen VI on muscle biopsies.

Results
We identified pathogenic variants in a total of 130 families with collagen VI-related dystrophy, each comprising 1–5 affected members, seen at the National Center of Neurology and Psychiatry (NCNP) between July 1979 and January 2020. Of these, 120 families carried mono-allelic and 10 carried bi-allelic pathogenic variants (Table 1). One hundred and forty variants were identified, including 37 novel variants in 40 families; these consisted of 60 allelic variants in COL6A1, 57 in COL6A2, and 23 in COL6A3 (Fig. 1). In 94 families with a mono-allelic variant, the disease was sporadic, without family history (94/130, 72%). Among the 37 novel variants, we identified 24 missense variants, six splicing variants, three small in-frame deletions, three large deletions, and one nonsense variant (Fig. 2).
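The counts reported above can be cross-checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the original analysis: the variable names and the helper `pct` are illustrative, and the mono-allelic class counts are those given in the abstract.

```python
# Illustrative consistency check of the reported counts (not the authors' code).

def pct(n: int, d: int) -> int:
    """Percentage n/d rounded to the nearest integer (half rounds up), integers only."""
    return (200 * n + d) // (2 * d)

families_mono, families_bi = 120, 10
alleles_by_gene = {"COL6A1": 60, "COL6A2": 57, "COL6A3": 23}
novel_by_type = {
    "missense": 24,
    "splicing": 6,
    "small in-frame deletion": 3,
    "large deletion": 3,
    "nonsense": 1,
}
mono_by_class = {
    "missense": 69,
    "splicing": 40,
    "small in-frame deletion": 7,
    "large genomic deletion": 4,
}

total_families = families_mono + families_bi
# One allele per mono-allelic family, two per bi-allelic family.
total_alleles = families_mono + 2 * families_bi
class_pct = {name: pct(count, families_mono) for name, count in mono_by_class.items()}
sporadic_pct = pct(94, total_families)
```

Under these inputs the per-gene allele counts sum to the same total as the per-family allele count, and the class percentages reproduce the rounded values quoted in the abstract.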
Among the ten families with bi-allelic variants, eight had variants in COL6A2, while the other two had variants in COL6A1 and in COL6A3, respectively. Six of these ten families had, in both alleles, variants producing a premature termination codon or causing aberrant splicing that leads to in-frame exon skipping, and all six had UCMD phenotypes. One of the ten families, #66, had a nonsense and a missense variant and also exhibited a UCMD phenotype. The affected individuals of the remaining three families had single nucleotide variants causing non-glycine substitutions and all showed BM phenotypes, although family #68 had a 26-bp deletion causing a premature termination codon in one allele.
Three novel heterozygous multiple exon deletions were detected in four families (Fig. 3). The deletions spanned from exon 5 to exon 8 in COL6A1 (Family #3 and #4), from exon 8 to exon 10 in COL6A1 (Family #5), and from exon 8 to exon 10 in COL6A2 (Family #87). All these large deletions were in-frame and distributed in the THD.
We performed immunostaining for collagen VI in muscle biopsies from 125 affected individuals in 123 families. Among the 115 patients with a mono-allelic variant, 91% (92/101) of those with the variant within the THD and 71% (10/14) of those with the variant outside the THD showed SSCD. Even the biopsies from families harboring multiple exon deletions showed the typical SSCD staining pattern, suggesting a dominant-negative effect of those variants (Fig. 4). Among the ten families with bi-allelic variants, five showed a CD pattern, while the five families carrying missense variant(s) showed an SSCD or a normal pattern. Observation at high magnification using immunofluorescence staining revealed trace amounts of extracellular collagen VI in the muscle biopsies of three families with CD (Family #64, #67, and #109), while collagen VI was retained within the mesenchymal cells in two families (#61 and #62; Fig. 5).
We reviewed all available muscle imaging data, which covered 34 families: 23 cases were examined by MRI and 24 by CT, with 13 examined by both modalities. At least one of the three typical findings in collagen VI-related dystrophy (a tigroid or outside-in pattern in the vastus lateralis; a target sign in the rectus femoris; a hyperintense rim between the soleus and gastrocnemius) [21] was seen in 85% (29/34) of the families. Of the 29 families with mono-allelic variants in the THD, 86% (25/29) had typical imaging findings, as did three of the four families (75%) with a mono-allelic variant outside the THD. Among the families with bi-allelic variants, imaging data were available for only one family, which showed typical imaging findings.
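The imaging figures can be reconciled as follows. This is a minimal Python sketch for illustration only; it reads the "three of four" families with a variant outside the THD as families with typical findings, an interpretation that is consistent with the overall 29/34. All names, including the helper `pct`, are illustrative.

```python
# Illustrative tally of the imaging data (not the authors' code).

def pct(n: int, d: int) -> int:
    """Percentage n/d rounded to the nearest integer (half rounds up), integers only."""
    return (200 * n + d) // (2 * d)

mri_cases, ct_cases, both_modalities = 23, 24, 13
# Inclusion-exclusion: cases examined by both modalities are counted once.
cases_imaged = mri_cases + ct_cases - both_modalities

# (families with typical findings, families with imaging data) per genotype group
imaging = {
    "mono-allelic, within THD": (25, 29),
    "mono-allelic, outside THD": (3, 4),  # assumption: these 3 showed typical findings
    "bi-allelic": (1, 1),
}

typical_total = sum(typical for typical, _ in imaging.values())
imaged_total = sum(imaged for _, imaged in imaging.values())
overall_pct = pct(typical_total, imaged_total)
group_pct = {group: pct(t, n) for group, (t, n) in imaging.items()}
```

With this reading, the three genotype groups add up to the 34 families and 29 typical findings reported for the whole cohort.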

Discussion
We have elucidated the causative variant profile of collagen VI-related dystrophy in Japan (Table 1). Furthermore, we report 37 novel variants in 40 families, comprising 24 missense variants, six splicing variants, three small in-frame deletions, three large genomic deletions, and one nonsense variant. To our knowledge, this is the causative variant profile of the largest single-center cohort reported to date. The majority of the variants were mono-allelic (86%, 120/140), and 67% (94/140) of them were likely to be de novo because the parents of the patients were not apparently affected, although parental DNA was not available, as has previously been described [11,14,15,22,23,24]. Therefore, our causative variant profile may be useful as a reference for diverse ethnicities. Given that all cases with collagen VI-related dystrophy in this cohort were sent to our center from hospitals in Japan, we calculated the occurrence of severe UCMD in Japan as 1.63 cases per year and estimated that about 70% of collagen VI-related dystrophy cases were diagnosed at our center. This corresponds to an estimated incidence of 0.20 per 100,000 births, higher than that reported for northern England (0.13/100,000) [9]. This is most likely because of differences in the diagnostic systems of the two countries.
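One plausible reading of the incidence estimate is that the observed 1.63 cases per year are first corrected for the roughly 70% of cases diagnosed at the center and then divided by the annual number of births. The following minimal Python sketch illustrates that arithmetic under this assumption. The annual number of births is not stated in the text, so the sketch back-calculates the value implied by the reported 0.20 per 100,000 rather than supplying one; the function and variable names are hypothetical.

```python
# Illustrative reconstruction of the incidence arithmetic (not the authors' code).

def incidence_per_100k(observed_per_year: float, capture_fraction: float,
                       births_per_year: float) -> float:
    """Incidence per 100,000 births after correcting for incomplete case capture."""
    corrected_per_year = observed_per_year / capture_fraction
    return corrected_per_year / births_per_year * 100_000

def implied_births(observed_per_year: float, capture_fraction: float,
                   incidence_100k: float) -> float:
    """Annual births implied by a given incidence (inverse of incidence_per_100k)."""
    return observed_per_year / capture_fraction / incidence_100k * 100_000

observed = 1.63   # severe UCMD cases per year seen at the center (from the text)
capture = 0.70    # estimated fraction of cases diagnosed at the center (from the text)

corrected_cases = observed / capture             # about 2.33 cases per year nationwide
births = implied_births(observed, capture, 0.20)  # assumption: back-calculated, not reported
check = incidence_per_100k(observed, capture, births)
```

The back-calculated denominator is on the order of 1.2 million births per year; it is an inference from the reported figures, not a value given in the study.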
Among the mono-allelic variants, 88% (105/120) were located in the THD. The association between mono-allelic variants in the THD and the SSCD staining pattern (91%, 92/101) may be explained by the fact that tetramers containing dominant mutations in the THD are secreted but have an impaired ability to form microfibrils and show reduced binding of collagen VI to the extracellular matrix [25,26]. Furthermore, mono-allelic variants in the THD are associated with a UCMD or intermediate phenotype (82%, 86/105). In contrast, mono-allelic variants outside the THD were also associated with SSCD (71%, 10/14) but with a BM phenotype (93%, 14/15) (Table 2). However, as shown in the literature, genotypes cannot be tied to specific phenotypes, because some variants are reported to cause both UCMD and BM phenotypes [14,15,16,24]. Indeed, in our cohort, the families with c.877G>A in COL6A1, c.856-2A>G in COL6A2, or c.943G>A in COL6A2 showed a wide range of phenotypes, from milder BM to more severe UCMD, whereas the families with c.956A>G or c.1022G>A in COL6A1 showed a narrow range of phenotypes, limited to BM or intermediate.
In addition, we found four heterozygous large deletions in families with a UCMD phenotype. All the deletions were located on the N-terminal side of the cysteine residue important for the assembly of the collagen VI tetramer. This is in accordance with all the reported multiple exon deletions [17,19,25,27,28,29]. Intriguingly, deletions in the region containing the cysteine residue caused relatively mild phenotypes in our cohort and in previous reports [11,30,31,32]. This may be explained by a mechanism in which loss of this cysteine residue causes failure of dimer formation by the mutant COL6A1, resulting in a reduction of normal COL6A1 dimer production to one quarter [31]. In contrast, deletions of the entire COL6A2 gene are reported to act as recessive loss-of-function variants [33]. Thus, collagen VI proteins with large genomic deletions on the N-terminal side of the THD, in which no more than 72 amino acid residues are deleted, may act in a dominant-negative fashion and cause UCMD or intermediate phenotypes.
In this study, we identified ten families with bi-allelic variants; five and four families showed CD and SSCD collagen VI staining patterns in muscle, respectively. We can presume that families with truncating variants in both alleles will be associated with CD and severe UCMD phenotypes, whilst those with missense variants or in-frame deletions in at least one allele will be associated with SSCD and milder BM phenotypes. In fact, three families with truncating variants in both alleles (CD) and five families with a missense variant or in-frame deletion in at least one allele (SSCD) were consistent with this presumption, regardless of the causative gene. Interestingly, the other two bi-allelic families had in-frame deletion(s) in one and in both alleles, respectively, but they showed CD and severe UCMD phenotypes. To explore the mechanism causing the loss of collagen VI in muscle in these families, we looked for traces of collagen VI remaining in their biopsied muscles. In muscles from patients with truncating variants in both alleles, collagen VI formed small deposits in the extracellular space, while in patients with an in-frame deletion in at least one allele, collagen VI was retained within mesenchymal cells. Thus, we hypothesize that, in the cases with visible extracellular deposits, the truncated collagen VI molecules can form tetramers and be secreted, but the secreted collagen VI is unstable and degraded extracellularly. On the other hand, in the cases with intracellular retention, the in-frame deleted molecules fail to form tetramers and are not secreted. Additional detailed molecular analyses are required to understand the precise mechanism.
Multiple analyses (RNA analysis, immunostaining, and review of the clinical information) were used to validate the pathogenicity of novel variants. For example, patients with mono-allelic THD variants showed missense changes or in-frame deletions in transcripts, an SSCD staining pattern of collagen VI in muscle, and a severe UCMD phenotype. In contrast, patients with extra-THD variants showed an SSCD staining pattern of collagen VI in muscle and typically milder BM phenotypes. This information is essentially compatible with the genotype-phenotype correlation in collagen VI-related dystrophy shown in previous reports and adds many examples. The cumulative information further contributes to the establishment of a genotype-phenotype database for collagen VI-related dystrophy.

Conclusion
Our report provides a large causative variant catalog of collagen VI-related dystrophy in Japan, which can be used as a reference for genetic diagnosis and will also be helpful for variant-specific therapy in the future. The majority of causative variants of collagen VI-related dystrophy were mono-allelic and de novo, and most of them were located in the THD and associated with SSCD and with UCMD or intermediate phenotypes.

Clinical information
This retrospective cohort study was performed on patients seen at the NCNP, a major referral center for muscle disease in Japan, between July 1979 and January 2020. Frozen muscle and blood samples from patients were sent for diagnosis to the NCNP from all over Japan.
Clinically or pathologically suspected collagen VI-related dystrophy with possible pathogenic variants in COL6A1, COL6A2, or COL6A3 was identified in 147 affected individuals in 130 families. Patients with collagen VI-related dystrophy were classified into three categories, UCMD, intermediate, and BM, according to phenotypic stratification as previously described [4,28,34,35].
This study was approved by the institutional review boards of the NCNP. All the human materials used in this study were obtained for diagnostic purposes. The patients or their parents provided written informed consent for use of the samples for research.

Muscle histology
Muscle biopsy samples for histological examination were frozen in isopentane cooled in liquid nitrogen. A set of routine histochemical analyses was performed for diagnosis. When the patients were suspected of having collagen VI-related dystrophy or had elevated serum creatine kinase, immunohistochemistry was performed using standard procedures with an antibody against collagen type VI (VI-26, 1:1000; MP Biomedicals, LLC, Irvine, CA) as previously described [7]. Immunofluorescence staining using standard procedures was performed with antibodies against collagen type VI (VI-26, 1:500; MP Biomedicals), PDGFRα (1:500; Cell Signaling Technology, Danvers, MA), and laminin α2 (4H8-2, 1:500; Santa Cruz, Dallas, TX) [36].

Genetic analysis
Genomic DNA was isolated from peripheral blood lymphocytes or muscle specimens using standard techniques. All exons and their flanking intronic regions in COL6A1, COL6A2, and COL6A3 were amplified and sequenced directly in 52 families using an ABI PRISM 3130xl Genetic Analyzer (Applied Biosystems, Waltham, MA). Sixty-five families were analyzed using the targeted resequencing panel for muscular dystrophy, a method for screening causative variants that our laboratory developed and has used since 2014 with Ion PGM NGS [37]. Thirteen families were analyzed by whole exome sequencing because they were initially suspected of having other types of muscular disease.
Fig. 3 ... and #87, respectively. At the genomic level, Family #3 and #4 carried a deletion of 1.2 kb spanning from IVS4-7 to IVS8+490 in COL6A1 (a). The 5′ breakpoint of the 2.1 kb deletion found in Family #5 was located at the sixth base of exon 8 of COL6A1 and its 3′ breakpoint was at position −43 of intron 10 (b). One of the COL6A2 alleles of Family #87 contained a 1.2 kb deletion extending from IVS7+102 to IVS10-43 (c). E: exon sequence. The numbering of genomic positions at the breakpoints is based on the sequence from the Genome Reference Consortium GRCh37/hg19.
The splice site-creating variant Chr21:47,409,881 C>T in intron 11 of COL6A1 was manually screened by the Sanger method [20].

cDNA analysis
Total RNA was extracted from frozen muscle using a Total RNA Kit (Nippon Gene, Tokyo, Japan), and cDNA was synthesized with an oligo(dT)20 primer and SuperScript IV Reverse Transcriptase (Thermo Fisher Scientific, Waltham, MA) using standard techniques [13].
Fig. 5 Highly sensitive detection of collagen VI in patients' muscles showing complete deficiency by routine immunostaining. Highly sensitive immunofluorescence staining for collagen VI (green), PDGFRα (red), and laminin α2 (blue) in muscles of patients showing complete collagen VI deficiency (a, Family #64; b, Family #67; c, Family #109; d, Family #61; e, Family #62). Scale bar, 10 μm. Highly magnified immunofluorescence images showed that collagen VI formed small deposits in the extracellular space in muscles from patients with truncated variants in both alleles (a-c), while in patients with an in-frame deletion in at least one allele, collagen VI was retained within mesenchymal cells (d, e).