Recommendations on clinical trial design for treatment of Mucopolysaccharidosis Type III

Background Mucopolysaccharidosis type III is a progressive, neurodegenerative lysosomal storage disorder for which there is currently no effective therapy. Though numerous potential therapies are in development, there are several challenges to conducting clinical research in this area. We seek to make recommendations on the approach to clinical research in MPS III, including the selection of outcome measures and trial endpoints, in order to improve the quality and impact of research in this area. Results An international workshop involving academic researchers, clinical experts and industry groups was held in June 2015, with presentations and discussions on disease pathophysiology, biomarkers, potential therapies and clinical outcome measures. A set of recommendations was subsequently prepared by a working group and reviewed by all delegates. We present a series of 11 recommendations regarding the conduct of clinical research, outcome measures and management of natural history data in Mucopolysaccharidosis type III. Conclusions Improving the quality of clinical research in Mucopolysaccharidosis type III will require an open, collaborative and systematic approach between academic researchers, clinicians and industry. Natural history data should be published as soon as possible and ideally collated in a central repository. There should be agreement on outcome measures and instruments for evaluation of clinical outcomes to maximise the effectiveness of current and future clinical research.

The principal biochemical abnormality is the accumulation of lysosomal HS, but secondary storage products, particularly GM2 and GM3 gangliosides, are also thought to play a role in central nervous system (CNS) pathology [16,17]. CNS pathophysiology likely involves a multiplicity of mechanisms including neuroinflammation, defective autophagy and mitochondrial dysfunction [18][19][20][21][22].
Neurological disturbance dominates the clinical picture, typically described as progressing in three phases [23]. In the first phase, beginning at 1-4 years of age, there may only be delayed development, particularly of speech and language. The second phase begins at 3-5 years and is characterised by marked behavioural disturbance with aggression, hyperactivity and sleep disturbance, and the beginning of progressive cognitive decline [24][25][26][27][28][29]. Finally, in the third phase, beginning from around 10 years onward, there is progressive loss of motor function, eventually resulting in complete loss of ability to walk, swallowing and feeding difficulties often requiring gastrostomy feeding, and there may be seizures [24,28,29]. Death usually occurs in the second decade, except in attenuated patients.
The four subtypes are clinically indistinguishable, though progression of disease is possibly slower in MPS IIIC [27][28][29]. However, there is a great degree of variability both within and between subtypes [25,27]. Very attenuated phenotypes have been reported, usually in association with specific mutations [30][31][32][33][34]. Specific mutations associated with severe or attenuated phenotypes are well recognised in MPS IIIA [30] but not MPS IIIB.
Given the phenotypic heterogeneity, progressive nature of disease and small numbers of patients, understanding the natural history of the disease is essential for informing clinical trial design and selection of appropriate outcome measures. Natural history studies provide comparison data and potentially form historical control groups for interventional studies. Though several natural history studies have been undertaken, they differ in design and selection of parameters measured, and not all are published (Table 2). Earlier studies collected retrospective data, often based on parent/caregiver recollection and not methodologically robust enough to form comparator groups. To date, only two published studies have prospectively collected individual, longitudinal natural history data [57,58]. The majority of published detailed natural history data is for MPS IIIA.
Clinical trials are a necessity in MPS III given the lack of effective therapies. Phenotypic heterogeneity, small patient numbers, the multidisciplinary management of the disease, the multiplicity of potential approaches to treatment and the difficulty of assessing the effects of treatments targeted to the CNS all present challenges to conducting research in this area. There is therefore a need for consensus on how best to move forward with clinical research, collate natural history data, and select robust and clinically meaningful outcome measures for clinical trials.

Methods
An international workshop on MPS III clinical research was held in June 2015 to develop collaborations between academic research, clinical experts and industry groups, and work towards recommendations on conducting clinical research in MPS III. This was funded by the UK MPS Society and received no commercial grants. Invited delegates were members of all academic, clinical and industry groups known to be involved with a current or upcoming clinical research programme in MPS III, as well as representatives from the UK MPS Society. Experts included paediatricians for metabolic disorders, neuropsychologists and academic researchers with an acknowledged expertise in MPS III. The workshop programme and a full delegate list are included in supplementary information. Presentations included updates on research by academic, clinical and industry groups, and were followed by active discussion sessions with a view to developing recommendations for clinical trials. All groups of delegates participated in discussions. Recommendations were then formulated by a working group established at the workshop and fed back to all delegates for comment. Disagreements were resolved by discussion. Final recommendations represent a combination of published data and the experience of experts in MPS III. Further experience and an increasing evidence base may allow for consensus to be developed for neurocognitive endpoints using a formal Delphi method and a consensus conference with this goal has been organised (Elsa Shapiro, personal communication).
The reduced lifespan in severely affected patients is likely to be related to progressive neurological deterioration, with most deaths being due to bronchopneumonia (Christine Lavery, personal communication). However, the involvement of multiple organ systems in MPS III is well recognised [59]. Most patients have somatic manifestations [5,60], the most frequent being coarse facial features and hepatomegaly.
Recurrent ENT infections and diarrhoea are common symptoms and a significant number of patients undergo adenoidectomy or tonsillectomy [60]. Umbilical or inguinal hernias are frequently reported [25,29,30]. Multiple skeletal manifestations have been described, particularly hip dysplasia and osteonecrosis of the femoral head [23,61,62].
The relevance of somatic manifestations may become increasingly important with the development of CNS-directed therapies. Manifestations such as cardiac valvular abnormalities, hearing loss and retinopathy [60,63,64] may contribute significantly to morbidity in attenuated patients living for longer with slowly progressive disease. Somatic features such as skeletal disease and sleep disordered breathing may also be potential confounders of CNS outcomes such as neurodevelopmental assessments, and should be accounted for [65].

2. Both cognitive and behavioural manifestations should be considered in the development of clinical trial endpoints.
Cognitive and behavioural manifestations are both well recognised in MPS III, and, except for impaired acquisition of language, appear to develop within a similar timeframe. The onset of both delayed cognitive development and abnormal behaviours occurs between approximately 3-4 years of age on average, though there is a great degree of variability [2, 5, 24, 27-30, 58, 60, 64]. In MPS IIIA, adaptive behaviour remained intact for longer than cognitive function in one study [60], but in a more recent study, children continued to acquire adaptive behaviour skills for longer than cognitive skills, though both declined after around 4 years [58].
Though cognitive and behavioural manifestations are related, they are different and may impact differently on quality of life [66]. Cognitive impairment may also confound assessments of behaviour [67] and vice versa. Assessments of both cognitive function and behaviour should therefore be considered important outcome measures. However, in clinical trials where sample sizes are small, phenotypic variability may limit the applicability of cognitive and behavioural assessments as primary endpoints. Neurological symptoms such as seizures, ataxia and dystonia appear late in the disease course and therefore are of limited use as outcome measures, particularly where the goal is to assess response to early therapy.
3. Cognitive and behavioural manifestations are less evident below the age of 2 years. For these patients, cognitive and behavioural assessments should be used as long term assessments over a period of years following identification.
Cognitive manifestations are unlikely to become apparent before 2 years of age, other than a subtle slowing in the development of cognitive and language skills or failure to acquire language at all [24,30]. Children may continue to acquire skills beyond 2 years [57,58], and in patients with slowly progressive disease, manifestations may not become apparent for several years.
Where interventional trials involve children under 2 years of age, cognitive and behavioural assessments should therefore be used as long-term outcome measures over a period of years. Development of shorter-term measures of treatment effects in this age group will be of value, though this needs to be coupled with a better understanding of genotype-phenotype correlations to predict slowly or rapidly progressive disease. One approach may be to determine whether children continue to acquire skills beyond an expected ceiling of development. In MPS IIIA, the ceiling of cognitive development is reported as 36 months in children with rapidly progressive phenotype [58]. In contrast, there is little published natural history data on cognitive development in MPS IIIB children under 6 years.

4. Cognitive assessment should be used as a long-term endpoint, using the Bayley Scales of Infant and Toddler Development (3rd edition) and/or the Kaufman Assessment Battery for Children (2nd edition). The Vineland Adaptive Behaviour Scale is a useful additional measure, but cannot be the only developmental endpoint. We advocate the use of age equivalent scores to assess the relative rate of development.
Delayed development followed by progressive loss of skills is characteristic of MPS III and developmental endpoints are therefore important for evaluating the efficacy of potential treatments.
There are several challenges to administering cognitive assessments in this patient population. Motor ability, language regression, hearing or visual impairment, and emotional or behavioural factors may influence the child's performance, as may environmental factors such as recent medical procedures, general anaesthesia or sedation, or fatigue [27,65,67-69]. Testers must be familiar with both the test and the disease and be experienced at testing behaviourally challenging children.
Selection of appropriate cognitive assessments presents a further difficulty. Individuals may span a range of abilities and ages, from normal to severely impaired, and from infancy to adulthood. A single instrument for cognitive assessment is therefore unlikely to be sufficient [67]. In clinical trials, this variability may be partly mitigated by selection and stratification of subjects to create a more homogeneous patient population. An appropriately sensitive measure can then be selected.
The scale should be appropriate for tracking longitudinal development. Test batteries should be short and focused to avoid fatigue, should not rely on verbal-based subtests and should be supplemented by parent reported measures. The difficulty level should be appropriate for the disease and up to date normative data should be available. In addition, the test should be available and familiar to researchers internationally and translated into several languages [65,67], as multi-centre studies are a necessity in this rare disease. Based on these criteria, we advocate the use of the Bayley Scales of Infant and Toddler Development, Third Edition (BSID-III) in children with an expected developmental age up to 42 months, or the Kaufman Assessment Battery for Children, Second Edition (KABC-II).
Finally, there are challenges in how data should be presented and interpreted. Historically, standard scores have been preferred to age equivalent scores (AgeEqSs) or developmental quotient (DQ) as the latter do not take into account the range in normality, and present challenges in statistical analysis as the intervals are unequal (the same change in raw score may represent quite different changes in AgeEqSs or DQ at different ages) [65,67]. However, standard scores may be less useful in this patient group. Given the nonlinear trajectory of development in MPS III, treatment effects on cognitive outcomes need to be assessed based on a developmental growth curve which cannot be calculated from standard scores. Standard scores also have a floor below which low-functioning children may fall, thereby reducing the sensitivity of the test for tracking longitudinal development [67]. We therefore advocate the use of AgeEqSs to assess the relative rate of development in MPS III. However, AgeEqSs are more difficult to manage statistically and DQ may allow for easier comparison between subjects.
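The relationship between age equivalent scores and DQ can be illustrated with a short sketch. It assumes the conventional definition of DQ as the age equivalent score divided by chronological age, multiplied by 100, which is not spelled out above; the function name and the visit data are hypothetical and purely illustrative.

```python
def developmental_quotient(age_equivalent_months: float,
                           chronological_age_months: float) -> float:
    """Return DQ, assumed here to be 100 * age equivalent / chronological age."""
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    return 100.0 * age_equivalent_months / chronological_age_months


# Invented visits for one child: (chronological age, age equivalent score), in months.
visits = [(24, 20), (36, 28), (48, 30), (60, 30)]

# DQ at each visit, rounded to one decimal place for display.
dq_by_visit = [round(developmental_quotient(ae, ca), 1) for ca, ae in visits]

for (ca, ae), dq in zip(visits, dq_by_visit):
    print(f"age {ca} months: age equivalent {ae} months, DQ {dq}")
```

In this invented example the age equivalent score rises and then plateaus at 30 months, while DQ falls at every visit. A child who is gaining or retaining skills more slowly than typical peers therefore shows a declining DQ, which is one reason the two presentations can give different impressions of the same trajectory.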
Assessment of adaptive behaviour using the Vineland Adaptive Behaviour Scale, Second Edition (VABS-II) may be a useful additional measure of development. Due to the multiple challenges of assessment in this group, standardised cognitive assessments cannot always be administered and therefore, despite not being a true neurocognitive test, the VABS-II at least provides an account of development from parent report which can be corroborated from observation of the child in clinic.
Behavioural manifestations include hyperactivity, aggression, temper tantrums, unusual affect and orality [70]. Sleep disturbance may be considered part of the behavioural phenotype and correlates with behavioural disturbance during the daytime [70,71]. Elements of the behavioural phenotype overlap with features of autism spectrum disorders, in particular impaired social communication, reported in both MPS IIIA and IIIB [72,73], and in some cases this has led to the misdiagnosis of MPS III as autism [26]. Behavioural manifestations have a huge impact on families of children with MPS III, who often consider these to be more challenging than physical symptoms [66].
Behavioural manifestations appear at a specific point in the disease course and certain behaviours such as hyperactivity and aggression may decline with disease progression. Behavioural disturbance may also be less apparent in attenuated patients. Robust natural history data on behavioural outcome measures is therefore required for these to be used as endpoints, and not enough data currently exists for this to be recommended as a primary endpoint.
The Sanfilippo Behaviour Rating Scale (SBRS) has been developed as a disease specific measure designed to track behavioural changes through different stages of disease, and to encompass the full range of behaviours seen in MPS III [74]. These include orality and dampened emotional expression, including lack of fear, which appear to correlate with structural brain changes, specifically amygdala volume [74,75]. Further experience and validation of SBRS in Sanfilippo patients would be beneficial.
Actigraphy measures level of activity over time and allows discrimination between states of sleep and wakefulness [76]. The circadian rhythm in MPS III appears to be fragmented, with a phase delayed sleep-wake cycle in some children [77]. Children have higher levels of activity in the early morning and abnormalities in endogenous melatonin production [76][77][78]. Though the relationship between actigraphic data and daytime behaviour needs to be further examined, actigraphy allows measurement of sleep and circadian rhythm functioning, and may be a useful additional measure in clinical trials.
MPS III has a significant impact on the lives of a child and their family. In addition to the symptoms already described, chronic pain is increasingly recognised as a feature [79]. Families have described having a child with MPS III as 'devastating', with significant emotional and financial impact, and stresses on marital, social and family relationships [80]. Sleep disturbance, agitation, repetitive behaviours and diarrhoea are reported to be the most frequent and challenging symptoms for families to deal with, and behavioural symptoms can be relentless, resulting in anger, frustration and mental and physical exhaustion [66]. Though the impact of behavioural symptoms may decrease as the disease progresses, intellectual and physical ability continues to deteriorate and may affect parental psychological functioning [81]. Parents of children with MPS III may also be at risk for problems with anxiety and depression [81].

Currently available, non-disease specific, standardized quality of life (QoL) measures are likely to be inadequate given the changing nature of factors contributing to quality of life during the disease process, and have so far proved inadequate in other MPS trials. Scales such as SBRS and the MPS III disability scale [82] are not fully validated and may not reflect some aspects of QoL. Activities of daily living, though not a QoL measure, may have functional relevance (independent eating, ambulation etc.) and therefore measures of adaptive behaviour such as VABS-II may be useful surrogates, as they include socialization and daily living measures. In one study, assessments of parental anxiety and depression using the Beck Depression Inventory (BDI) were used as surrogate markers of the effects of risperidone on behaviour [83]. In an ongoing clinical trial of genistein in MPS III, such measures enabled the identification of parents with clinical levels of psychological distress (Stewart Rust, personal communication). Assessment of parental mood using the BDI-II may therefore be useful.
QoL measures are of interest to authorities making funding decisions regarding potential MPS III therapies, and there is therefore a need either to develop disease-specific QoL measures or to select existing QoL measures that are sensitive to MPS III. Such measures need to directly assess the dimensions that parents or caretakers feel are important. While these measures will likely not be sufficiently sensitive as primary endpoints, it is critical that parents feel the applied treatment has a meaningful and measurable impact for the patient and family. Developing such a measure requires considerable resources.
A joint effort that includes academic, clinical, patient groups and industry should be considered. Certain research groups are also developing qualitative research, using parent interviewing, to capture meaningful aspects of the disease and how it affects patient and family life (Samantha Parker, personal communication).
7. Measures of CSF heparan sulfate are generally agreed to be the best current biomarker for neurological disease, though there is continuing debate regarding the best methods of measurement of heparan sulfate and heparan sulfate derived structures. While it is possible to measure heparan sulfate in blood and urine, these have less relevance to neurological outcomes in clinical trials.
Due to the slowly progressive nature of MPS III, small numbers of patients, clinical heterogeneity, and the difficulties of quantifying neuropsychological outcomes, biomarkers are critical to the development of new therapies. An ideal biomarker is easily and reliably measurable, correlates closely with disease burden and relevant clinico-pathological parameters, and responds rapidly to treatment [84]. In MPS III there is a need for biomarkers to reflect CNS disease progression in particular. Biomarkers may be the best available measure of short-term treatment effects, particularly during the short time period of a clinical trial or in early treated patients, and given the difficulties associated with cognitive outcome measures. Urinary glycosaminoglycans (uGAG), while commonly used in MPS disorders, may differentially reflect GAG burdens in different organs, and, due to the blood-brain barrier, may not effectively reflect burden of disease in the brain. Measurement of cerebrospinal fluid (CSF) HS is a logical option to consider as HS is the primary storage metabolite in MPS III. CSF HS has been shown to respond to treatments in MPS I [85]. There is ongoing debate about which of the multiple methods of quantitating HS and HS-derived structures [86][87][88][89] may be the most suitable. Obtaining CSF samples is invasive, requiring lumbar puncture, but CSF HS is more likely to reflect neurological disease than urine or plasma HS. However, although CSF HS may reflect substrate concentrations in intrathecal spaces, it will still be only a surrogate measurement of substrate storage in brain tissue. Recent work in large animal models suggests that CSF HS reduction after intra-CSF enzyme treatment was greater in brain cortex than in deeper brain structures [90]. Its use as a biomarker may therefore be of more relevance for interventions not administered to the CSF.
Surrogate GAG biomarkers that are directly dependent on HS levels are also possibilities. For example, heparin cofactor II thrombin complex [91] has some value in responses to treatment in MPS I, but is more dependent on dermatan sulfate for complex formation than HS [92] and although it can be used to distinguish MPS III patients [93], it will likely be more limited than direct measurements of HS.
A recently published natural history study suggests that CSF enzyme activity in untreated MPS III patients is lower than in controls, and this appears to be more discriminatory for MPS IIIB than for MPS IIIA [57]. Changes in CSF enzyme activity may be a relevant marker in certain trials, such as gene therapy trials where it may reflect the degree of expression of therapeutic protein in the brain [94], or systemically delivered ERT where it may reflect the degree to which enzyme crosses the blood-brain barrier. However, it is less likely to be useful for interventions such as intrathecal ERTs or SRTs. Ultimately, however, CSF enzyme activity does not reflect the disease modifying effect of a therapy and CSF HS may be more meaningful in this regard.

8. Further characterisation of the associations between biomarkers and clinical outcomes in MPS III should be a goal of both natural history studies and long-term follow-up of interventional trials, preferably in prospective studies.
Though many putative biomarkers for MPS III exist, uncertainties remain as to how accurately they correspond to disease outcomes and response to therapy. Detailed characterization of associations with clinical outcomes will be required for both existing and novel biomarkers. This has been demonstrated effectively in retrospective studies in MPS I with relationships established between uGAG and plasma enzyme level after treatment [95], urine and plasma HS against uGAG after treatment [89] and urine and plasma HS against both enzyme and clinical outcomes such as sleep disordered breathing [96].
9. Neuroimaging, such as measurement of cortical volume by MRI, is associated with deterioration in neurocognitive outcomes but is more difficult in younger patients and may at best be a useful secondary endpoint.
A range of MRI brain abnormalities have been described, including cortical atrophy and increased ventricular volume [97]. Quantitative methods such as volumetric analyses may be functionally relevant. Reduction of amygdala volume and hippocampal volume has been reported to be associated with lack of fear and social/emotional dysfunction respectively [74]. Cortical grey matter volume loss is associated with a decline in DQ [58] and may be an appropriate marker of disease-related brain changes, and a useful secondary endpoint in trials. This analysis can be automated but must be supplemented with manual editing because the lack of clear grey-white differentiation introduces inaccuracies. As grey matter volumes cannot be accurately analyzed for children under 2 years of age, ventricular volume, which tends to increase over time [58], may be a useful measure.
10. Given the progressive nature of MPS III, lack of effective therapies and small numbers of patients, there are significant challenges to conducting interventional trials with formal comparator groups. A detailed understanding of the natural history in this disease is therefore essential.
Conducting interventional clinical trials with a formal comparator group presents challenges in MPS III. Given the lack of effective therapies, families may be anxious about being allocated to a non-interventional arm of a study. One method of overcoming this is to use a partial randomized crossover design, either by including an open label treatment phase in the study or allowing participants who completed a primary study with a comparator group to enroll in an open label long-term extension study [56,98]. However, disease progression during the initial part of the study may remain a source of anxiety for families. Where interventions are invasive (e.g. brain injections, intrathecal therapies requiring access device implantation), a formal 'placebo' approach may not be appropriate. This can be partly mitigated by using a 'no treatment' comparator group where only the assessors of the primary endpoint are blinded, as were the testers for neuropsychological assessments in a randomized controlled trial of intrathecal ERT in MPS II [98]. However, clinical variability and small numbers of patients limit the utility of comparator groups. A more detailed understanding of the natural history of individual subtypes of MPS III, together with consensus on what outcomes should be measured, may enable future interventional trials to be conducted without the need for formal comparator groups, as has previously been possible with other lysosomal disorders [99,100].
Good quality natural history data is vital to the development of clinical trial designs and analysis. Given the multiplicity of potential treatment approaches, the conduct of individual natural history studies by multiple groups both delays the evaluation of therapeutic interventions and duplicates efforts, both of which will have a considerable impact on families deciding on the best course of action for their child.
In addition, conducting natural history studies or interventional studies with placebo or no treatment comparator groups may no longer be possible once the first effective treatments become available. Cohorts of good quality published data are therefore required now. Natural history data should be published in a timely manner after completion and ideally in a central repository, preferably held in a "neutral" academic setting or by patient organisations. Interventional trials should similarly be published within 1 year of study completion. Industry and academic groups must collaborate to facilitate these goals.

Discussion
There are considerable challenges to conducting clinical research in MPS III. In a progressive neurodegenerative condition, it is far from clear what the criteria for an effective therapy should be: reversal of decline, a steady state, or slowing of progressive decline (Fig. 1)? It is unlikely that any one marker of disease will be sufficient to evaluate this. Comparisons to MPS I may be relevant, as this also involves HS storage and neurological decline in severely affected individuals, but there is an effective CNS treatment (HSCT). HSCT can prevent further neurological decline in MPS I, and children continue to develop skills, albeit at a slower rate. This could therefore be considered the goal of an ideal therapy for MPS III. In MPS I, it is evident that early treatment minimises residual disease burden and improves both neurological and somatic long-term outcomes [101]. Given the likelihood that neurological disease is similarly irreversible in MPS III, early treatment is to be recommended, and this would be consistent with observations in mouse models [102]. As MPS III is often not diagnosed before the onset of neurological decline, the establishment and implementation of a newborn screening programme may be the only feasible way to achieve this. The ability to identify children with MPS III early would enable better opportunities for informed disease management and clinical trials participation.
Due to the slowly progressive nature of the disease, assessing the clinical outcomes of interventions within the relatively short time frame of clinical trials can be difficult. Long-term follow-up studies are valuable but given the numerous potential treatment strategies in development and the lack of effective therapy, there is a need for development of short-term predictors of longer-term outcomes, including the detailed characterisation of the association between disease biomarkers and clinical outcomes.
The wide clinical heterogeneity of the condition, combined with the relatively small numbers of patients, presents difficulties in interpretation and statistical analysis of data. One approach to mitigating this in clinical trials may be through subject selection and stratification to create a more homogeneous patient population. Ideally MPS IIIA, B, C and D should be considered separately, though in one ongoing trial where this was not possible, MPS IIIC patients were stratified separately (due to known slower progression of disease in MPS IIIC) [56]. Individuals likely to have rapidly progressive disease should be considered together, and this may be more feasible in MPS IIIA where specific mutations associated with rapid progression are recognised. However, a more detailed understanding of genotype-phenotype correlations is required with respect to other MPS III subtypes and slowly progressive phenotypes.
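Stratification at the point of allocation could be sketched as follows. This is a minimal illustration of stratified permuted-block randomisation, a general technique and not a method reported for any trial cited here; the function name, arm labels, block size, seed and participant identifiers are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_block_randomise(participants, block=("treatment", "comparator"), seed=0):
    """Allocate arms within each stratum using shuffled blocks.

    participants: iterable of (participant_id, stratum) pairs.
    Returns a dict mapping participant_id to (stratum, arm).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    by_stratum = defaultdict(list)
    for pid, stratum in participants:
        by_stratum[stratum].append(pid)

    allocation = {}
    for stratum, pids in by_stratum.items():
        pending = []  # arms remaining in the current block for this stratum
        for pid in pids:
            if not pending:
                pending = list(block)
                rng.shuffle(pending)
            allocation[pid] = (stratum, pending.pop())
    return allocation


# Invented participants, stratified here by MPS III subtype.
participants = [
    ("P01", "IIIA"), ("P02", "IIIA"), ("P03", "IIIC"),
    ("P04", "IIIA"), ("P05", "IIIC"), ("P06", "IIIA"),
]
allocation = stratified_block_randomise(participants)

for pid, (stratum, arm) in sorted(allocation.items()):
    print(pid, stratum, arm)
```

Because each block contains every arm once, the arms stay balanced within a stratum whenever the stratum size is a multiple of the block size. In practice the strata might be defined by subtype or by predicted rate of progression, and with very small strata the balance guarantee weakens.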
Evaluating and quantifying clinical outcomes in MPS III can be difficult, particularly neuropsychological assessments, and this is compounded by clinical heterogeneity and small numbers of patients. There is therefore a need for consensus over which outcome measures and instruments should be used in clinical trials. We have outlined certain broad recommendations here but this could be further developed by the construction of a 'core outcome set' for MPS III, i.e. an agreed minimum set of outcomes that should be measured and reported in all clinical trials [103]. This standardised approach potentially allows for robust meta-analysis in future clinical research which will be of considerable value in such a rare disease.
Conducting effective clinical research in MPS III requires a body of robust and quantitative natural history data. Though the natural history of MPS III has been well described in a narrative sense, it is only recently that good quality, prospectively collected longitudinal data has been published. Timely publication of natural history studies, sharing of data and a consistency of approach to outcome measures, potentially allowing for meta-analysis, will be essential to improve the quality of research in this field.

Conclusions
There is a pressing need to move forward with clinical research in a disease with huge impact on quality of life for affected children and families for which there remains no effective therapy. To do so most effectively will require an open, systematic and collaborative approach from academic groups, clinicians, patient groups and industry. As part of this process, we have made a series of recommendations on the conduct of clinical research and selection of outcome measures, and we emphasize the importance of timely publication and sharing of natural history data.