Outcomes in pediatric studies of medium-chain acyl-CoA dehydrogenase (MCAD) deficiency and phenylketonuria (PKU): a review

Background: Inherited metabolic diseases (IMDs) are a group of individually rare single-gene diseases. For many IMDs, there is a paucity of high-quality evidence that evaluates the effectiveness of clinical interventions. Clinical effectiveness trials of IMD interventions could be supported through the development of core outcome sets (COSs), a recommended minimum set of standardized, high-quality outcomes and associated outcome measurement instruments to be incorporated by all trials in an area of study. We began the process of establishing pediatric COSs for two IMDs, medium-chain acyl-CoA dehydrogenase (MCAD) deficiency and phenylketonuria (PKU), by reviewing published literature to describe outcomes reported by authors, identify heterogeneity in outcomes across studies, and assemble a candidate list of outcomes.
Methods: We used a comprehensive search strategy to identify primary studies and guidelines relevant to children with MCAD deficiency and PKU, extracting study characteristics and outcome information from eligible studies, including outcome measurement instruments for select outcomes. Informed by an established framework and a previously published pediatric COS, outcomes were grouped into five mutually exclusive, a priori core areas: growth and development, life impact, pathophysiological manifestations, resource use, and death.
Results: For MCAD deficiency, we identified 83 outcomes from 52 articles. The most frequently represented core area was pathophysiological manifestations, with 33 outcomes reported in 29/52 articles (56%). Death was the most frequently reported outcome. One-third of outcomes were reported by a single study. The most diversely measured outcome was cognition and intelligence/IQ, for which eight unique measurement instruments were reported among 14 articles. For PKU, we identified 97 outcomes from 343 articles. The most frequently represented core area was pathophysiological manifestations, with 31 outcomes reported in 281/343 articles (82%). Phenylalanine concentration was the most frequently reported outcome. Sixteen percent of outcomes were reported by a single study. Similar to MCAD deficiency, the most diversely measured PKU outcome was cognition and intelligence/IQ, with 39 different instruments reported among 82 articles.
Conclusions: Heterogeneity of reported outcomes and outcome measurement instruments across published studies for both MCAD deficiency and PKU highlights the need for COSs for these diseases, to promote the use of meaningful outcomes and facilitate comparisons across studies.


Keywords: PKU, MCAD deficiency, Core outcome sets, Rare diseases, Patient-oriented outcomes

Background
Inherited metabolic diseases (IMDs) are a large group of single-gene diseases that are individually rare but when aggregated have an estimated global birth prevalence of 50.9 in 100,000 live births [1]. These diseases are typically diagnosed early in life, often involve complex and resource-intensive medical care [2], and are frequently associated with intense home management and caregiving needs [3]. Providing effective treatment can be difficult due to a scarcity of evidence supporting current therapies [4]. When properly conducted, randomized controlled trials are considered the 'gold standard' primary study design for evaluating interventions [5]. However, trials have not always focused on those outcomes that are most relevant to patients diagnosed with the diseases being studied [6], and different trials within a single area of research have often incorporated disparate outcomes and outcome measurement instruments, thus impeding comparisons among studies and limiting capacity for data synthesis [7]. In response to these challenges, the Core Outcome Measures in Effectiveness Trials (COMET) Initiative [8] has led researchers in many disease areas to develop core outcome sets (COSs) [7]. A COS is a recommended minimum set of standardized, high-quality outcomes and associated outcome measurement instruments to be incorporated by all trials in an area of study [7]. COSs are developed to be relevant to all stakeholders in an area of research, including patients and their families, health care providers, and health policy decision-makers. Development and uptake of COSs aim to ensure that results can be synthesized and compared across studies.
It may be particularly valuable to develop COSs for IMDs and other rare diseases. Trials are less common and more challenging to implement for rare diseases relative to common diseases in part due to the difficulties in assembling a large enough cohort of patients to obtain adequate statistical power [9,10]. Consequently, there may be great interest in comparing and synthesizing the results of all trials of one or more interventions for a rare disease (for example, using systematic reviews and meta-analyses) when considering evidence to support treatment and policy decisions. Consistent selection of outcomes and outcome measurement instruments across effectiveness trials would facilitate such evidence synthesis and make best use of the limited resources that are available for rigorous evaluative research for rare diseases. This is especially important for IMDs given the current rapid pace of development of new therapies [11], which has resulted in an increasing need for timely evidence regarding the effectiveness and comparative effectiveness of emerging and existing treatments. Given that clinical trial outcomes can determine the evidence considered by patients, clinicians, and policy advisors when making patient care and health policy decisions, the outcomes measured in future trials of interventions for IMDs should be carefully considered.
In the present study we sought to comprehensively review outcomes reported in previously published pediatric studies related to two of the most common IMDs, medium-chain acyl-CoA dehydrogenase (MCAD) deficiency and phenylketonuria (PKU). For both MCAD deficiency and PKU, there are no existing COSs, limited evidence is available regarding new treatments, and there is a scarcity of trials reporting patient-oriented outcomes [12][13][14][15][16]. Our specific aims were to: (i) identify the unique outcomes that have been reported or recommended in the literature for these diseases; (ii) understand the scope and variation in outcomes reported and how they are measured; and (iii) create a list of candidate core outcomes to inform the development of pediatric COSs for MCAD deficiency and PKU [7]. We hypothesized that our review would identify substantial variation in the reporting of outcomes and outcome measurement instruments in the published MCAD deficiency and PKU literature.

Protocol and registration
The study protocol was developed in collaboration with patient partners (MS, NP), registered in PROSPERO (CRD42017073524), and published [17]. This review is reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (see Additional file 1) [18].

Search strategies and information sources
The study team developed separate search strategies to identify publications related to MCAD deficiency and PKU (further details, Additional file 1). Briefly, using the OVID platform, we searched Ovid MEDLINE® and Embase Classic+Embase. We also searched Cochrane Library databases. Language and publication year filters were applied to these searches to screen out non-English language studies, for practical reasons, and studies published before 1990 as the approach to diagnosing and managing these diseases has changed over time. We further searched the MEDLINE and EMBASE databases for publications focusing on long-term follow-up initiatives for the evaluation of newborn screening programs, as MCAD deficiency and PKU are among the most common targets of such programs [19]. Because of the anticipated low sensitivity of the database search for newborn screening evaluation initiatives, we conducted supplemental citation and related articles searches using key articles identified by the study team. Finally, we performed a grey literature search to identify further articles reporting or discussing outcomes for MCAD deficiency or PKU, guided by the Grey Matters tool [20]. Due to resource constraints, we limited the time devoted to grey literature searches to 15 h and prioritized sources that the study team deemed most relevant (see Additional file 1). As a further step in the grey literature search, we searched the COMET database [8] to identify COSs for pediatric conditions other than PKU and MCAD deficiency.

Eligibility criteria
We screened retrieved citations against a priori eligibility criteria that were developed following the PICOS statement (population, intervention, comparator, outcome, study design). To be eligible for inclusion, articles had to be focused on populations of children (aged 18 years or younger) with MCAD deficiency or PKU. There were no criteria specified for interventions or comparators. Articles had to be primary studies reporting on five or more children; treatment guidelines or recommendations for outcomes to be measured in future studies; or COS studies (from the COMET database) for other pediatric health conditions. The decision to restrict to primary studies or recommendations was added post hoc to exclude reviews of primary studies that would have already been captured by our search strategy. We also made a post hoc decision to include pediatric COS studies identified from the COMET database only if they included potentially relevant outcomes not already captured from other sources. All eligible articles had to report or discuss one or more outcomes in relation to pediatric PKU, MCAD deficiency, and/or the long-term follow-up of newborn screened diseases. We used a modified version of the Outcome Measures in Rheumatology (OMERACT) Filter 2.0 definition to define an outcome as a result that is amenable to change due to the effects of a health intervention [21]. We excluded citations that were published abstracts only.

Study selection
We performed two levels of screening for all articles retrieved by our searches. For the peer-reviewed electronic database searches, level one (title/abstract) screening was conducted by two independent reviewers (among AR, KT, MP). We used a liberal accelerated screening approach, whereby a single reviewer needed to classify a citation as potentially eligible in order for it to advance to the next level but two reviewers had to independently exclude a citation in order for it to be removed. For the supplemental long-term follow-up, grey literature, and COMET database searches, a single author (MP) performed level one screening. For all searches, level two (full text) screening was conducted by two independent reviewers (among AR, KT, MP), who were required to fully agree on the inclusion or exclusion of an article; conflicts were resolved by consensus discussion or with a third team member.
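The two screening rules described above can be written out as a short sketch. This is illustrative only: the function names and the returned labels are assumptions introduced here, not part of the published protocol.

```python
from typing import Sequence


def level_one_decision(votes: Sequence[bool]) -> str:
    """Title/abstract screening under a liberal accelerated approach.

    votes holds one entry per reviewer; True means "potentially eligible".
    One eligible vote is enough to advance a citation, whereas removal
    requires two independent exclusion votes.
    """
    if any(votes):
        return "advance"
    if len(votes) >= 2:
        return "exclude"
    # A single exclusion vote is not sufficient to remove the citation.
    return "needs second reviewer"


def level_two_decision(reviewer_a: bool, reviewer_b: bool) -> str:
    """Full-text screening: both reviewers must agree on the decision."""
    if reviewer_a == reviewer_b:
        return "include" if reviewer_a else "exclude"
    # Disagreement goes to consensus discussion or a third team member.
    return "resolve conflict"


if __name__ == "__main__":
    print(level_one_decision([False, True]))
    print(level_two_decision(True, False))
```

The asymmetry at level one favours sensitivity: a citation is only lost when two reviewers independently agree that it is ineligible.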

Data extraction and synthesis
One reviewer extracted outcomes from eligible studies into an electronic spreadsheet and a second reviewer verified the extracted data. Data collection included information about study characteristics, outcome names and definitions exactly as described by the study authors for outcomes meeting the modified OMERACT definition as described above, and outcome measurement instruments when these were specified by the authors.
Through group discussion among a subset of study investigators with clinical and methodological expertise (JI, PC, MTG, BKP), we combined and renamed outcomes that were conceptually similar to arrive at a set of 'unique' outcomes [22]. Outcomes representing a subconcept of a broader outcome were kept separate when the outcome was thought to be particularly clinically important and was also perceived as a plausible trial outcome that could be selected over the broader outcome. Group discussion led to the creation of outcome domains to further group related outcomes based on an overarching concept (for example, physical growth and anthropometry, child life impact, health services use and costs) [7].

Assignment of outcomes to core areas
We mapped outcome domains onto one of five mutually exclusive, a priori defined core areas, four of which were described in the OMERACT Filter 2.0 framework, a literature-informed and consensus-developed process for establishing core outcome sets endorsed by COMET: death, life impact, pathophysiological manifestations, and resource use [7,21]. The fifth core area of growth and development was included due to our focus on children, following the approach used in a previous pediatric COS study [22]. Mapping of domains onto core areas was achieved through group discussion and was based on the alignment of the domain's overarching theme with the description of the core areas from the original sources [21,22]. The death core area covers general, disease-specific, and intervention-specific causes of death. The life impact core area includes quality of life, patient perceptions of health, psychosocial impact, and secondary impacts on caregivers. Resource use covers the direct and indirect economic impact of the disease on an individual and society. The pathophysiological manifestations core area covers the physiological and biochemical impact of the disease on the body's organs and function, and also includes disease biomarkers [21]. The growth and development core area incorporates outcomes measuring the impact of the disease on the physical growth and cognitive development of the child [22].
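As a rough illustration of the mapping step, the sketch below stores each domain under exactly one core area and rejects any area outside the five defined a priori. The data structure and function name are assumptions made for this example, and only a handful of the domain names mentioned in this article are shown; the assignment itself was made by group discussion, not by code.

```python
from collections import defaultdict

# The five a priori core areas named in the text.
CORE_AREAS = {
    "death",
    "growth and development",
    "life impact",
    "pathophysiological manifestations",
    "resource use",
}

# Partial, illustrative mapping. A dict allows one core area per domain,
# which mirrors the mutually exclusive assignment.
DOMAIN_TO_CORE_AREA = {
    "physical growth and anthropometry": "growth and development",
    "cognition and development": "growth and development",
    "acute disease-specific manifestations": "pathophysiological manifestations",
    "health service use and costs": "resource use",
    "child and caregiver/family life impact": "life impact",
}


def group_domains(mapping: dict) -> dict:
    """Group domains by core area, rejecting unknown core areas."""
    grouped = defaultdict(list)
    for domain, area in mapping.items():
        if area not in CORE_AREAS:
            raise ValueError(f"unknown core area for {domain!r}: {area!r}")
        grouped[area].append(domain)
    return dict(grouped)


if __name__ == "__main__":
    for area, domains in sorted(group_domains(DOMAIN_TO_CORE_AREA).items()):
        print(area, "->", sorted(domains))
```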

Identification of studies
The initial MCAD deficiency, PKU, and newborn screening long-term follow-up initiative database search strategies identified 6072 citations (Fig. 1). From these, 566 studies were determined to be eligible for data extraction. Due to time constraints and the large number of studies that were eligible for PKU, we decided to limit the initial data extraction for articles retrieved in the PKU search to those published in 2001 or more recently; we then reviewed outcomes from studies from each previous year in turn, starting with 2000, until no new unique outcomes for PKU were identified. We initially extracted data from 308 eligible PKU-relevant articles published between 2001 and 2017. A review of an additional 16 PKU-relevant articles published in 2000 did not identify any additional unique outcomes, so PKU-relevant studies published before 2000 were excluded (see Additional file 2). Only one of the 18 eligible articles from the COMET search reported a new relevant outcome that had not been identified by studies from the other searches and was retained for data synthesis. In total, 378 articles were included in our review (see Additional file 3).
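The year-by-year stopping rule used for the PKU literature can be sketched as follows. The function name, the floor year argument, and the toy outcome labels are hypothetical and serve only to show the logic; they are not data from the review.

```python
def backward_saturation(outcomes_by_year: dict, start_year: int, floor_year: int):
    """Extract from start_year onward, then step back one year at a time.

    Stops at the first earlier year that contributes no new unique
    outcome. Returns the set of unique outcomes and the earliest year
    that was reviewed.
    """
    seen = set()
    for year, outcomes in outcomes_by_year.items():
        if year >= start_year:
            seen |= set(outcomes)

    earliest_reviewed = start_year
    for year in range(start_year - 1, floor_year - 1, -1):
        earliest_reviewed = year
        new = set(outcomes_by_year.get(year, ())) - seen
        if not new:
            break  # saturation: this year added nothing new
        seen |= new
    return seen, earliest_reviewed


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the review.
    toy = {
        2002: {"phenylalanine concentration", "cognition"},
        2001: {"physical growth"},
        2000: {"phenylalanine concentration"},
        1999: {"bone density"},
    }
    outcomes, earliest = backward_saturation(toy, start_year=2001, floor_year=1990)
    print(sorted(outcomes), earliest)
```

In this toy run the year 2000 adds nothing new, so review stops there and the 1999 outcome is never seen, which is the trade-off of this kind of stopping rule.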

Outcomes for MCAD deficiency

Study characteristics
Our search strategies identified a total of 52 articles relevant to MCAD deficiency: 35 MCAD deficiency studies were identified from our disease-specific database searches, 16 from newborn screening long-term follow-up studies, and one from a previously published pediatric COS. Most articles were published after 2009 (58%) and reported on observational studies (85%) (Table 1). With respect to participant ages in primary studies, the focus was commonly on newborns only (45%) or children (40%), with a minority of studies including children and adults. Articles reported or discussed a median of four unique outcomes (range = 1-28), with most including five or fewer outcomes per article (67%).

MCAD deficiency outcomes within domains and core areas
Data extraction initially yielded 230 MCAD deficiency outcomes. The study team combined outcomes covering similar concepts, yielding 83 unique MCAD deficiency outcomes (Table 2). Unique outcomes were grouped into 10 domains within the core areas. There was a relatively even distribution of studies measuring outcomes across the core areas for MCAD deficiency. The most frequently reported and most diverse core area was pathophysiological manifestations, which contained three domains and 33 unique outcomes reported by 29/52 articles (56%).
Death was the most commonly reported or discussed unique outcome for MCAD deficiency (24/52 articles, 46%) (Table 2). The next most frequently reported outcomes were cognition and intelligence/IQ, hospitalization, metabolic decompensation, and overall child development, which were each included in 14 articles (27% of articles). Among the other frequently reported outcomes (reported in more than five studies), four were in the pathophysiological manifestations core area (three in the acute disease-specific manifestations domain, one in the non-disease-specific symptoms and disorders domain), two were in the growth and development core area (one in the cognition and development domain, and one in the physical growth and anthropometry domain), two were in the resource use core area in the health service use and costs domain, and one was in the life impact core area under the child and caregiver/family life impact domain. One-third of the outcomes (27/83 outcomes) were reported by a single article.

Changes over time to reported outcomes for MCAD deficiency
We observed changes over time in the frequency of reporting for some MCAD deficiency outcomes identified in our review (Table 2).

Outcome measurement instruments for selected MCAD deficiency outcomes
We summarized data for outcome measurement instruments associated with neuro-psychological outcomes and/or outcomes that were typically measured using self- or parent-reported questionnaires. Among 25 articles measuring such outcomes for MCAD deficiency, we identified 11 outcome measurement instruments associated with 11 unique outcomes (see Additional file 4). The most diversely measured outcome was cognition and intelligence/IQ: eight unique instruments were reported by 14 articles, with the Wechsler Intelligence Scales family of measurement instruments [23] being the most frequently specified (3/14 articles, 21%). The only other outcome with more than one reported measurement instrument was sensorimotor and motor functioning, with four unique tools used in seven articles. One measurement instrument was specified for each of caregiver/family psychosocial well-being, behaviour problems and externalizing mental health or behaviour disorders, internalizing behaviour, and overall child development.
Studies reporting child quality of life, parental experiences with illness care and prevention, attention-deficit hyperactivity disorder (ADHD) or ADHD-like symptoms, autism spectrum disorder (ASD) or ASD-like symptoms, and tic disorder were unclear about which outcome measurement instruments were used. There were not enough data available to assess any changes over time in how outcomes were measured for MCAD deficiency.

Outcomes for PKU

Study characteristics
PKU data synthesis included 343 articles consisting of 326 PKU-specific articles, 16 newborn screening long-term follow-up studies, and one pediatric COS. Similar to MCAD deficiency, over half of the articles included for PKU data synthesis were published after 2009 (56%) and mainly consisted of observational studies (83%) (Table 1). A majority of articles were focused on children (56%), with smaller proportions including children and adults combined (32%) or newborns only (11%). A number of studies clearly included children but did not specify the age ranges of the included participants and, as described for MCAD deficiency, further results were not broken down by age group. Articles reported or discussed a median of three unique outcomes (range = 1-25), with the majority including five or fewer outcomes (76%).

PKU outcomes within domains and core areas
Data extraction initially yielded 565 outcomes. The study team combined outcomes covering similar concepts into 97 unique PKU outcomes (Table 3). Unique outcomes were grouped into 11 domains within the five core areas. The most frequently represented core area was pathophysiological manifestations, which contained two domains and 31 unique outcomes reported by 281/343 articles (82%). The most diverse core area was life impact, with five domains and 44 unique outcomes reported by 156/343 articles (45%). Phenylalanine concentration in the blood and other tissues was the most common unique outcome reported or discussed among the articles included in our review (228/343 articles, 66%). The next most frequently reported outcomes were cognition and intelligence/IQ (82/343 articles, 24%) and metabolic syndrome/energy metabolism (52/343 articles, 15%). Among the 10 most frequently reported unique outcomes, five were within the pathophysiological manifestations core area (two in the monitoring of disease-specific biomarkers and surrogate outcomes domain, three in the monitoring of non-disease-specific biomarkers and surrogate outcomes domain), four were within the growth and development core area (three from the physical growth and anthropometry domain, one from cognition and development), and one was from the life impact core area (from the disease management and feeding behaviour domain). Sixteen percent of outcomes (16/97 outcomes) were reported by a single study.
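The proportions quoted for both diseases can be re-derived from the counts given in the text. The snippet below is a simple arithmetic check; the helper names are introduced here for illustration, and the final total assumes that the 16 newborn screening follow-up studies and the one pediatric COS are the same articles in both the MCAD deficiency and PKU syntheses.

```python
def pct(numerator: int, denominator: int) -> int:
    """Share expressed as a whole-number percentage."""
    return round(100 * numerator / denominator)


# label: (numerator, denominator, percentage quoted in the text)
REPORTED = {
    "MCAD: articles with pathophysiological outcomes": (29, 52, 56),
    "MCAD: articles reporting death": (24, 52, 46),
    "MCAD: outcomes reported by a single article": (27, 83, 33),
    "PKU: articles with pathophysiological outcomes": (281, 343, 82),
    "PKU: articles with life impact outcomes": (156, 343, 45),
    "PKU: articles reporting phenylalanine concentration": (228, 343, 66),
    "PKU: articles reporting cognition and intelligence/IQ": (82, 343, 24),
    "PKU: outcomes reported by a single study": (16, 97, 16),
}


def mismatches(reported: dict) -> list:
    """Labels whose recomputed percentage differs from the quoted one."""
    return [label for label, (n, d, quoted) in reported.items() if pct(n, d) != quoted]


# Article counts by source, as given in the study characteristics.
MCAD_ARTICLES = 35 + 16 + 1   # disease-specific + follow-up + COS
PKU_ARTICLES = 326 + 16 + 1   # disease-specific + follow-up + COS
# Assumption: the 16 follow-up studies and the one COS are shared.
TOTAL_ARTICLES = 35 + 326 + 16 + 1

if __name__ == "__main__":
    print(mismatches(REPORTED))
    print(MCAD_ARTICLES, PKU_ARTICLES, TOTAL_ARTICLES)
```

An empty list from `mismatches` means every quoted percentage agrees with its numerator and denominator after rounding to the nearest whole number.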

Changes over time to reported outcomes for PKU
We observed changes over time in the frequency of reporting for some unique outcomes identified in the reviewed studies related to PKU.

Discussion
We reviewed pediatric literature related to MCAD deficiency and PKU to identify the scope of reported and recommended outcomes. Similar to reviews of outcomes reporting in other clinical areas [26][27][28], we found substantial diversity of outcomes reported across the five core areas of outcome measurement for both diseases.
Notably, almost a third of outcomes for MCAD deficiency and over 15% for PKU were reported in only a single study. With little overlap of outcomes across studies, this suggests limited potential for combining or comparing results across published studies for these diseases and highlights the potential value of developing COSs for MCAD deficiency and PKU. For MCAD deficiency, there was relatively equal representation of each of the five core areas: approximately half of published studies (46-56%) incorporated outcomes within each core area, with the exception of resource use (35%). The most frequently reported outcomes for MCAD deficiency were focused on the risk of life-threatening consequences and manifestations of the disease. This emphasis on mortality-related outcomes is seen in COSs for other potentially life-threatening conditions like post-partum haemorrhage [29] and fetal growth restriction [30]. By contrast, for PKU, there was a dominance of the pathophysiological manifestations core area, similar to that seen in a review of outcomes for type II diabetes [28], and for children with feeding tubes and neurological impairments [22]. The focus on pathophysiological manifestations for PKU reflects frequent reporting of the specific outcome blood phenylalanine concentration. Blood phenylalanine is a well-established surrogate indicator of clinical symptoms for PKU and has been used as a marker of treatment adherence in treatment guidelines and as a clinical trial outcome [13][14][15]. Industry-sponsored studies may also be more likely to incorporate short-term and surrogate outcomes in evaluating treatments for rare diseases [4], which could contribute to the prominence of pathophysiological end-points in studies of PKU. However, the relatively small number of articles on PKU that reported patient-oriented outcomes (as compared with blood phenylalanine) is potentially of concern.
Specifically, patient-oriented outcomes that reflect the lived experience of patients and their caregivers have emerged as a key priority for evaluative studies in the field of rare diseases [4]. The life impact core area, which covers many such outcomes, appeared relatively well-represented in articles reporting on MCAD deficiency, although most individual outcomes within this core area were themselves reported only once or twice, perhaps reflecting a lack of consensus on which aspects of life impact are of highest priority for measurement. In the PKU literature, the life impact core area was less commonly represented but, similar to MCAD deficiency, it was by far the most diverse core area, with 44 unique outcomes. This suggests a need to work directly with patients and their family members to identify which patient-oriented outcomes are most meaningful to measure in future research for both diseases. Furthermore, there was diversity in the specific outcome measurement instruments reported for many of the patient- or caregiver-reported outcomes for both diseases. There are very few disease-specific questionnaires for MCAD deficiency and PKU, likely because the small patient populations make the development and validation of such outcome measurement instruments challenging [6]. Thus, in addition to understanding which outcomes are most highly prioritized, there is a need to select the generic instruments (or develop disease-specific instruments) that best capture the life impact of MCAD deficiency and PKU for patients and their families.
This review has several strengths. The study followed a published protocol written in collaboration with patient partners, and established methodology as reported in the PRISMA statement and COMET handbook [7,18]. Our search for relevant articles was extensive, covering electronic databases of peer-reviewed literature, supplemented by a grey literature search guided by Grey Matters [20], additional search strategies for long-term follow-up initiatives of newborn screening, and a search of the COMET database for pediatric COSs [8]. Through data extraction and synthesis we created a comprehensive list of all of the outcomes reported or recommended in the MCAD deficiency and PKU literature that can be used to support the development of COSs for these diseases.
Our review also has limitations. Despite our comprehensive search strategy, the non-specific nature of index terms for newborn screening long-term follow-up initiatives made searching for studies in that area challenging and we may have missed relevant published articles. Due to the size of the PKU literature we opted to extract literature from the year 2000 onward and we may have missed unique outcomes that were reported only before that date. We only included studies published in English for practical reasons and may have missed important literature published in other languages. We also originally planned to summarize outcomes by the age of the children studied and we considered a summary that accounted for disease severity, but we were unable to extract these variables consistently due to incomplete reporting of sample characteristics. Similarly, we did not report outcomes by other study characteristics such as follow-up time for longitudinal studies and we did not collect information about how often outcomes were collected in studies where repeat measurements would have been possible.
Our findings suggest that evaluative studies of interventions for MCAD deficiency and PKU would benefit from COSs given the multitude of outcomes in the literature. The 83 MCAD deficiency and 97 PKU outcomes that we identified constitute a list of candidate core outcomes for the next stage of COS development, which involves a consensus process to narrow down the list to a small number of outcomes that are of highest priority for collection in future studies [7]. In order to make final recommendations about outcomes and also outcome measurement tools, the consensus process must involve multiple stakeholders, including patients and family members, to ensure that future evaluative research is patient-oriented and focused on meaningful outcomes. This is particularly important given the large number of outcomes and outcome measurement instruments that we identified in the life impact core area for both diseases, and the focus on pathophysiological rather than patient-oriented outcomes in the PKU literature.

Conclusions
Substantial heterogeneity exists in the outcomes reported in the MCAD deficiency and PKU literature and a diversity of outcome measurement instruments was used to measure many of these outcomes. This lack of consistency impedes comparisons among studies and limits the potential for data synthesis, leading to inefficient use of limited resources available for evaluating the effectiveness of new and existing interventions. Our findings suggest that future studies of the effectiveness and comparative effectiveness of interventions for pediatric MCAD deficiency and PKU would benefit from disease-specific COSs.