- Poster presentation
- Open Access
Characterization and classification of Rare Disease Registries by using exploratory data analyses
Orphanet Journal of Rare Diseases volume 9, Article number: P4 (2014)
The European Commission and patient associations identify registries as strategic instruments for improving knowledge in the field of Rare Diseases [1, 2]. Interoperability between Rare Diseases Patient Registries (RDPR) is especially needed to support research activities, to validate therapeutic treatments and to plan public health actions. Because of the extreme variety of RDPR, a uniform and standardized way of collecting data is needed, together with the identification of specific levels of connection between RDPR with similar aims.
In this study, exploratory data analyses were applied to the EPIRARE (European Platform for Rare Diseases Registries) Registry Survey in order to generate a macro-classification and characterization of RDPR and to explore their different information needs.
First, a Multiple Correspondence Analysis (MCA) suggested associations between selected variables characterizing the structure of RDPR (Figure 1). Then, a cluster analysis (CA) was performed on the declared “Aims” of each RDPR. CA confirmed the variable associations that emerged from the MCA and identified three groups: Public Health Registries (PHR), Clinical-Genetic Research Registries (CGRR), and Treatment Registries (TR). Finally, the random forest (RF) method was applied to the Survey data, yielding six classification models with good predictive power and thus supporting the division of RDPR into three groups. RF also identified several informative variables that characterize the three categories of RDPR, which are defined by data of different nature and by different levels of diffusion (Table 1).
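The cluster analysis step can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not state which clustering algorithm or distance was used, so average-linkage agglomerative clustering with the Jaccard distance stands in here, and the registry names (`R1` to `R6`) and aim labels are hypothetical toy data, not values from the EPIRARE Survey.

```python
# Toy sketch: group registries by their declared aims.
# Assumptions (not from the source): Jaccard distance, average linkage,
# and the hypothetical registries and aim labels below.

REGISTRIES = {
    "R1": {"epidemiology", "planning"},
    "R2": {"epidemiology", "planning", "natural_history"},
    "R3": {"natural_history", "genotype"},
    "R4": {"genotype", "natural_history"},
    "R5": {"treatment_safety", "treatment_efficacy"},
    "R6": {"treatment_efficacy", "treatment_safety", "natural_history"},
}


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; two empty sets are treated as identical."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def average_linkage(c1, c2, data):
    """Mean pairwise distance between the members of two clusters."""
    pairs = [(x, y) for x in c1 for y in c2]
    return sum(jaccard_distance(data[x], data[y]) for x, y in pairs) / len(pairs)


def agglomerate(data, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in sorted(data)]
    while len(clusters) > n_clusters:
        candidates = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = min(
            candidates,
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]], data),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


if __name__ == "__main__":
    for group in agglomerate(REGISTRIES, n_clusters=3):
        print(group)
```

In this toy data the three resulting groups loosely mirror the public health, clinical-genetic research and treatment profiles described above. The group labels obtained this way could then serve as the target variable for a random forest classifier trained on the remaining Survey variables, which is the role the RF step plays in the study.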
These results, which identify different profiles of RDPR and their specific information needs, provide support for guiding the design of a European platform for Rare Diseases. Identifying informative cores could guide the development of a platform able to enhance the sharing of information between RDPR with common aims and also to facilitate a coherent dialogue between RDPR with different profiles.
Guide to interpretation: the arrows indicate the directions of association among the aims; the size of each circle represents the frequency of the variable. The higher the coordinate and the frequency of a variable, the more it contributes to the interpretation of the factorial axis; variables lying in the same direction are correlated.
1. Commission of the European Communities: Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions on Rare Diseases: Europe's challenges. 2008, Brussels, COM(2008) 679 final. Available at: http://ec.europa.eu/health/ph_threats/non_com/docs/rare_com_en.pdf
2. Council recommendation of 8 June 2009 on an action in the field of rare diseases. Official Journal of the European Union. 2009/C 151/02. Available at: http://eurlex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:C:2009:151:0007:0010:EN:PDF
This work is part of the activities of EPIRARE, a 3-year project that started on April 15, 2011 (grant 2010 12 02) and is co-funded by the European Commission within the EU Programme on Health.
Alessio Coi and Michele Santoro contributed equally to this work.
Coi, A., Santoro, M., Lipucci, M. et al. Characterization and classification of Rare Disease Registries by using exploratory data analyses. Orphanet J Rare Dis 9, P4 (2014). https://doi.org/10.1186/1750-1172-9-S1-P4
- Random Forest
- Rare Disease
- Treatment Registry
- Informative Support
- Exploratory Data Analysis