Does the registry speak your language? A case study of the Global Angelman Syndrome Registry
Orphanet Journal of Rare Diseases volume 18, Article number: 330 (2023)
Global disease registries are critical to capturing common patient-related information on rare illnesses, allowing patients and their families to provide information about their condition in a safe, accessible, and engaging manner that enables researchers to undertake critical research aimed at improving outcomes. Typically, English is the default language of choice for these global digital health platforms. Unfortunately, language barriers can significantly inhibit participation from non-English speaking participants. In addition, there is potential for compromises in data quality and completeness. In contrast, multinational commercial entities provide access to their websites in the local language of the country they are operating in, and often provide multiple options reflecting ethnic diversity. This paper presents a case study of how the Global Angelman Syndrome Registry (GASR) has used a novel approach to enable multiple language translations for its website. Using a “semi-automated language translation” approach, the GASR, which was originally launched in English in September 2016, is now available in several other languages. In 2020, the GASR adopted a novel approach using crowd-sourcing and machine translation tools, leading to the availability of the GASR in Spanish, Traditional Chinese, Italian, and Hindi. As a result, enrolments increased by 124% for Spain, 67% for Latin America, 46% for Asia, 24% for Italy, and 43% for India. We describe our approach here, which we believe presents an opportunity for cost-effective and timely translations that are responsive to changes to the registry and helps build and maintain engagement with global disease communities.
Angelman syndrome (AS) is a severe neurodevelopmental disorder caused by dysfunction of the maternally inherited UBE3A gene. It is estimated that 500,000 people live with AS worldwide. Global rare disease registries are a valuable tool for enhancing therapeutics in rare diseases, enabling participant recruitment and the capture and monitoring of patient reported outcomes, amongst other uses. Despite the need to be global, there is a lack of diversity in the languages available on global registry websites. This is, unfortunately, common in medicine and science, where the fact that the common scientific language is English has spilled over into an apparent insistence that research involving participants from non-English speaking countries must be conducted in English. In contrast, no multinational commercial entity would survive if it took this approach and, as a result, their websites are available in myriad local languages. For example, the website for the global movement Rare Disease Day is available in 103 languages besides English, while the European Commission websites strive to be available in all 24 recognised European languages. Registry development guidelines, including the fourth edition of the guide Registries for Evaluating Patient Outcomes released by the Agency for Healthcare Research and Quality and the Rare Diseases Registry Program, stress the importance of careful translation of multinational registries. Other international registries such as the Hyperinsulinism Global Registry or Global Prader Willi Registry are intending to incorporate multiple languages [6, 7]. Another strategy is to establish a federation of linked registries for a rare disease to achieve global coverage of patients. Towards this end, the Global Angelman Syndrome Registry is open to data linkages with other data sets including Natural History Studies and registries such as the Angelman Syndrome Online Registry.
Barriers to making services multilingual include lack of access to translators, and the need for technical expertise or an understanding of the topic of the registry. Tools such as Google Translate have demonstrated that technology can be used to assist with this, although native speaker input is still required to ensure accuracy and readability of translations.
Findings from a review of articles on methodological approaches to the cross-cultural adaptation of surveys and tools indicated that translators should be fluent in the source and target languages, understand both cultures, and be knowledgeable about the content of the instrument being adapted. Addressing each of these requirements may be challenging, as professional translators may not be subject matter experts and may lack specialised content knowledge. Involving more than one translator in the process may be beneficial to offer a mix of perspectives with respect to language fluency, cultural understanding, and content knowledge. However, this may prove difficult due to challenges around document sharing, version control, and managing the division of workload, all of which can inhibit translator interactions. Additionally, reconciliation and review of translations by an expert panel, and cognitive interviews or pilot testing with focus groups, should be undertaken to determine the face and content validity of translated instruments. There is limited evidence for the value of back translations.
Use of technology to facilitate translations
Machine translation (MT) involves using software tools to translate text or speech from the source language to the target language. The process is automated and may involve different approaches including rules created by linguists and computer scientists, examples from a database of source and target language sentences, and statistical modelling of the probability that a target sentence is the correct translation of a source sentence.
Translation memories (TM) are a related technology in which previously completed human translations, including the source text and translated text, are stored in a database, and segments of text (such as a sentence) from the TM database are matched with new source text to create translations. Matches may be exact, meaning identical including formatting; full, meaning identical apart from differences such as numbers or dates; or fuzzy, meaning similar but requiring editing.
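The three match types can be illustrated with a minimal Python sketch. This is not how Crowdin or any particular TM product implements matching; the example segments, the `FUZZY_THRESHOLD` value, and the function names are all assumptions made for illustration, and string similarity is approximated with the standard library's `difflib.SequenceMatcher`.

```python
import re
from difflib import SequenceMatcher

# Hypothetical translation memory: source segment -> stored human translation.
TM = {
    "Date of birth": "Fecha de nacimiento",
    "Visit 1 of 3": "Visita 1 de 3",
    "Does your child have seizures?": "¿Su hijo/a tiene convulsiones?",
}

# Assumed similarity cut-off for a fuzzy match; real tools make this configurable.
FUZZY_THRESHOLD = 0.75


def mask_numbers(text):
    """Replace every run of digits with '#' so numbers and dates are ignored."""
    return re.sub(r"\d+", "#", text)


def classify_match(new_source, tm_source):
    """Label a TM candidate as 'exact', 'full', 'fuzzy' or 'none'."""
    if new_source == tm_source:
        return "exact"  # identical, including formatting
    if mask_numbers(new_source) == mask_numbers(tm_source):
        return "full"   # differs only in numbers
    ratio = SequenceMatcher(None, new_source.lower(), tm_source.lower()).ratio()
    if ratio >= FUZZY_THRESHOLD:
        return "fuzzy"  # similar, but the stored translation needs editing
    return "none"


def best_match(new_source, memory):
    """Return (match type, TM source, TM translation) for the strongest match."""
    rank = {"exact": 3, "full": 2, "fuzzy": 1, "none": 0}
    best = ("none", None, None)
    for tm_source, translation in memory.items():
        kind = classify_match(new_source, tm_source)
        if rank[kind] > rank[best[0]]:
            best = (kind, tm_source, translation)
    return best
```

Calling `best_match("Visit 2 of 3", TM)` would surface the stored translation for "Visit 1 of 3" as a full match, leaving only the number to be updated by a proofreader.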
Crowdsourcing refers to an organisation (such as a research institution or not-for-profit) outsourcing a task previously undertaken internally to an external community to complete a task or solve a problem for mutual benefit. In research, crowdsourcing has been used for a range of tasks including identification and classification, transcription or translation, and data collection and analysis. Organisations including Cochrane and Technology, Entertainment and Design (TED) talks involve volunteer translators to translate resources in recognition of the fact that most people globally do not speak English as a first language.
Rare disease registries may present a unique opportunity for crowdsourcing translations, as rare disease communities often drive the development of registries and have strong involvement in registry governance and ownership. While crowdsourcing may seem advantageous in this context, projects must be managed effectively to prevent negative outcomes such as translator or researcher burnout or malicious translations. Blohm et al. reviewed the management and governance of a variety of crowdsourcing projects and determined a four-step process for running a crowdsourcing project: (1) Define Goal and System Type; (2) Start Small and Experiment; (3) Build up Scalable Structures and (4) Adapt and Monitor Governance (p. 143).
The Global Angelman Syndrome Registry was launched in English in September 2016. The registry was sponsored by the Foundation for Angelman Syndrome Therapeutics (FAST) Australia. The aims of the registry include:
Facilitate participant recruitment for clinical trials;
Collect the natural history of a large cohort of individuals with AS;
Identify demographic, phenotypic and genotypic variation in clinical features and outcomes; and
Aid in service provision planning for individuals with AS and their families.
The registry was originally deployed using the Rare Disease Registry Framework (RDRF) [18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33], an internet-based modular registry framework, developed by the Centre for Comparative Genomics at Murdoch University.
Since its launch, AS organisations have expressed an interest in translation into multiple languages. Initially, the translation process was to include three steps: (1) Forward translation; (2) Back translation; and (3) Pilot testing. The forward translation process was completed for Italian, Spanish, French and Hebrew, and partially completed for Portuguese and Chinese.
In 2020, the registry was moved to the Trial Ready Registry Framework (TRRF) [30, 32, 34,35,36,37,38,39]. The new platform incorporated significant revisions based on feedback from families, clinicians and researchers. Changes included revisions to both content and functionality to simplify the user experience in completing forms, enable longitudinal data collection and user managed linkages with clinicians and researchers, and integrate translations and analytics.
Because of these changes to registry content and function, the likely need to update existing translations, and the planned addition of more languages, alternative translation methodologies that were less burdensome on the community and research team were explored. The current study reports on the establishment of the GASR translation project.
To facilitate timely translation of the GASR, Crowdin was selected as a tool to integrate existing translations (converted to TM) and MT provided by Crowdin software with crowdsourced translations from the Angelman community, and to manage the translation process.
The GASR features a series of forms, or modules, that collect information on the condition of an individual with AS. The forms cover demographic, clinical, behavioural and developmental information. The content of the GASR modules and other patient facing information including the registration form, standard emails, and website messages constituted the information to be translated. There were approximately 13 000 words on the GASR across the 20 sections to be translated.
Prior to the availability of the translations, 1625 families had joined the registry, 1614 of whom had provided geographic data. As shown in Table 1, most families were from English speaking countries.
Governance and management of crowdsourcing
A Crowdin Enterprise project (https://eresearchqut.crowdin.com/), hereafter referred to as Crowdin, was established to manage the project. Crowdin allows for different levels of access to ensure that participants only have access to tasks to which they are assigned by the project administrator. As the registry was pre-translated, the volunteers were assigned one of two roles: community proofreader or final proofreader. Community proofreaders review and modify existing machine translations, while final proofreaders are trusted professionals from the AS community who review and correct community proofread translations.
Proofreaders access the registry content via a testing site located at https://trrf.qa.angelmanregistry.info/. The website enables users to view the translations in the context of the website and access the Crowdin editor tool for each string. The editor tool shows the source text and several translations obtained from translation memory files or machine translations. The proofreader can revise, add or approve translations, and leave comments for other project team members.
Community proofreaders participated in a small group training session with one of the authors (M.T.), who explained the workflow of the project and how to use the Crowdin tool. For the purposes of scalability, one language was selected for pilot testing and refining the translation process. To date, training sessions have been held with community proofreaders for the Italian, Spanish, Chinese, Portuguese and French languages. Hindi was subsequently translated by an external company capable of interfacing with Crowdin.
The author running the training sessions checked in with a nominated translator from each group weekly to receive updates about progress and obtain feedback.
The registry team submitted ethics amendments to incorporate translations into the protocol. Approval was granted for the translation of registry materials using the methods described above from the Mater Health Services Human Research Ethics Committee (HREC/13MHS/76/ Project 20,865).
The Spanish, Traditional Chinese, Italian and Hindi versions of the registry were launched in 2022 on 6 January, 22 March, 27 April and 13 October, respectively. Growth in registry enrolments following translation, for each country or region where the language is spoken, is shown in Fig. 1, along with current totals.
Observations and feedback from the translation process
The Crowdin tool was user friendly. Proofreaders utilised a mix of the Crowdin editor tool and in-context editing tool. The editor tool displayed the source text and available translations side by side, enabling users to correct existing translations or add new translations. The Crowdin editor tool was the primary method used for proofreading, as all strings were displayed in the interface, ensuring comprehensive proofreading. The in-context editing tool was implemented on a test website which replicated the GASR site, which was valuable for viewing how the translations would appear to users in the context of the registry. A list of examples of source text, machine translations, and community and professional proofreading corrections is shown in Table 2.
A group of at least three proofreaders was advantageous
Larger groups reduced the workload for individual proofreaders and enabled more comprehensive review of translations prior to the final proofreading step. For instance, the Spanish proofreading group identified that the word for “boy” and “child” was the same and could thus lead to Spanish speaking families reading sections of the registry as “boy/ adult” rather than “child/ adult.” The author in communication with the translation team (MT) was able to put them in contact with another author (RB) whose first language was Spanish.
Preliminary validation findings are encouraging
Although validation is a separate step beyond translation, the authors compared 107 English and 55 Spanish responses to the Newborn and Infancy module completed since the translations were implemented. This module was selected because it is the first module users encounter in the registry, it had the highest completion rate, and responses were thought to be less impacted by the age and genotype of the person with Angelman syndrome. Responses to Likert scale items from the Newborn and Infancy module are shown in Table 3. A series of 25 chi-square tests were conducted, with a Bonferroni adjustment giving an adjusted alpha level of 0.002 (0.05 divided by 25 tests). Out of the 25 questions, only two demonstrated significant differences between the English and Spanish samples, reflecting that:
Spanish speaking parents perceived their infant with Angelman syndrome to be placid more frequently than English speaking parents.
English speaking parents perceived their infant with Angelman syndrome to experience more frequent reflux/gastro/oesophageal problems than Spanish speaking parents.
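The testing procedure described above can be sketched in Python using only the standard library. This is an illustration of a Pearson chi-square test with a Bonferroni-adjusted threshold, not the authors' analysis code; the function names are hypothetical, the response counts are invented for demonstration and are not registry data, and the sketch assumes every expected cell count is greater than zero.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    `table` is a list of rows (e.g. one row per language group), each a list
    of counts per response category.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def chi_square_p_value(stat, dof):
    """Upper-tail probability of a chi-square variable with integer dof >= 1."""
    y = stat / 2.0
    if dof % 2 == 0:
        k = dof // 2
        return math.exp(-y) * sum(y ** i / math.factorial(i) for i in range(k))
    k = (dof - 1) // 2
    tail = math.erfc(math.sqrt(y))
    tail += math.exp(-y) * sum(
        y ** (i - 0.5) / math.gamma(i + 0.5) for i in range(1, k + 1)
    )
    return tail


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold after Bonferroni adjustment."""
    return alpha / n_tests


# Invented counts for one Likert item (rows: two language groups;
# columns: three response categories). NOT data from the registry.
example_table = [[30, 40, 30], [20, 15, 15]]
stat, dof = chi_square_statistic(example_table)
p_value = chi_square_p_value(stat, dof)
threshold = bonferroni_alpha(0.05, 25)  # 0.05 / 25 = 0.002
is_significant = p_value < threshold
```

With 25 items tested, a difference on any single item is only treated as significant when its p-value falls below the adjusted threshold of 0.002, which limits the chance of false positives across the whole module.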
The GASR was translated into Spanish, Traditional Chinese, Italian and Hindi. After completion of the community and final proofreading steps, acceptable translations were obtained and made available to the Angelman community. During our experience of managing and governing the translation and proofreading project, we refined the process to reduce the burden on the research team and our proofreaders for future translations utilising crowdsourcing. These refinements relate to the process of proofreading and the management of translation projects.
With respect to establishing a translation project for future languages, our first step is to source machine translations to create the initial language translation on Crowdin. The second step is to break the registry content into individual tasks based on word count and create a document with (1) links to the Crowdin editor for each task, (2) task name and description, and (3) task word count. The third step is to train proofreaders in completing tasks in the editing tool and in accessing the in-context tool for reviewing content within the website. The translation projects would continue to be managed by the author (MT). Further to this, greater efforts would be made to validate the translations generated. An initial validation of the Newborn and Infancy module of the registry was promising, with few differences between English and Spanish responses.
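The second step, dividing content into tasks by word count and listing them in a task document, could be sketched as follows. This is a hypothetical illustration rather than the project's actual tooling: the `Task` structure, the `max_words` limit, and the CSV layout are assumptions, and the editor link column is left blank for the project administrator to fill in because the link format is not described here.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Task:
    name: str        # task name shown to proofreaders
    strings: list    # source strings included in the task
    word_count: int  # total words across those strings


def split_into_tasks(sections, max_words=500):
    """Split each section's strings into tasks of at most `max_words` words.

    `sections` maps a section name to its list of source strings. A single
    string longer than `max_words` still becomes its own task.
    """
    tasks = []
    for section, strings in sections.items():
        batch, batch_words, part = [], 0, 1
        for text in strings:
            n_words = len(text.split())
            if batch and batch_words + n_words > max_words:
                tasks.append(Task(f"{section} (part {part})", batch, batch_words))
                batch, batch_words, part = [], 0, part + 1
            batch.append(text)
            batch_words += n_words
        if batch:
            tasks.append(Task(f"{section} (part {part})", batch, batch_words))
    return tasks


def task_sheet_csv(tasks):
    """Render the task list as CSV text: name, word count, editor link."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["task_name", "word_count", "editor_link"])
    for task in tasks:
        # The editor link is intentionally blank (to be added by hand).
        writer.writerow([task.name, task.word_count, ""])
    return buffer.getvalue()
```

Keeping tasks small and of similar size is one way to address the time pressure on volunteers, since each proofreader can see the word count before accepting a task.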
Potential limitations to the crowdsourcing approach
There were two possible limitations identified in the current study. These limitations relate to the translations, but may also be relevant to validation testing.
Participants were time poor
In some cases, participants were unavailable to complete translation tasks due to competing priorities and responsibilities. Future strategies to assist families may include recruiting a larger number of proofreaders, facilitating support and connection between proofreaders, ensuring that larger proofreading tasks are broken down into smaller chunks, and providing incentives such as a donation to their local Angelman organisation.
Community proofreaders were difficult to source for some languages
As participation in the registry was very low for some regions, such as Asia, Africa and the Middle East, it was difficult to source proofreaders for languages spoken in these regions such as Arabic or Hindi. As a result, for the Hindi language the team opted for professional translation via vendors who can integrate with Crowdin, with crowdsourcing reserved for registry revisions once families have become more engaged.
Crowdsourcing was an effective tool for upgrading translations, facilitating proofreading and integrating translated versions of the Global Angelman Syndrome Registry on an online platform. The availability of translations has led to greater participation and engagement of Angelman populations from regions where Spanish, Italian, Traditional Chinese and Hindi are spoken. The use of crowdsourcing via online translation software such as Crowdin helps to manage ongoing translation and proofreading needs for research projects and maintain community participation and buy-in. However, further efforts are needed beyond translation to validate the translation of the registry for different communities.
Availability of data and materials
Data from the Global Angelman Syndrome Registry are available upon request, subject to ethical and legal safeguards. Interested parties can request access to the data by submitting a request form at: https://www.angelmanregistry.info/registry-data-request/. Alternatively, a deidentified data set is currently being made available by the Critical Path Institute: https://c-path.org/programs/rdca-dap/overview/platform/. Please email email@example.com or firstname.lastname@example.org for more information.
Foundation for Angelman Syndrome Therapeutics [FAST]. What is Angelman Syndrome. Accessed October 4, 2022. https://cureangelman.org.au/what-is-angelman/
Rare disease day. Updated 2023. Accessed January 25, 2023. https://www.rarediseaseday.org/
European Commission. Updated 2023. Accessed January 25, 2023. https://commission.europa.eu/index_en
Agency for Healthcare Research and Quality ([AHRQ], 2020). Registries for Evaluating Patient Outcomes. Published [Date]. Accessed [Date]. URL
National Centre for Advancing Translational Sciences [NCATS]. Rare Diseases Registry Program. (RaDaR). Accessed January 25, 2023. https://registries.ncats.nih.gov/
Pasquini TLS, Mesfin M, Raskin J. HI Global Registry 2022 Annual Report. Published [Date]. Accessed January 27, 2023. https://congenitalhi.org/wp-content/uploads/2022/12/HI-Global-Registry-2022-Annual-Report.pdf
Bohonowych J, Miller J, McCandless SE, Strong TV. The global Prader-Willi syndrome registry: development, launch, and early demographics. Genes (Basel). 2019;10(9):713. https://doi.org/10.3390/genes10090713.
Forrest C, Bartek RJ, Rubinstein Y, Groft SC. The case for a global rare-diseases registry. Lancet. 2011;377(9771):1057–9. https://doi.org/10.1016/S0140-6736(10)60680-0.
Krey I, Heine C, Frömming M, Herrmann J, Møller RS, Weckhuysen S, Courage C, Beblo S, Syrbe S, Lemke JR. The Angelman syndrome online registry—a multilingual approach to support global research. Eur J Med Genet. 2021;64(12):104349. https://doi.org/10.1016/j.ejmg.2021.104349.
Epstein J, Santo RM, Guillemin F. A review of guidelines for cross-cultural adaptation of questionnaires could not bring out a consensus. J Clin Epidemiol. 2015;68(4):435–41. https://doi.org/10.1016/j.jclinepi.2014.11.021.
Anastasiou D, Gupta R. Comparison of crowdsourcing translation with machine translation. J Inf Sci. 2011;37(6):63–9. https://doi.org/10.1177/0165551511418760.
Qun L, Xiaojun Z. Machine translation. In: Chan S, editor. The Routledge encyclopedia of translation technology. Abingdon: Routledge; 2015. p. 105–19.
O’Hagan M. Computer-aided translation (CAT). In: Baker M, Saldanha C, editors. Routledge encyclopedia of translation studies. Abingdon: Routledge; 2009. p. 48–51.
Wazny K. “Crowdsourcing” ten years in: a review. J Glob Health. 2017;7(2):020602. https://doi.org/10.7189/jogh.07.020602.
Bassi H, Lee CJ, Misener L, Johnson AM. Exploring the characteristics of crowdsourcing: an online observational study. J Inf Sci. 2020;46(3):291–312. https://doi.org/10.1177/0165551519828626.
Behmen D, Marušić A, Puljak L. Capacity building for knowledge translation: a survey about the characteristics and motivation of volunteer translators of Cochrane plain language summaries. J Evid Based Med. 2019;12(2):147–54. https://doi.org/10.1111/jebm.12345.
Blohm I, Zogaj S, Bretschneider U, Leimeister JM. How to manage crowdsourcing platforms effectively. Calif Manag Rev. 2018;60(2):122–49. https://doi.org/10.1177/000812561773825.
Tones M, Cross M, Simons C, Napier KR, Hunter A, Bellgard MI, Heussler H. Research protocol: the initiation, design and establishment of the Global Angelman Syndrome Registry. J Intellect Disabil Res. 2018;62(5):431–43. https://doi.org/10.1111/jir.12482.
Hammond E, Youngs L, Bellgard M, Dawkins HSP. 33 Australasian neuromuscular disease registry (conference abstract). Neuromuscul Disorders. 2012;22(9–10):881. https://doi.org/10.1016/j.nmd.2012.06.258.
Bellgard M, Macgregor A, Janon F, et al. A modular approach to disease registry design: successful adoption of an internet-based rare disease registry. Hum Mutat. 2012;33(10):E2356–66. https://doi.org/10.1002/humu.22154.
Bellgard M, Beroud C, Parkinson K, et al. Dispelling myths about rare disease registry system development. Source Code for Biol Med. 2013;9(1):1–7. https://doi.org/10.1186/1751-0473-9-4.
Bellgard M, Render L, Radochonski M, Hunter A. Second generation registry framework. Source Code for Biol Med. 2014;9(14):1–6. https://doi.org/10.1186/1751-0473-9-14.
Bladen C, Salgado D, Monges S, et al. The TREAT-NMD DMD Global database: analysis of more than 7,000 Duchenne muscular dystrophy mutations. Hum Mutat. 2015;36(4):395–402. https://doi.org/10.1002/humu.22758.
Bellgard M, Walker C, Napier K, et al. Design of the familial hypercholesterolaemia Australasia network registry: creating opportunities for greater international collaboration. J Atheroscler Thromb. 2017;24(10):1075–84. https://doi.org/10.5551/jat.37507.
Napier K, Pang J, Lamont L, et al. A web-based registry for familial hypercholesterolaemia. Heart Lung Circ. 2017;26(6):635–9. https://doi.org/10.1016/j.hlc.2016.10.019.
Koeks Z, Bladen C, Salgado D, et al. Clinical outcomes in duchenne muscular dystrophy: a study of 5345 patients from the TREAT-NMD DMD Global Database. J Neuromuscul Dis. 2017;4(4):293–306. https://doi.org/10.3233/JND-170280.
Napier K, Tones M, Simons C, et al. A web-based, patient driven registry for Angelman syndrome: the global Angelman syndrome registry. Orphanet J Rare Dis. 2017;12(134):1–5. https://doi.org/10.1186/s13023-017-0686-1.
Bellgard M, Napier K, Bittles A, et al. Design of a framework for the deployment of collaborative independent rare disease-centric registries: gaucher disease registry model. Blood Cells Mol Dis. 2018;68:232–8. https://doi.org/10.1016/j.bcmd.2017.01.013.
Ng D, Hooper A, Bellgard M, Burnett J. The role of patient registries for rare genetic lipid disorders. Curr Opin Lipidol. 2018;29(2):156–62. https://doi.org/10.1097/MOL.0000000000000485.
Schultz A, Marsh JA, Saville BR, et al. Trial refresh: a case for an adaptive platform trial for pulmonary exacerbations of cystic fibrosis. Front Pharmacol. 2019;28(10):301. https://doi.org/10.3389/fphar.2019.00301. PMID: 30983998; PMCID: PMC6447696.
Napier KR, Hooper AJ, Ng DM, et al. Design, development and deployment of a web-based patient registry for rare genetic lipid disorders. Pathology. 2020;52(4):447–52. https://doi.org/10.1016/j.pathol.2020.02.002.
Ramsay J, Marsh J, Pedrana A, et al. A platform in the use of medicines to treat chronic hepatitis C (PLATINUM C): protocol for a prospective treatment registry of real-world outcomes for hepatitis C. BMC Infect Dis. 2020;20:802. https://doi.org/10.1186/s12879-020-05531-4.
Graham C, Molster C, Baynam G, et al. Current trends in biobanking for rare diseases: a review. J Biorepos Sci Appl Med. 2014;2:49–61. https://doi.org/10.2147/BSAM.S46707.
Vucic S, Wray N, Henders A, et al. MiNDAUS partnership: a roadmap for the cure and management of motor Neurone disease. Amyotroph Lateral Scler Frontotemporal Degener. 2022;23(5–6):321–8. https://doi.org/10.1080/21678421.2021.1980889.
Bellgard MI, Snelling T, McGree JM. RD-RAP: beyond rare disease patient registries, devising a comprehensive data and analytic framework. Orphanet J Rare Dis. 2019. https://doi.org/10.1186/s13023-019-1139-9.
Leader G, Whelan S, Chonaill NN, et al. Association between early and current gastro-intestinal symptoms and co-morbidities in children and adolescents with Angelman syndrome. J Intellect Disabil Res. 2022;66(11):865–79. https://doi.org/10.1111/jir.12975.
Leader G, Gilligan R, Whelan S, et al. Relationships between challenging behavior and gastrointestinal symptoms, sleep problems, and internalizing and externalizing symptoms in children and adolescents with Angelman syndrome. Res Dev Disabil. 2022;128(104293):1–13. https://doi.org/10.1016/j.ridd.2022.104293.
Roche L, Tones M, Cross M, et al. An overview of the adaptive behaviour profile in young children with Angelman Syndrome: insights from the global Angelman Syndrome Registry. Adv Neurodev Disord. 2022;6:442–55. https://doi.org/10.1007/s41252-022-00278-2.
Roche L, Tones M, Williams MG, et al. Caregivers report on the pathway to a formal diagnosis of angelman syndrome: a comparison across genetic etiologies within the Global Angelman Syndrome Registry. Adv Neurodev Disord. 2021;5:193–203. https://doi.org/10.1007/s41252-021-00195-w.
Crowdin Enterprise (n.d.). https://crowdin.com/enterprise. Accessed 20 February 2023
We would like to thank the parents and children for the time they took to complete the registry. The authors would like to acknowledge the anonymous proofreaders who generously gave their time and expertise to translate and review the registry content.
Ethics approval and consent to participate
Ethical approval for the Global Angelman Syndrome Registry including the translation process was provided by the Mater Misericordiae Ltd Human Research Ethics Committee (Approval Number EC00332).
Consent for publication
Competing interests
The authors declare that they have no competing interests.
Cite this article
Tones, M., Zeps, N., Wyborn, Y. et al. Does the registry speak your language? A case study of the Global Angelman Syndrome Registry. Orphanet J Rare Dis 18, 330 (2023). https://doi.org/10.1186/s13023-023-02904-1