Table 3 The 6 Data Quality Dimensions defined by DAMA UK Working Group for data quality assessment

From: An assessment of the quality of the I-DSD and the I-CAH registries - international registries for rare conditions affecting sex development

| Data quality dimension | Definition | Measure in the Registry |
| --- | --- | --- |
| Completeness | The proportion of stored data against the potential of “100% complete”. | Optional variables in Core data; all variables by disorder in all data; optional variables by centre in all data, separately in the I-DSD and I-CAH Registries; optional data in the CAH longitudinal module |
| Uniqueness | No thing will be recorded more than once based upon how that thing is identified. Uniqueness is the inverse of an assessment of the level of duplication. | Percentage of duplicated cases, measuring each data item against itself. A case is presumed duplicated when there is 100% similarity in core data and more than 90% similarity in non-core data between duplicates. |
| Timeliness | The degree to which data represent reality from the required point in time, or how current or up to date the data are at the time of release. | Timeframe between the age at first presentation and the upload date in the Registry |
| Validity | Data are valid if they conform to the syntax (format, type, range) of their definition. | The percentage of data that do not conform to the syntax in the longitudinal module of the I-CAH Registry (blood pressure) |
| Accuracy | The degree to which data correctly describe the “real world” object or event being described. | The accuracy of data in PAIS^a cases in the Registry was verified against original data available in templates completed by centres |
| Consistency | The absence of difference when comparing two or more representations of a thing against a definition. | Consistency between the number of adverse event episodes^b and sick days in the longitudinal module of the I-CAH Registry |
(When the data provider enters a number of adverse events, a table is displayed with one row per adverse event episode, and the number of sick days must be entered in each row. The total number of sick days is calculated automatically. The number of adverse events should not exceed the number of sick days; otherwise, the two variables are inconsistent.)

a PAIS: partial androgen insensitivity syndrome

b Adverse events: the number of separate episodes of illness requiring an extra dose of steroid
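
Three of the measures in the table can be expressed as simple checks. The Python sketch below is illustrative only and is not the registry's actual implementation. The field names, the record layout (one dictionary per case), and the way similarity is computed (the share of fields with identical values, with two missing values counted as a match) are all assumptions made for the example. The table specifies only the thresholds: 100% similarity in core data and more than 90% in non-core data.

```python
from typing import Any, Mapping, Sequence

Record = Mapping[str, Any]  # hypothetical layout: one dict per registry case


def completeness(records: Sequence[Record], fields: Sequence[str]) -> float:
    """Proportion of filled cells out of the potential '100% complete'."""
    total = len(records) * len(fields)
    if total == 0:
        return 0.0
    filled = sum(
        1 for r in records for f in fields if r.get(f) not in (None, "")
    )
    return filled / total


def similarity(a: Record, b: Record, fields: Sequence[str]) -> float:
    """Assumed metric: share of fields whose values are identical in a and b."""
    if not fields:
        return 0.0
    return sum(1 for f in fields if a.get(f) == b.get(f)) / len(fields)


def is_presumed_duplicate(
    a: Record, b: Record, core: Sequence[str], non_core: Sequence[str]
) -> bool:
    """Thresholds from the table: 100% core and more than 90% non-core."""
    return similarity(a, b, core) == 1.0 and similarity(a, b, non_core) > 0.9


def is_consistent(adverse_events: int, sick_days_per_episode: Sequence[int]) -> bool:
    """One row per episode, and episodes must not exceed total sick days."""
    total_sick_days = sum(sick_days_per_episode)
    return (
        len(sick_days_per_episode) == adverse_events
        and adverse_events <= total_sick_days
    )


# Hypothetical example data; none of these values come from the registries.
cases = [
    {"dob": "2001-05", "sex": "F", "karyotype": "46,XX", "centre": "A"},
    {"dob": "2001-05", "sex": "F", "karyotype": None, "centre": "A"},
]
print(completeness(cases, ["karyotype", "centre"]))  # 3 of 4 cells filled: 0.75
print(is_presumed_duplicate(cases[0], cases[1], ["dob", "sex"], ["karyotype", "centre"]))
print(is_consistent(3, [1, 0, 1]))  # 3 episodes but only 2 sick days: False
```

With only two non-core fields, a single mismatch drops the similarity to 50%, so the two example cases are not presumed duplicates. The "more than 90%" threshold can only be met with a mismatch when there are more than ten non-core fields.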