- Open Access
Genetic syndromes screening by facial recognition technology: VGG-16 screening model construction and evaluation
Orphanet Journal of Rare Diseases volume 16, Article number: 344 (2021)
Many genetic syndromes (GSs) have distinct facial dysmorphism, and facial gestalts can be used as a diagnostic tool for recognizing a syndrome. Facial recognition technology has advanced in recent years, and the screening of GSs by facial recognition technology has become feasible. This study constructed an automatic facial recognition model for the identification of children with GSs.
A total of 456 frontal facial photos were collected from 228 children with GSs and 228 healthy children at Guangdong Provincial People's Hospital from Jun 2016 to Jan 2021. Only one frontal facial image was selected for each participant. The VGG-16 network (named after the lab that proposed it, the Visual Geometry Group at Oxford University) was initialized with pretrained weights through transfer learning, and a facial recognition model based on the VGG-16 architecture was constructed. The performance of the VGG-16 model was evaluated by five-fold cross-validation, and the model was also compared with five physicians. The VGG-16 model achieved an accuracy of 0.8860 ± 0.0211, specificity of 0.9124 ± 0.0308, recall of 0.8597 ± 0.0190, F1-score of 0.8829 ± 0.0215 and an area under the receiver operating characteristic curve of 0.9443 ± 0.0276 (95% confidence interval: 0.9210–0.9620) for GS screening, which was significantly higher than that achieved by human experts.
This study highlighted the feasibility of facial recognition technology for GSs identification. The VGG-16 recognition model can play a prominent role in GSs screening in clinical practice.
Genetic syndromes (GSs) refer to specific manifestations with multiple clinical features that are caused by genetic abnormalities. Genetic abnormalities can vary from subtle to prominent and from a discrete mutation in a single base on the DNA sequence of a single gene to a gross chromosomal abnormality. Each particular genetic syndrome (GS) presents with characteristic features depending on the developmental aspects affected by the abnormal genes or chromosomes. Although individual cases are rare, GSs collectively affect a significant proportion of the general population, with the majority being children [2, 3]. Children with GSs often suffer repeated admissions, long-term care, and impaired quality of life, which may lead to heavy social and family burdens.
Timely diagnosis of GSs is crucial for genetic counselling and can improve outcomes. With the development of next-generation sequencing, GS research is becoming extensive, and gene examination is considered the “gold-standard” method for GS diagnosis. However, gene testing is expensive and time-consuming, and in clinical practice it is unrealistic to perform gene examination for all patients. Therefore, the main question has become “how can we screen suspected GS patients for further investigation?”
Many GSs have distinct facial dysmorphism, and the recognition of a syndrome from a facial gestalt can be the first step in making a diagnosis. However, due to the variation and complexity in phenotyping, combined with the inexperience of general practitioners, memorizing different facial gestalts and recognizing rare GSs is a challenging task. Facial recognition technology has been widely applied in several fields, and artificial intelligence has been integrated into routine clinical practice specifically for diagnostic support. With recent advancements in deep convolutional neural networks (CNNs), screening and diagnosis of GSs through facial feature recognition has become possible. In the present study, we developed a facial recognition model based on the VGG-16 architecture (named after the lab that proposed it, the Visual Geometry Group at Oxford University) for distinguishing children with GSs from healthy children, and we evaluated the performance of the model.
Materials and methods
Patients and facial photos
A total of 228 children with GSs and 228 healthy children were recruited from Guangdong Provincial People's Hospital from Jun 2016 to Jan 2021. The demographic characteristics of the participants are shown in Table 1.
The GS diagnosis was confirmed by karyotyping, array comparative genomic hybridization or next-generation sequencing. The following syndromes were included: Williams-Beuren syndrome (n = 108), Noonan syndrome (n = 52), Down syndrome (n = 14), Marfan syndrome (n = 5), Loeys-Dietz syndrome (n = 4), Alagille syndrome (n = 4), DiGeorge syndrome (n = 3), Acromicric and Geleophysic dysplasia (n = 3), Kabuki syndrome (n = 2), Barth syndrome (n = 2), Cornelia de Lange syndrome (n = 2), Koolen-de Vries syndrome (n = 2), 14q32 duplication syndrome (n = 2), Congenital mental retardation, AD (n = 2), 8p23.1 deletion syndrome (n = 2), 21q22.3 deletion syndrome (n = 2), Helsmoortel-Van der Aa syndrome (n = 1), Mulibrey nanism (n = 1), Congenital fibrosis of extraocular muscles (n = 1), Mandibulofacial dysostosis-microcephaly syndrome (n = 1), Cerebro-oculo-facio-skeletal syndrome (n = 1), Oculo-facio-cardio-dental syndrome (n = 1), Wolf-Hirschhorn syndrome (n = 1), Costello syndrome (n = 1), Cri du Chat syndrome (n = 1), Stickler syndrome (n = 1), Coffin-Siris syndrome (n = 1), Klippel-Feil syndrome (n = 1), Congenital contractural arachnodactyly (n = 1), 16p11.2 duplication syndrome (n = 1), Holt-Oram syndrome (n = 1), X-linked oto-palato-digital spectrum disorders (n = 1), 16p11.2 microdeletion syndrome (n = 1), Brittle cornea syndrome (n = 1), and 18q microdeletion syndrome (n = 1). In total, there were 35 different genetic syndromes.
Three to ten frontal facial photos were taken of each participant, depicting the entire frontal face from hairline to chin, with the ears exposed and the eyes open and looking straight ahead. Only one clear frontal facial photo was selected for each participant (photos with an obviously open mouth were avoided as far as possible). A total of 456 frontal facial photos were collected from 228 children with GSs and 228 healthy children. Facial images of children with GSs are presented in Fig. 1.
This study was approved by the Research Ethics Committee of Guangdong Provincial People's Hospital (Project Number: KY2020-033-01). Informed consent for the analysis was given by all patients or their guardians.
The hardware used for the study was an NVIDIA Tesla P100 GPU (NVIDIA Corporation, California, USA) with 16 GB of memory and a 4096-bit memory interface. The operating system was Ubuntu 18.04 (Canonical Ltd, UK). Networks were implemented in TensorFlow (Google Inc, California, USA).
The study process can be summarized as follows: (1) The VGG-16 network was initialized by transfer learning with weights pretrained on VGG-Face CNN descriptors. (2) Face detection was performed on each photograph by a multitask convolutional neural network (MTCNN), which yielded five facial landmarks per photograph. (3) By randomly rotating, cropping or horizontally flipping the detected face, a group of facial images of size 224 × 224 × 3 (RGB) was obtained as the data inputs. (4) A facial recognition model based on the VGG-16 architecture was constructed, and its performance was evaluated by five-fold cross-validation. (5) Gradient-weighted class activation mapping (Grad-CAM) was used to highlight the key regions of the facial images that the model relied on for recognition. (6) The performance of the VGG-16 model was compared with that of five physicians.
MTCNN was used for face detection and alignment. The MTCNN comprised an image pyramid and a three-stage cascaded framework (proposal network, refine network and output network). For each input photo, it generated a facial image (224 × 224 × 3 pixels) with five facial landmark positions (left eye, right eye, nose, left mouth corner, and right mouth corner). Pixel values were scaled and normalized to the range 0 to 1. The dataset was augmented by random rotation, cropping and horizontal flipping.
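The scaling and augmentation steps can be illustrated with a minimal sketch in plain Python. It is not the authors' pipeline: the crop size, the 50% flip probability and all function names are assumptions made for illustration, images are represented as nested lists of 8-bit RGB values, and random rotation is omitted for brevity.

```python
import random

def scale_pixels(image):
    """Scale 8-bit channel values (0-255) to floats in the range 0 to 1."""
    return [[[c / 255.0 for c in px] for px in row] for row in image]

def horizontal_flip(image):
    """Mirror the image left to right by reversing each row of pixels."""
    return [row[::-1] for row in image]

def random_crop(image, size, rng):
    """Cut a size x size window at a random position inside the image."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - size)    # randint is inclusive at both ends
    left = rng.randint(0, width - size)
    return [row[left:left + size] for row in image[top:top + size]]

def augment(image, size, rng):
    """Random crop, optional flip (assumed probability 0.5), then scaling."""
    out = random_crop(image, size, rng)
    if rng.random() < 0.5:
        out = horizontal_flip(out)
    return scale_pixels(out)

# Hypothetical 6 x 6 RGB image, cropped to 4 x 4 for demonstration only.
demo_image = [[[(r * 6 + c) % 256] * 3 for c in range(6)] for r in range(6)]
demo_out = augment(demo_image, 4, random.Random(0))
```

In the study, the network input was 224 × 224 × 3, so a real pipeline would crop or resize to that size rather than to the toy 4 × 4 used here.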
We used VGG-16 as our network architecture, and we started transfer learning by initializing the network with pretrained weights from VGG-Face, an open-source face data model supplied by the Oxford Visual Geometry Group (UK). The primary algorithm included softmax for classification training, a triplet loss function for feature extraction training, and the RMSProp optimization method for parameter update.
Model construction and training
A facial recognition model based on the VGG-16 architecture was constructed. The standard VGG-16 architecture comprises 13 convolutional layers interleaved with maximum pooling layers, followed by three fully connected layers and a softmax output. We replaced the fully connected layers with convolutional layers with 50% dropout. This modification enhanced the generalization ability while reducing the computational cost and training time. Each convolutional layer convolved its input and was followed by batch normalization and a rectified linear unit (ReLU) activation function. A maximum pooling layer was placed between each two groups of convolutional layers to downsample the output data, which reduced the computational complexity and helped avoid overfitting. After the convolutional layers, the data were passed to the softmax output, which predicted the probability that the input image showed a GS-specific face (Fig. 2).
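To make the layer structure concrete, the sketch below traces the feature-map shape through the convolutional trunk of a standard VGG-16 and shows a numerically stable softmax. It assumes the usual VGG-16 configuration (3 × 3 convolutions that preserve spatial size and 2 × 2 max pooling that halves it). The convolutional head that replaced the fully connected layers in this study is not specified in the text, so it is not modelled here.

```python
import math

# Standard VGG-16 trunk: numbers are output channels of 3x3 convolutions,
# "M" marks a 2x2 max pooling layer with stride 2.
VGG16_CONFIG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
                512, 512, 512, "M", 512, 512, 512, "M"]

def trace_shapes(height=224, width=224, channels=3):
    """Return the (height, width, channels) shape after every layer."""
    shapes = []
    for item in VGG16_CONFIG:
        if item == "M":
            height, width = height // 2, width // 2  # pooling halves H and W
        else:
            channels = item  # padded 3x3 convolution keeps H and W unchanged
        shapes.append((height, width, channels))
    return shapes

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

shapes = trace_shapes()
n_conv = sum(1 for item in VGG16_CONFIG if item != "M")
probs = softmax([2.0, 0.5])  # hypothetical scores for (GS, non-GS)
```

With a 224 × 224 × 3 input, the five pooling stages reduce the spatial size to 7 × 7 with 512 channels, which is the tensor a classification head would receive.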
In the experiment, five-fold cross-validation was adopted. The proportion of the training set, validation set, and test set was 3:1:1. Both the GS and non-GS facial image data were randomly split into five subsets. The GS and non-GS data were distributed equally in each subset.
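One way to realize a class-balanced five-fold split with a 3:1:1 ratio is sketched below. The text does not say how the validation subset was chosen in each round; here it is assumed to be the subset following the test subset, and the function names and seed are hypothetical.

```python
import random

def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k subsets with classes spread evenly."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)  # deal out round-robin
    return folds

def rotate_splits(folds):
    """Yield (train, validation, test) index lists in a 3:1:1 ratio."""
    k = len(folds)
    for r in range(k):
        val_fold = (r + 1) % k  # assumption: next subset is the validation set
        train = [i for j in range(k) if j not in (r, val_fold)
                 for i in folds[j]]
        yield train, folds[val_fold], folds[r]

# 228 GS photos (label 1) and 228 non-GS photos (label 0), as in the study.
labels = [1] * 228 + [0] * 228
folds = stratified_folds(labels)
splits = list(rotate_splits(folds))
```

Because 228 is not divisible by five, the subsets hold 45 or 46 images per class, so the 3:1:1 ratio is approximate.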
To understand the features learned by the VGG-16 model and their locations, we used Grad-CAM to highlight the key regions in the facial images that influenced the model's decisions (Fig. 3). Grad-CAM was proposed by Selvaraju et al., and its code is available at https://github.com/ramprs/grad-cam/.
Comparison of the model with paediatricians
Three junior paediatricians (those with 3–5 years of experience) and two senior paediatricians (those with more than 15 years of experience) were invited to recognize GS patients based solely on facial photos. One senior paediatrician had received genetics training. The other paediatricians had no genetics training, but all of them had managed children with genetic syndromes in daily clinical practice. Each facial image was shown for 10 s without any other clinical data. Based on the photo from the dataset, the physicians determined whether an individual had a GS. The classification performance of the VGG-16 model was compared with that of these five paediatricians.
Identification results were noted as TP (true positive), FP (false positive), TN (true negative), and FN (false negative). The classification performance of the proposed VGG-16 model was quantified by accuracy, recall, specificity, precision, F1-score, and area under the receiver operating characteristic curve (AUC). The identification performance of the paediatricians was quantified by accuracy, sensitivity (the same as recall), specificity, precision, and F1-score. These measures were calculated as follows:
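The equations themselves are not reproduced in this version of the text. The standard definitions in terms of TP, FP, TN and FN, which are presumably the ones intended, are:

```latex
\begin{align}
\text{Accuracy}    &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{Recall (sensitivity)} &= \frac{TP}{TP + FN} \\
\text{Specificity} &= \frac{TN}{TN + FP} \\
\text{Precision}   &= \frac{TP}{TP + FP} \\
\text{F1-score}    &= \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}
\end{align}
```

With equal numbers of GS and non-GS images, accuracy equals the mean of recall and specificity, which is consistent with the reported values: (0.8597 + 0.9124) / 2 ≈ 0.8860.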
Model performance measurements were reported as the mean ± standard deviation of the five test results obtained from cross-validation. The receiver operating characteristic (ROC) curve of the VGG-16 model, with its 95% confidence interval (CI), was calculated and plotted using the pROC package 18.104.22.168 in R 3.6.1 with 200 bootstrap iterations. To compare the classification performance of the VGG-16 model with that of the physicians, the sensitivity/specificity point of each physician was plotted in the ROC space of the VGG-16 model. When a physician's sensitivity/specificity point lay outside the 95% CI region of the model's ROC curve, the difference in classification performance between the model and that physician was considered statistically significant. Pearson’s chi-squared test was applied to compare the gender proportions, and an independent-sample t-test was used to compare the age at photograph between the groups. P-values < 0.05 were considered statistically significant.
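The bootstrap idea can be sketched in plain Python for the AUC. This is not the pROC implementation used in the study: it is a simple percentile bootstrap that resamples GS and non-GS scores separately, and the function names, seed and example scores are assumptions for illustration.

```python
import random

def auc(scores_pos, scores_neg):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as one half (the Mann-Whitney formulation of the AUC).
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

def bootstrap_auc_ci(scores_pos, scores_neg, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC, resampling within each class."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        boot_pos = rng.choices(scores_pos, k=len(scores_pos))
        boot_neg = rng.choices(scores_neg, k=len(scores_neg))
        stats.append(auc(boot_pos, boot_neg))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper

# Hypothetical predicted GS probabilities, for demonstration only.
gs_scores = [0.91, 0.85, 0.78, 0.66, 0.95, 0.40, 0.88, 0.73]
healthy_scores = [0.10, 0.35, 0.22, 0.55, 0.05, 0.47, 0.30, 0.70]
point_estimate = auc(gs_scores, healthy_scores)
ci_low, ci_high = bootstrap_auc_ci(gs_scores, healthy_scores)
```

A physician's single sensitivity/specificity point has no AUC of its own, which is why the study compared it against the confidence region of the model's ROC curve instead.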
Visual explanations by feature maps
Weighted feature maps were computed by the ReLU activation function, reserving the class features and abandoning the unrelated features; then, the values were normalized into the range 0–255. From the colour band, the size of the values corresponded to the colour brightness. In most cases, the expression was brighter for higher values, and it represented relatively more significant regions on the face (Fig. 3). Class activation maps matched the dysmorphic facial features well in 217 GS images. In the other 11 GS photos, the class-discriminative regions were focused not only on the facial regions, but also on the hair or clothes.
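The weighting, ReLU and normalization steps can be sketched as follows. The sketch follows the general Grad-CAM recipe (channel weights from spatially averaged gradients, a weighted sum of feature maps, then ReLU). Scaling by the peak value to reach the 0–255 range is an assumption, since the exact normalization is not stated in the text, and the function name and toy inputs are hypothetical.

```python
def grad_cam_heatmap(feature_maps, gradients):
    """Build a 0-255 heat map from K feature maps and their gradients.

    feature_maps, gradients: lists of K maps, each an H x W list of lists.
    """
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    # Channel weight = spatial average of the class-score gradient.
    weights = [sum(sum(row) for row in g) / (height * width)
               for g in gradients]
    # Weighted sum over channels, then ReLU keeps only positive evidence.
    cam = [[max(0.0, sum(w * fm[i][j] for w, fm in zip(weights, feature_maps)))
            for j in range(width)] for i in range(height)]
    peak = max(max(row) for row in cam)
    if peak == 0:
        return [[0] * width for _ in range(height)]
    # Assumed normalization: scale so the strongest region maps to 255.
    return [[round(255 * value / peak) for value in row] for row in cam]

# Toy example with two 2 x 2 feature maps.
toy_maps = [[[1, 2], [3, 4]], [[4, 3], [2, 1]]]
toy_grads = [[[1, 1], [1, 1]], [[-1, -1], [-1, -1]]]
toy_heatmap = grad_cam_heatmap(toy_maps, toy_grads)
```

In practice the heat map is upsampled to the input resolution and overlaid on the photograph, so that brighter regions mark the areas that contributed most to the GS prediction.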
Comparison with human experts
The performance results of the five paediatricians are shown in Table 2. The senior paediatrician who had received genetics training achieved the best accuracy (0.7983) and sensitivity (0.8772). The sensitivity/specificity point of each physician was outside the 95% CI region of the ROC curve of the VGG-16 model, indicating that the identification performance of each participating paediatrician was inferior to that of the VGG-16 model (Fig. 4).
GSs often present with characteristic phenotypes that include dysmorphic features and characteristic facial gestalts. These craniofacial alterations can provide clinicians with important diagnostic clues. For instance, Down syndrome has a disease-specific facial profile that can be recognized easily. There are approximately 7000 genetic syndromes, the vast majority of which are rare diseases, and the characteristic craniofacial features are often unfamiliar to general physicians and paediatricians. However, with technical advancement in computing, GS facial recognition is becoming easily available. Loos et al. first reported that GSs can be identified by using facial resemblance and a traditional machine learning method, with an accuracy of 83%. With improvements in data storage and computational power, deep CNNs have become the most important facial recognition method.
In 2014, Face2Gene (http://www.face2gene.com/, FDNA Inc., Boston, USA), based on the DeepGestalt framework (a deep CNN algorithm), was introduced for GS facial recognition. When a facial photo is uploaded, Face2Gene produces a ranked list of 30 possible GSs. The performance of Face2Gene is evaluated using “top-10 accuracy”, which is the likelihood that one of the 10 syndromes with the highest probabilities suggested by Face2Gene is the actual syndrome. Studies showed that Face2Gene can help discriminate between different types of GSs [12, 13]. Hsieh et al. introduced an approach that used portrait photographs for the interpretation of clinical exome data. This study indicated that image analysis by DeepGestalt could quantify phenotypic similarity to improve the performance of bioinformatics pipelines for exome analysis. However, Face2Gene treats every uploaded facial image as “abnormal” by default. Even if a photograph of a healthy child is input into Face2Gene, a list of 30 candidate GSs is produced, implying that the software lacks a screening function. Hence, developing a facial recognition model for screening GSs is necessary. In 2020, Pantel et al. analysed a total of 646 images of 323 patients with 17 different genetic syndromes and matched individuals without a genetic syndrome. A face recognition model driven by a support vector machine running on top of the DeepGestalt framework was introduced in that study. This approach could separate images of individuals with and without a genetic syndrome fairly well.
VGG-net, proposed by the Visual Geometry Group (VGG) Lab of Oxford University, is a popular CNN architecture. VGG-16 is characterized by its simplicity in using only 3 × 3 convolutional layers stacked on top of each other in increasing depth. The increased depth and smaller kernels can reduce the number of network parameters, thus promoting the fitting capacity and wide clinical application. This network has been widely applied in computer vision. Recently, medical applications of VGG-16 have been reported, covering the identification of tumour properties, disease staging on medical image data, retinal fundus image interpretation, etc. [16,17,18,19,20]. We constructed a facial recognition model using the VGG-16 architecture for GS screening. The model proposed in this study achieved high performance, with an accuracy of 0.8860 ± 0.0211 and an AUC of 0.9443 ± 0.0276. The proposed VGG-16 screening model discriminated children with GSs from children without GSs with excellent performance and outperformed all the participating paediatricians with statistical significance. Visual explanations via Grad-CAM can provide insights into dysmorphic facial characteristics. However, the limited dataset and indiscernible image details may affect the localization ability of Grad-CAM.
The quality of a CNN model depends on the size of the dataset. Because of the low incidence of GSs, the number of dysmorphic facial photographs is often limited, which puts a deep CNN model at risk of overfitting. Transfer learning can help address this problem. Transfer learning is the reuse of a pretrained model on a new problem. It enables researchers to benefit from the knowledge gained from a previously used model for a similar task, analogous to humans’ capacity to use previously acquired knowledge to solve a similar problem. The transfer learning technique has often been used in studies with small samples. Zhen et al. reported research on predicting rectum toxicity in patients receiving radiotherapy for cervical cancer, in which transfer learning from a large set of natural images addressed the problem of limited data. In the current study, the VGG-16 network started from weights pretrained on the large-scale face dataset “VGG-Face”, which capture low-level visual features learned from the general population. The model parameters were then fine-tuned on our facial image dataset to learn the high-level visual features of GS facial manifestations.
In this study, we gathered 228 cases with 35 different GSs. There are many typical but rare dysmorphic facial images in this facial photograph dataset. These craniofacial alterations can provide clinicians with important diagnostic clues, and an automatic facial recognition model for GS screening can be constructed using these facial images. However, there were several limitations in the study. (1) A good diagnostic model is often based on a sufficiently large and general dataset. As most GSs are rare diseases, the facial photos training set in this study was limited, and it will be beneficial to collect more GS cases. (2) All participants were from East Asia. There were no Caucasian, African, or other ethnic cases enrolled in the study. Facial dysmorphic features may be influenced by ethnic backgrounds. (3) Enrolled children were mainly composed of toddlers or preschool children. Therefore, the proposed model in the current study may not be appropriate for infants, neonates, or adults.
This study highlighted the feasibility of facial recognition technology for GSs identification. The VGG-16 recognition model can play a prominent role in GSs screening in clinical practice.
Availability of data and materials
The datasets supporting the conclusions in this article are primarily included within the article and available from the corresponding authors upon reasonable request.
CNNs: Convolutional neural networks
Grad-CAM: Gradient-weighted class activation mapping
ReLU: Rectified linear unit
ROC: Receiver operating characteristic
AUC: Area under the ROC curve
VGG: Visual Geometry Group
Jackson M, Marks L, May GHW, Wilson JB. The genetic basis of disease. Essays Biochem. 2018;62:643–723. https://doi.org/10.1042/EBC20170053.
Solomon BD, Muenke M. When to suspect a genetic syndrome. Am Fam Phys. 2012;86(9):826–33.
Wright CF, FitzPatrick DR, Firth HV. Paediatric genomics: diagnosing rare disease in children. Nat Rev Genet. 2018;19(5):253–68. https://doi.org/10.1038/nrg.2017.116.
Ferreira CR. The burden of rare diseases. Am J Med Genet A. 2019;179(6):885–92. https://doi.org/10.1002/ajmg.a.61124.
Boycott KM, Vanstone MR, Bulman DE, MacKenzie AE. Rare-disease genetics in the era of next-generation sequencing: discovery to translation. Nat Rev Genet. 2013;14:681–91. https://doi.org/10.1038/nrg3555.
Roosenboom J, Hens G, Mattern BC, Shriver MD, Claes P. Exploring the underlying genetics of craniofacial morphology through various sources of knowledge. Biomed Res Int. 2016;2016:3054578. https://doi.org/10.1155/2016/3054578.
Hurst ACE. Facial recognition software in clinical dysmorphology. Curr Opin Paediatr. 2018;30:701–6. https://doi.org/10.1097/MOP.0000000000000677.
Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D. Grad-CAM: visual explanations from deep networks via gradient-based localization. Int J Comput Vis. 2020;128:336–59. https://doi.org/10.1007/s11263-019-01228-7.
Robin X, Turck N, Hainard A, Tiberti N, Lisacek F, Sanchez J, et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinform. 2011;12:77. https://doi.org/10.1186/1471-2105-12-77.
Loos HS, Wieczorek D, Würtz RP, von der Malsburg C, Horsthemke B. Computer-based recognition of dysmorphic faces. Eur J Hum Genet. 2003;11:555–60. https://doi.org/10.1038/sj.ejhg.5200997.
Gurovich Y, Hanani Y, Bar O, Nadav G, Fleischer N, Gelbman D, et al. Identifying facial phenotypes of genetic disorders using deep learning. Nat Med. 2019;25:60–4. https://doi.org/10.1038/s41591-018-0279-0.
Mishima H, Suzuki H, Doi M, Miyazaki M, Watanabe S, Matsumoto T, et al. Evaluation of Face2Gene using facial images of patients with congenital dysmorphic syndromes recruited in Japan. J Hum Genet. 2019;64:789–94. https://doi.org/10.1038/s10038-019-0619-z.
Liehr T, Acquarola N, Pyle K, St-Pierre S, Rinholm M, Bar O, et al. Next generation phenotyping in Emanuel and Pallister-Killian syndrome using computer-aided facial dysmorphology analysis of 2D photos. Clin Genet. 2018;93:378–81. https://doi.org/10.1111/cge.13087.
Hsieh T, Mensah MA, Pantel JT, Aguilar D, Bar O, Bayat A, et al. PEDIA: prioritization of exome data by image analysis. Genet Med. 2019;21(12):2807–14. https://doi.org/10.1038/s41436-019-0566-2.
Pantel JT, Hajjir N, Danyel M, Elsner J, Abad-Perez AT, Hansen P, et al. Efficiency of computer-aided facial phenotyping (DeepGestalt) in individuals with and without a genetic syndrome: diagnostic accuracy study. J Med Internet Res. 2020;22(10): e19263. https://doi.org/10.2196/19263.
Khan HA, Jue W, Mushtaq M, Mushtaq MU. Brain tumor classification in MRI image using convolutional neural network. Math Biosci Eng. 2020;17:6203–16. https://doi.org/10.3934/mbe.2020328.
Lin H, Wei C, Wang G, Chen H, Lin L, Ni M, et al. Automated classification of hepatocellular carcinoma differentiation using multiphoton microscopy and deep learning. J Biophotonics. 2019;12: e201800435. https://doi.org/10.1002/jbio.201800435.
Talo M, Yildirim O, Baloglu UB, Aydin G, Acharya UR. Convolutional neural networks for multi-class brain disease detection using MRI images. Comput Med Imaging Graph. 2019;78:101673. https://doi.org/10.1016/j.compmedimag.2019.101673.
Wachinger C, Reuter M, Klein T. DeepNAT: deep convolutional neural network for segmenting neuroanatomy. Neuroimage. 2018;170:434–45. https://doi.org/10.1016/j.neuroimage.2017.02.035.
Lopez AR, Giró-i-Nieto X, Burdick J, Marques O. Skin lesion classification from dermoscopic images using deep learning techniques. In: 2017 13th IASTED International Conference on Biomedical Engineering (BioMed). 2017. p. 49–54. https://doi.org/10.2316/P.2017.852-053.
Singh A, Kisku DR. Detection of rare genetic diseases using facial 2D images with transfer learning. In: 2018 8th International Symposium on Embedded Computing and System Design (ISED). 2018. p. 26–30. https://doi.org/10.1109/ISED.2018.8703997.
Zhen X, Chen J, Zhong Z, Hrycushko B, Zhou L, Jiang S, et al. Deep convolutional neural network with transfer learning for rectum toxicity prediction in cervical cancer radiotherapy: a feasibility study. Phys Med Biol. 2017;62:8246–63. https://doi.org/10.1088/1361-6560/aa8d09.
We thank Yan-Qiu Ou, MD (Guangdong Cardiovascular Institute), Xiao-Wei Xu, PhD (Guangdong Provincial Key Laboratory of South China Structural Heart Disease), and Lin Liu, MD (Shenzhen Children’s Hospital) for their contributions.
National Natural Science Foundation of China (Grant No. 82070321) and Sanming Project of Medicine in Shenzhen (CN).
Ethics approval and consent to participate
This study was approved by the Research Ethics Committee of Guangdong Provincial People's Hospital (Project Number: KY2020-033-01). Written informed consent was obtained from each patient or caregiver (age dependent) during the routine clinical appointment.
Consent for publication
Consent for publication of all case presentations in this article was obtained from the legal guardians.
The authors declare that they have no competing interests.
Cite this article
Hong, D., Zheng, YY., Xin, Y. et al. Genetic syndromes screening by facial recognition technology: VGG-16 screening model construction and evaluation. Orphanet J Rare Dis 16, 344 (2021). https://doi.org/10.1186/s13023-021-01979-y
- Facial recognition
- Genetic syndrome
- Artificial intelligence
- Deep learning
- Rare diseases