Performance and clinical utility of a new supervised machine-learning pipeline in detecting rare ciliopathy patients based on deep phenotyping from electronic health records and semantic similarity

Table 3 Performance of each classifier combined with each semantic similarity method in the test set

Similarity	Method	Mean sens.* (95% CI)	Recall @1% (%)	Recall @10% (%)	Precision @1% (%)	Precision @10% (%)	Mean AUROC (95% CI)	Mean AUPRC (95% CI)
Baseline	RidgeReg	82% [76–88]	59	81	25	3.4	93% [91–95]	46% [36–56]
	SVM	81% [75–87]	54	81	23	3.4	91% [89–93]	45% [36–54]
	RF	78% [73–83]	55	78	23	3.3	91% [89–93]	47% [39–55]
	XGBoost	82% [76–88]	61	82	25	3.4	93% [91–95]	42% [33–51]
Lin similarity	RidgeReg	65% [57–73]	23	65	10	2.7	85% [82–88]	6.0% [4.0–8.0]
	SVM	68% [60–76]	32	62	13	2.6	84% [81–87]	15% [9–21]
	RF	76% [68–84]	44	76	18	3.2	90% [87–93]	21% [13–29]
	XGBoost	63% [51–75]	18	62	8	2.6	85% [81–89]	6.0% [3.0–9.0]
Restricted hier. sim	RidgeReg	76% [71–81]	49	76	20	3.2	92% [90–94]	30% [21–39]
	SVM	71% [54–88]	43	72	18	3.0	84% [72–96]	31% [23–39]
	RF	85% [79–91]	59	85	25	3.5	93% [90–96]	43% [35–51]
	XGBoost	86% [80–92]	41	86	17	3.6	96% [94–98]	35% [23–47]
fastText Embd	RidgeReg	81% [76–86]	48	79	20	3.3	90% [86–94]	35% [25–45]
	SVM	81% [76–86]	52	79	22	3.3	91% [88–94]	33% [23–43]
	RF	76% [70–82]	50	77	21	3.2	89% [86–92]	30% [20–40]
	XGBoost	77% [70–84]	38	77	16	3.2	88% [83–93]	19% [13–25]
CODER Embd	RidgeReg	84% [78–90]	57	86	24	3.6	91% [88–94]	35% [25–45]
	SVM	78% [72–84]	53	77	22	3.2	89% [86–92]	40% [30–50]
	RF	74% [68–80]	48	74	20	3.1	88% [85–91]	35% [23–47]
	XGBoost	72% [66–78]	39	72	16	3.0	87% [83–91]	19% [12–26]
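
The "Recall @1%" and "Precision @10%" columns can be read as ranking metrics: patients are sorted by classifier score, and recall and precision are computed within the top 1% or 10% of the ranked list. The sketch below illustrates that reading. It is not the authors' code: the function name `recall_precision_at_fraction`, the rounding rule for the cutoff, and the toy data are all assumptions made for illustration.

```python
import math


def recall_precision_at_fraction(scores, labels, fraction):
    """Return (recall, precision) within the top `fraction` of ranked patients.

    scores   -- classifier scores, higher means more likely a case
    labels   -- 1 for a true case, 0 otherwise
    fraction -- share of the ranked list to inspect, e.g. 0.01 for "@1%"
    """
    n = len(scores)
    # Assumption: the cutoff is rounded up and includes at least one patient.
    k = max(1, math.ceil(fraction * n))
    # Rank patient indices by descending score and keep the top k.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    true_pos = sum(labels[i] for i in ranked[:k])
    total_pos = sum(labels)
    recall = true_pos / total_pos if total_pos else 0.0
    precision = true_pos / k
    return recall, precision


# Toy data (hypothetical): 10 patients, 3 true cases.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]
recall_20, precision_20 = recall_precision_at_fraction(scores, labels, 0.2)
recall_50, precision_50 = recall_precision_at_fraction(scores, labels, 0.5)
```

Under this reading, recall and precision at a given cutoff share the same numerator (true cases in the top k), so their ratio depends only on the number of true cases and the cutoff size. This is why precision stays low in a rare-disease setting even when recall is high.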

ISSN: 1750-1172