Table 4 Description and evaluation of the 6 sets of patients

From: Next generation phenotyping using narrative reports in a rare disease clinical data warehouse

| Sets | RETT | DOCK8 deficiency | LOWE | SILVER RUSSELL | BARDET BIEDL | APDS 1 and 2 |
|---|---|---|---|---|---|---|
| Median age at visit (years) | 8.2 [4.8–12.6] | 11.4 [9.3–14.1] | 12.8 [5.8–20.3] | 2.4 [0.8–5.4] | 15.7 [10.1–41.5] | 12.8 [7.7–18.6] |
| Median follow up (years) | 2.6 [0–4.9] | 3.1 [0.3–9] | 6.6 [3–10.3] | 2 [0.8–4.7] | 2 [0.1–6.6] | 7.5 [4.8–8.6] |
| # Patients | 209 | 15 | 23 | 50 | 53 | 23 |
| # Documents | 5034 | 3296 | 1325 | 1133 | 1317 | 2337 |
| Phenotypes extracted, not negated and in patient context | | | | | | |
| # Phenotypes | 18,538 | 6886 | 5281 | 6563 | 6345 | 9716 |
| # distinct Phenotypes | 1022 | 706 | 577 | 738 | 801 | 710 |
| Evaluation by experts in the Top50 phenotypes | | | | | | |
| Medical Experts | NBB | CP | RS | JA | RS | NM |
| # Phenotypes ranked by Freq | 31 | 36 | 36 | 16 | 17 | 39 |
| # Phenotypes ranked by TF-IDF | 38 | 37 | 41 | 11 | 12 | 37 |
| # Phenotypes Freq union TF-IDF | 42 | 52 | 50 | 16 | 19 | 52 |
| # Phenotypes Freq intersect TF-IDF | 28 | 22 | 28 | 11 | 11 | 25 |
| Average Precision, ranked by Freq | 0.86 | 0.91 | 0.88 | 0.55 | 0.66 | 0.83 |
| Average Precision, ranked by TF-IDF | 0.91 | 0.84 | 0.90 | 0.49 | 0.52 | 0.83 |
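
The table reports Average Precision for each ranking but does not say how it was computed. As a rough illustration only, the sketch below uses a common information-retrieval definition: the mean of precision-at-rank taken at every position where the expert judged the phenotype relevant, normalized by the number of relevant items found in the list. That definition, the function name `average_precision`, and the example judgments are all assumptions made for illustration. They are not taken from the paper.

```python
from typing import Sequence


def average_precision(relevant: Sequence[bool]) -> float:
    """Average precision of a ranked list.

    relevant[i] is True when the item at rank i + 1 was judged relevant.
    Assumption: normalization is by the number of relevant items in the
    list itself, not by the total number of relevant items overall.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevant, start=1):
        if is_relevant:
            hits += 1
            # Precision at this rank = relevant items so far / rank.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Hypothetical expert judgments for a top-5 ranked list (not real data).
judgments = [True, True, False, True, False]
ap = average_precision(judgments)
```

Under this definition, a ranking that places relevant phenotypes near the top scores higher than one with the same number of relevant phenotypes placed lower, which is why two rankings with similar counts of validated phenotypes can still differ in Average Precision.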