Machine learning: A view from new angles

Ophthalmology Times Europe, November 2023
Volume 19
Issue 09
Pages: 16 - 17

More testing is needed to ensure accuracy across different groups

Investigators in a recent study found that reproducibility across various data sets was poor in different ethnic groups for detecting glaucoma with machine learning. Image credit: ©Who is Danny –

Machine learning (ML) seems to be the wave of the future in medicine and, when perfected, it will be a most valuable diagnostic asset. However, the technology is still in its infancy: kinks remain to be worked out and its diagnostic capabilities refined.

Investigators in a recent study1 found that ML still falls short: the reproducibility of glaucoma detection across data sets was poor in different ethnic groups, according to senior author Damon Wong, PhD. Dr Wong is from the Singapore Eye Research Institute (SERI), Singapore National Eye Centre; SERI-Nanyang Technological University Advanced Ocular Engineering; and the School of Chemical and Biomedical Engineering, Nanyang Technological University, all in Singapore; and the Institute of Molecular and Clinical Ophthalmology, Basel, Switzerland.

This result contrasts with those of other studies2-6 that used ML approaches to detect glaucoma. Although most of those studies reported high diagnostic accuracies (area under the receiver operating characteristic curve [AUC] = 0.88-0.98) for glaucoma detection, they did not assess the models with independently sampled data from a different ethnic group (external test), which limits the generalizability of the models across ethnicities,7 Dr Wong and colleagues explained.
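As a rough illustration of why an external test matters, the sketch below computes AUC as the probability that a randomly chosen glaucoma eye receives a higher classifier score than a randomly chosen control eye, then compares an internal and an external test set. All scores are hypothetical values made up for illustration and are not data from the study; the function name `auc` is likewise an assumption.

```python
def auc(pos_scores, neg_scores):
    """AUC as the fraction of (glaucoma, control) pairs ranked correctly.

    A tie between a glaucoma score and a control score counts as half.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier scores (higher = more likely glaucoma); not study data.
internal_glaucoma = [0.9, 0.8, 0.7, 0.6]
internal_control = [0.1, 0.2, 0.3, 0.65]
external_glaucoma = [0.9, 0.5, 0.4, 0.3]
external_control = [0.2, 0.35, 0.45, 0.6]

internal_auc = auc(internal_glaucoma, internal_control)
external_auc = auc(external_glaucoma, external_control)

print(f"internal AUC = {internal_auc:.4f}")  # 0.9375
print(f"external AUC = {external_auc:.4f}")  # 0.6250
```

In this toy case the same scoring rule separates the groups well on data resembling the training set but much less well on independently sampled data, which is the kind of gap an external test is designed to reveal.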

In light of this deficiency, the investigators conducted a prospective, cross-sectional study to externally validate the ability of ML models to detect glaucoma using optical coherence tomography (OCT) images. The study enrolled 514 Asian patients (257 with glaucoma and 257 controls without glaucoma) to construct ML models for glaucoma detection. The models were then evaluated in 356 Asian patients (183 with glaucoma and 173 controls without glaucoma) and in 138 Caucasian patients (57 with glaucoma and 81 controls without glaucoma).

Two sets of retinal nerve fiber layer (RNFL) thickness values were used to build two classifiers, respectively: compensated values produced by the compensation model, which the authors described as a multiple regression model fitted on healthy participants that corrects the RNFL profile for anatomic factors, and the original (measured) OCT data.
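The general idea of such a compensation step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' model: the two anatomic predictors, the synthetic data, the helper names `fit_linear` and `compensate`, and the choice to correct toward the healthy-group mean anatomy are all hypothetical.

```python
def fit_linear(X, y):
    """Ordinary least squares with an intercept, solved via the normal equations."""
    rows = [[1.0] + list(x) for x in X]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in rows) for j in range(k)]
        + [sum(row[i] * t for row, t in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda idx: abs(A[idx][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical anatomy per healthy eye: (foveal distance in mm, optic disc axis ratio).
healthy_anatomy = [(4.0, 1.0), (4.5, 1.1), (5.0, 1.0), (4.2, 1.3), (4.8, 1.2)]
# Synthetic RNFL thickness (micrometres) from a made-up linear rule, so the fit is exact.
healthy_rnfl = [100.0 + 2.0 * d - 3.0 * r for d, r in healthy_anatomy]

# coefs = [intercept, slope for foveal distance, slope for disc ratio]
coefs = fit_linear(healthy_anatomy, healthy_rnfl)
mean_anatomy = [sum(col) / len(col) for col in zip(*healthy_anatomy)]


def compensate(measured_rnfl, anatomy):
    """Subtract the RNFL shift explained by anatomy differing from the healthy mean."""
    shift = sum(b * (x - m) for b, x, m in zip(coefs[1:], anatomy, mean_anatomy))
    return measured_rnfl - shift


compensated = compensate(90.0, (5.5, 1.12))
print(round(compensated, 3))  # 88.0
```

Because the regression is fitted only on healthy eyes, the correction removes variation attributable to anatomy while leaving any thinning caused by disease in the compensated value.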

Data evaluation

With the exception of the foveal distance (P = .029), the investigators found no significant differences between the training dataset and the Asian test dataset (P ≥ .174).

They found significant differences in the demographic data between the training dataset and the Caucasian test dataset; the participants in the external test dataset were younger, more were female, and more eyes had mild and moderate glaucoma.

In addition, the ocular characteristics differed between those two datasets; specifically, the Caucasian participants had significantly shorter foveal distances, smaller foveal angles, less elliptical optic discs (ratio closer to 1.0), and thicker retinal vessel densities (P ≤ .009), the authors reported.

In the glaucoma dataset, the Caucasian participants had significantly less elliptical optic discs, higher optic disc orientations, and thicker retinal vessel densities. They were also more hyperopic and had greater RNFL thicknesses (P ≤ .001).

“Both the ML models (AUC = .96 and accuracy = 92%) outperformed the measured data (AUC = .93; P < .001) for glaucoma detection in the Asian dataset. However, in the Caucasian dataset, the ML model trained with compensated data (AUC = .93 and accuracy = 84%) outperformed the ML model trained with original data (AUC = .83 and accuracy = 79%; P < .001) and measured data (AUC = .82; P < .001) for glaucoma detection,” investigators reported.

In commenting on their findings, Dr Wong and colleagues said, “The results showed poor reproducibility of the performance with the ML model trained on original RNFL data across different datasets. In contrast, the performance of the ML model trained on compensated RNFL seemed to be maintained. To the best of our knowledge, our study is the first to assess the performance of ML classifiers to detect glaucoma between ethnicities.”

Evaluating ML performance in different ethnic groups is the next critical step in determining the model’s generalisability to other populations, they explained, and advised that care must be exercised in cohorts of patients representing different ethnic groups.


1. Li C, Chua J, Schwarzhans F, et al. Assessing the external validity of machine learning-based detection of glaucoma. Sci Rep. 2023;13(1):558. doi:10.1038/s41598-023-27783-1
2. Wang P, Shen J, Chang R, et al. Machine learning models for diagnosing glaucoma from retinal nerve fiber layer thickness maps. Ophthalmol Glaucoma. 2019;2(6):422-428. doi:10.1016/j.ogla.2019.08.004
3. An G, Omodaka K, Hashimoto K, et al. Glaucoma diagnosis with machine learning based on optical coherence tomography and color fundus images. J Healthc Eng. 2019;2019:4061313. doi:10.1155/2019/4061313
4. Kim SJ, Cho KJ, Oh S. Development of machine learning models for diagnosis of glaucoma. PLoS ONE. 2017;12(5):e0177726. doi:10.1371/journal.pone.0177726
5. An G, Omodaka K, Tsuda S, et al. Comparison of machine-learning classification models for glaucoma management. J Healthc Eng. 2018;2018:6874765. doi:10.1155/2018/6874765
6. Oh S, Park Y, Cho KJ, Kim SJ. Explainable machine learning model for glaucoma diagnosis and its interpretation. Diagnostics (Basel). 2021;11(3):510. doi:10.3390/diagnostics11030510
7. Ramspek CL, Jager KJ, Dekker FW, Zoccali C, van Diepen M. External validation of prognostic models: What, why, how, when and where? Clin Kidney J. 2020;14(1):49-58. doi:10.1093/ckj/sfaa188