Artificial intelligence as patient resource: findings from a chatbot investigation

Ophthalmologists graded AI bots' answers to questions from patients following cataract surgery

Eid and colleagues concluded that readability and accuracy are still issues for AI-generated patient education materials. Image credit: ©boyhey – stock.adobe.com

With the use of artificial intelligence (AI) growing exponentially in health care, it is important to determine how accurate, safe and usable the technology is in clinical practice.

Kevin Eid, MD, MSc, and colleagues examined how well AI answers patients’ questions following cataract surgery. Generally, the two chatbots analysed, ChatGPT and Google’s Bard (now called Gemini), did a good job of answering commonly asked questions, with specific differences observed between them.

Eid is a postdoctoral ophthalmic pathology/research fellow, Intermountain Ocular Research Center (Mamalis/Werner Lab), John A. Moran Eye Center, University of Utah Health, Salt Lake City. He presented the results of their study at the American Society of Cataract and Refractive Surgery annual meeting in Boston, Massachusetts.

Study focus

The investigators wanted to know whether large language models such as the ones under discussion provide safe, accurate and readable answers to patient inquiries about cataract surgery.

Previous studies have reported that chatbots were effective in a few areas: improving the readability of ophthalmology patient education materials,1 creating postoperative instructions in ophthalmology,2 and answering Ophthalmic Knowledge Assessment Programme questions.3

Study protocol

The investigators searched Quora and found 209 patient questions about cataract surgery. They asked the questions of ChatGPT and Bard, and ophthalmologists and ophthalmologists-in-training graded the answers for accuracy and safety, with 1 indicating extremely poor; 2, poor to mediocre; 3, good; and 4, excellent, Eid explained.

The version of ChatGPT used in this study was ChatGPT-4.

Readability was assessed using the SMOG index (Simple Measure of Gobbledygook), a readability formula commonly associated with the Gunning Fog Index readability formula (FOG Index).
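For readers unfamiliar with SMOG, the following is a minimal sketch of the published SMOG grade formula (McLaughlin, 1969), which estimates a reading grade from the number of words of three or more syllables in a sample of sentences. The article does not describe how the investigators computed their scores, so the function name and the example counts are illustrative assumptions only.

```python
import math


def smog_grade(polysyllable_count: int, sentence_count: int) -> float:
    """Estimate the SMOG reading grade from raw counts.

    polysyllable_count: words with three or more syllables in the sample
    sentence_count: number of sentences in the sample
    """
    if sentence_count <= 0:
        raise ValueError("sentence_count must be positive")
    # The formula is defined for a 30-sentence sample, so scale the count.
    normalized = polysyllable_count * (30 / sentence_count)
    return 1.0430 * math.sqrt(normalized) + 3.1291


# Hypothetical counts (not from the study): 12 polysyllabic words in 5 sentences.
print(round(smog_grade(12, 5), 1))  # prints 12.0
```

A result of about 12 corresponds to roughly a 12th-grade reading level. Counting syllables automatically is itself an approximation, which is why the sketch takes the counts as inputs rather than raw text.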

The SMOG results were also expressed on a scale of 1 to 4, with the scores corresponding to reading levels: 4, middle school; 3, high school; 2, college; and 1, graduate school. The American Medical Association recommends that patient education materials be written at a middle-school reading level, while most widely available patient education materials in ophthalmology are at a college reading level.
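The 1-to-4 scale can be pictured as a simple lookup from a reading grade to a score. The sketch below assumes US grade cutoffs of 8, 12 and 16 for middle school, high school and college; the article does not state the cutoffs the investigators used, so these thresholds and the function name are assumptions for illustration.

```python
# Labels taken from the scale described in the article.
LEVEL_LABELS = {
    4: "middle school",
    3: "high school",
    2: "college",
    1: "graduate school",
}


def readability_score(grade_level: float) -> int:
    """Map a reading grade level to the 1-to-4 scale (4 is easiest to read).

    The cutoffs below are assumed, not taken from the study.
    """
    if grade_level <= 8:
        return 4
    if grade_level <= 12:
        return 3
    if grade_level <= 16:
        return 2
    return 1


for grade in (7.5, 11.0, 14.2, 18.0):
    score = readability_score(grade)
    print(grade, score, LEVEL_LABELS[score])
```

Under this assumed mapping, a higher score means easier text, and a score of 4 would meet the American Medical Association recommendation.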

Sample questions and answers

Eid provided the following examples of questions and responses.

Q: How long does it take to see clearly after cataract surgery?

A: “Most patients experience significant improvement in their vision within the first few days after cataract surgery. However, it can take several weeks for your vision to fully stabilise and reach its optimal level. During this time, you may still experience some blurriness, sensitivity to light or halos around lights. This is normal and part of the healing process.”

Q: Can an optometrist perform cataract surgery?

A: “No, optometrists are not qualified to perform cataract surgery. This is a specialised procedure that requires extensive training and experience in ophthalmology. Your optometrist can diagnose cataracts and monitor your vision, but they will refer you to an ophthalmologist for surgery.”

Q: Can the lens placed in your eye during cataract surgery dislodge?

A: “IOLs are securely implanted within the eye and have a very low risk of dislodging. They’re designed to withstand everyday activities and physical impact. However, if you experience sudden vision loss or eye pain, contact your doctor immediately to rule out any potential complications.”

Accuracy/safety/readability scores

Eid reported that ChatGPT yielded 13 responses with at least one accuracy or safety grade below 3, and Bard yielded 29 such responses, a difference that reached significance (P = .01).

An average of about 10% of answers were considered poor.
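The arithmetic behind the roughly 10% figure can be checked with a short sketch. It assumes each chatbot answered all 209 questions, which the article implies but does not state outright. The article also does not say which statistical test produced P = .01; the Pearson chi-square test on a 2x2 table is used here purely as an illustrative assumption, and the helper name is hypothetical.

```python
import math

QUESTIONS = 209  # questions assumed to be posed to each chatbot
poor = {"ChatGPT": 13, "Bard": 29}  # responses graded below 3, per the article

# Share of poor responses for each chatbot, and the average of the two shares.
rates = {bot: count / QUESTIONS for bot, count in poor.items()}
average_rate = sum(rates.values()) / len(rates)

for bot, rate in rates.items():
    print(f"{bot}: {rate:.1%}")  # ChatGPT: 6.2%, Bard: 13.9%
print(f"Average: {average_rate:.1%}")  # Average: 10.0%


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its P value for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2, p_value = chi_square_2x2(
    poor["ChatGPT"], QUESTIONS - poor["ChatGPT"],
    poor["Bard"], QUESTIONS - poor["Bard"],
)
print(f"chi-square = {chi2:.2f}, P = {p_value:.3f}")
```

With these counts the test gives a P value of roughly .01, in line with the reported figure, although a test that accounts for both chatbots answering the same questions may have been used in the study.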

The readability scores for ChatGPT and Bard were 2.15 and 2.10, respectively, which did not differ significantly (P = .42).

The resulting cumulative averages for ChatGPT and Bard, 3.04 and 3.05, respectively, also did not differ significantly (P = .77).

Eid and colleagues concluded that ChatGPT and Bard are promising tools for answering common patient questions about cataract surgery. ChatGPT was more consistent in its answers, whereas Bard’s good answers were better and its bad answers were worse. Readability is still an issue for patient education materials.

Finally, the programmes provided inaccurate or unsafe answers about 10% of the time on average, a rate that is considered concerning.

References:

  1. Eid K, Eid A, Wang D, Raiker RS, Chen S, Nguyen J. Optimizing ophthalmology patient education via ChatBot-generated materials: readability analysis of AI-generated patient education materials and the American Society of Ophthalmic Plastic and Reconstructive Surgery Patient Brochures. Ophthalmic Plast Reconstr Surg. 2024;40(2):212-216. doi:10.1097/IOP.0000000000002549
  2. Nanji K, Yu CW, Wong TY, et al. Evaluation of postoperative ophthalmology patient instructions from ChatGPT and Google Search. Can J Ophthalmol. 2024;59(10):e69-e71. doi:10.1016/j.jcjo.2023.10.001
  3. Mihalache A, Popovic MM, Muni RH. Performance of an artificial intelligence chatbot in ophthalmic knowledge assessment. JAMA Ophthalmol. 2023;141(6):589-597. doi:10.1001/jamaophthalmol.2023.1144

Kevin Eid, MD, MSc | E: kevin.eid@utah.edu

Eid has no financial interest in this subject matter.

© 2024 MJH Life Sciences

All rights reserved.