
Led by Dr Sophia Wang, our research group aims to perform cutting-edge research at the interface of informatics and ophthalmology, rapidly and accurately delivering health insights from large volumes of data to improve ophthalmology patient care. We use and integrate a wide variety of data sources, both structured and unstructured, including national survey datasets, health insurance claims data, patient-generated online text, and electronic health records. We investigate outcomes of treatments for glaucoma and cataract, as well as other areas of ophthalmology, while developing and applying novel methods for automated extraction of ophthalmic data from free text.

Current Projects

We are currently working on projects in several areas, including the following: 

  • Developing, validating, and applying natural language processing methods for ophthalmology clinical text 
  • Developing artificial intelligence deep learning algorithms on electronic health records data to predict glaucoma progression 
  • Developing artificial intelligence deep learning algorithms on electronic health records data to predict glaucoma surgical outcomes 
  • Developing artificial intelligence deep learning algorithms on electronic health records data to predict visual prognosis of patients with low vision 
  • Developing artificial intelligence computer vision algorithms for cataract surgical videos, to detect the activity being performed, and to detect the identity and location of eye anatomical landmarks and instruments being used 
  • Investigating associations between glaucoma and a variety of risk factors using data from insurance claims and national registries 
  • Investigating glaucoma surgical outcomes using data from insurance claims, national registries, and electronic health records 


Stanford Optima Data Repository

At the center of our group's work lies a robust and expansive repository of Stanford ophthalmic data that fuels our research. Our curated data collection not only drives advances in ophthalmic research but also serves as a shared resource that fosters collaboration with diverse partners. Our datasets are extensive, encompassing structured EHR data, comprehensive imaging sets, and rich free-text clinical notes suited to research in artificial intelligence and ophthalmology.

Structured EHR Data

Our EHRs hold a wealth of structured data, including eye exam metrics such as intraocular pressure, central corneal thickness, visual acuity, and refraction. They also contain demographic data and detailed records of patient medications, diagnoses, and surgical histories.

Imaging Data

We house an extensive range of ophthalmic imaging data. Our repository contains retinal nerve fiber layer optical coherence tomography (RNFL OCT) PDF reports and the structured numerical data extracted from them, including metrics such as average RNFL thickness, cup-to-disc ratios, and quadrant and clock-hour thicknesses. We also have a variety of visual field PDF reports, with extracted structured numerical data for 10-2, 24-2, and 30-2 visual fields. In addition, we house raw fundus photographs and a variety of 2D and 3D optic nerve head, macular cube, and retinal images.

Free-text Clinical Notes 

Our database includes a substantial collection of free-text clinical and progress notes. These notes enrich our structured datasets, provide further context for our empirical data, and are instrumental for natural language processing (NLP) research.