Ophthalmology AI is having a moment. Diabetic retinopathy screening systems are getting FDA clearance. Glaucoma progression detectors are entering NHS pilots. AMD classifiers are showing up in Australian optometry chains. Saudi Vision 2030's healthcare push includes large-scale fundus-based screening for diabetes complications.
All of this requires labeled training data that meets clinical-grade standards. Not "research-grade" — clinical. The bar is higher than most ML engineers realise on day one, and missing it kills FDA submissions. This guide covers what good ophthalmology annotation looks like.
Diabetic Retinopathy: Pick Your Grading Scale Carefully
DR is the most-targeted ophthalmology AI category. The three grading scales in active use:
- ICDR (International Clinical Diabetic Retinopathy): 5-class severity scale (None, Mild NPDR, Moderate NPDR, Severe NPDR, PDR). The dominant choice for global AI screening systems. Default for FDA submissions.
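- NHS DR Grading: R0/R1/R2/R3 for retinopathy, M0/M1 for maculopathy. The standard for UK and many Commonwealth health system deployments. The R grade maps to ICDR only approximately, because several ICDR classes collapse into one R grade; the M grade is not captured by the ICDR severity class and needs its own label.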
- ETDRS (Early Treatment Diabetic Retinopathy Study): The most granular scale, with sub-classifications within severe NPDR. Rarely used outside clinical trials.
The recommendation: Annotate ICDR as the primary label and derive the NHS DR retinopathy grade as a secondary label through a documented mapping, capturing maculopathy separately. This gives you both scales for downstream flexibility without doubling annotation cost.
A common failure mode: teams pick ETDRS to "future-proof", discover the labeling cost is 2x ICDR with no benefit for screening applications, then re-annotate. Lock the scale before pilot.
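The derived-label idea can be sketched as a small lookup. This is a minimal illustration, not a validated clinical mapping: the function name `derive_nhs_r` and the `moderate_as` parameter are hypothetical, and the class-to-grade assignments are a simplifying assumption that a clinical lead would need to sign off. Moderate NPDR is made configurable here because it does not fall cleanly into a single NHS grade, and maculopathy (M0/M1) is deliberately left out because it cannot be derived from the severity class.

```python
from enum import IntEnum


class ICDR(IntEnum):
    """The five ICDR severity classes."""
    NONE = 0
    MILD_NPDR = 1
    MODERATE_NPDR = 2
    SEVERE_NPDR = 3
    PDR = 4


def derive_nhs_r(grade: ICDR, moderate_as: str = "R2") -> str:
    """Derive an approximate NHS retinopathy (R) grade from an ICDR class.

    ASSUMPTION: the assignments below are illustrative only. Where moderate
    NPDR lands (R1 or R2) is a protocol decision, so it is a parameter.
    """
    if moderate_as not in ("R1", "R2"):
        raise ValueError("moderate_as must be 'R1' or 'R2'")
    mapping = {
        ICDR.NONE: "R0",
        ICDR.MILD_NPDR: "R1",
        ICDR.MODERATE_NPDR: moderate_as,
        ICDR.SEVERE_NPDR: "R2",
        ICDR.PDR: "R3",
    }
    return mapping[ICDR(grade)]


if __name__ == "__main__":
    for g in ICDR:
        print(g.name, "->", derive_nhs_r(g))
```

Keeping the mapping in one versioned function means the secondary label can be regenerated whenever the protocol changes, without touching the primary annotations.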
Glaucoma: Cup-to-Disc Ratio Is Not Enough
Glaucoma AI projects often start by asking annotators for cup-to-disc ratio (CDR) as a single number per image. That label is useful but lossy — you can't derive structural analysis from it downstream.
Better approach: annotate full optic disc and cup segmentation as polygon masks. CDR derives automatically; you also get cup geometry for asymmetry analysis, neuroretinal rim measurement, and rim-to-disc area ratio. The annotation cost is only marginally higher and the downstream flexibility is large.
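As a rough sketch of how CDR and rim metrics fall out of polygon annotations, the snippet below computes them from two vertex lists. It assumes polygons are given as (x, y) pixel coordinates, that the cup lies inside the disc, and that "vertical" means the image y-axis; the function names are illustrative rather than part of any particular annotation tool.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y) in pixel coordinates


def polygon_area(points: List[Point]) -> float:
    """Area of a simple (non-self-intersecting) polygon, shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def vertical_extent(points: List[Point]) -> float:
    """Height of the polygon along the image y-axis."""
    ys = [y for _, y in points]
    return max(ys) - min(ys)


def disc_metrics(disc: List[Point], cup: List[Point]) -> Dict[str, float]:
    """Derive CDR and rim metrics from disc and cup polygons.

    ASSUMPTION: the cup polygon is fully contained in the disc polygon.
    """
    disc_area = polygon_area(disc)
    cup_area = polygon_area(cup)
    return {
        "vertical_cdr": vertical_extent(cup) / vertical_extent(disc),
        "area_cdr": cup_area / disc_area,
        "rim_to_disc_area": (disc_area - cup_area) / disc_area,
    }


if __name__ == "__main__":
    disc = [(0, 0), (100, 0), (100, 100), (0, 100)]
    cup = [(25, 20), (75, 20), (75, 80), (25, 80)]
    print(disc_metrics(disc, cup))
```

For the toy rectangles in the usage example, the vertical CDR is 0.6 and the rim-to-disc area ratio is 0.7. A single per-image CDR label could not have produced the second number.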
For OCT-based glaucoma models, RNFL (retinal nerve fiber layer) defect annotation is the standard: layer segmentation on OCT B-scans, plus thickness-map labeling for circumpapillary scans. Annotators with specialist OCT experience are non-negotiable here.
AMD: Multi-Modal Data Is the Default
Wet vs dry AMD classification on fundus alone gets you part of the way. AMD progression and treatment-response AI require OCT to capture the layer-level changes (RPE atrophy, drusen volume, fluid accumulation) that distinguish AMD stages.
Standard AMD annotation stack:
- Fundus: drusen presence + size class, geographic atrophy boundaries, hemorrhage detection
- OCT: macular layer segmentation, drusen volume (3D), CNV (choroidal neovascularization) detection if wet AMD
- OCT angiography (if available): CNV vascularity quantification
OCT Layer Segmentation: The Specialist Skill
Macular OCT layer segmentation is one of the most specialised annotation tasks in medical imaging. Up to a dozen retinal layers (the exact count depends on the protocol) need to be delineated correctly across B-scans, with consistency across patients and scanners. Annotation errors compound: a misplaced ILM (inner limiting membrane) boundary on a single B-scan distorts the thickness map and every volume-level measurement derived from it.
Practical requirements: OCT-specialist annotators (not generalist medical annotators), volume-level consistency QA (not just per-slice), and adjudication by retinal subspecialists for ambiguous boundaries. This is one task where crowdsourced annotation produces unusable data even when individual annotators are high-quality.
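One simple form of volume-level QA is to check that an annotated boundary does not jump implausibly between adjacent B-scans. The sketch below assumes each boundary is stored as one list of depth values (pixel rows) per B-scan, one value per A-scan column. The function name and the 5-pixel default threshold are hypothetical and would need tuning to the scanner's axial resolution and B-scan spacing.

```python
from typing import List


def flag_boundary_jumps(
    boundary: List[List[float]], max_mean_jump_px: float = 5.0
) -> List[int]:
    """Return indices i where B-scan i jumps relative to B-scan i - 1.

    boundary[i][c] is the annotated depth (pixel row) of one layer boundary
    at column c of B-scan i. A slice is flagged when the mean absolute
    depth difference to the previous slice exceeds the threshold.
    An isolated bad slice therefore produces two consecutive flags:
    one entering it and one leaving it.
    """
    flagged = []
    for i in range(1, len(boundary)):
        prev, curr = boundary[i - 1], boundary[i]
        if len(prev) != len(curr) or not curr:
            raise ValueError(f"B-scans {i - 1} and {i} have mismatched widths")
        mean_jump = sum(abs(a - b) for a, b in zip(prev, curr)) / len(curr)
        if mean_jump > max_mean_jump_px:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Five B-scans, four columns each; slice 2 is annotated 20 px too deep.
    volume = [[100.0] * 4, [100.0] * 4, [120.0] * 4, [100.0] * 4, [100.0] * 4]
    print(flag_boundary_jumps(volume))
```

A check like this only surfaces candidates for review. Genuine anatomy or pathology can also produce steep changes, so flagged slices would still go to a specialist for adjudication.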
Pediatric Ophthalmology: ROP Annotation
Retinopathy of Prematurity AI has a distinct annotation stack: ICROP stage (1-5), zone (I, II, III), plus disease status (none, pre-plus, plus), and recognition of aggressive posterior ROP (APROP, renamed aggressive ROP or A-ROP in ICROP3).
Annotation requires pediatric ophthalmology subspecialty knowledge. ROP findings differ in appearance from adult diseases and standard adult-retinal training does not transfer cleanly. For pediatric AI projects, insist on subspecialty-credentialed annotators.
FDA-Ready Annotation: Documentation Matters
If your ophthalmology AI is targeting FDA submission, your annotation documentation needs to support the regulatory file. Specifically:
- Annotator credentials documented and traceable (board certifications, subspecialty)
- Annotation protocol versioned with sign-off from clinical lead
- Inter-annotator agreement per task class (Cohen's κ for two graders; weighted κ for ordinal grades such as DR severity)
- Adjudication trail for disagreement resolution
- Per-image provenance log (who annotated, when, with what protocol version)
- Quality metric reporting on every delivery batch
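For the agreement metric in the list above, here is a minimal pure-Python sketch of Cohen's κ for two graders, with an optional quadratic weighting that is often reported for ordinal scales such as DR severity. The function name and signature are illustrative. Labels are assumed to be integers in `0..n_classes-1`, and a real pipeline would more likely rely on an established statistics library than on hand-rolled code.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(
    a: Sequence[int], b: Sequence[int], n_classes: int, weighted: bool = False
) -> float:
    """Cohen's kappa for two raters; quadratic weights if weighted=True.

    Computed as 1 - (observed disagreement / chance-expected disagreement).
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("rater label lists must be non-empty and equal length")
    if n_classes < 2:
        raise ValueError("need at least two classes")
    n = len(a)

    def disagreement(i: int, j: int) -> float:
        if weighted:
            return (i - j) ** 2 / (n_classes - 1) ** 2
        return 0.0 if i == j else 1.0

    pairs = Counter(zip(a, b))         # joint counts of (rater A, rater B)
    count_a, count_b = Counter(a), Counter(b)  # marginal counts per rater
    observed = sum(disagreement(i, j) * c for (i, j), c in pairs.items()) / n
    expected = sum(
        disagreement(i, j) * count_a[i] * count_b[j]
        for i in range(n_classes)
        for j in range(n_classes)
    ) / (n * n)
    if expected == 0:
        raise ValueError("kappa is undefined: both raters used a single class")
    return 1.0 - observed / expected


if __name__ == "__main__":
    grader_1 = [0, 1, 2, 3, 4]
    grader_2 = [0, 1, 2, 3, 3]
    print(cohens_kappa(grader_1, grader_2, n_classes=5))
    print(cohens_kappa(grader_1, grader_2, n_classes=5, weighted=True))
```

The weighted variant penalises a Severe NPDR versus PDR disagreement less than a None versus PDR one, which is usually closer to how grading errors matter clinically. With more than two graders, a multi-rater statistic such as Fleiss' κ would be the usual substitute.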
Ophthalmology annotation for your project
Free 25-50 image pilot in 72 hours. Board-certified ophthalmologists, protocol-aligned grading, FDA-ready documentation.
Neel Bennett
AI Annotation Specialist at AI Taggers
Neel has over 8 years of experience in AI training data and machine learning operations. He specializes in helping enterprises build high-quality datasets for computer vision and NLP applications across healthcare, automotive, and retail industries.