Most teams who land on a polygon brief came to it from one of two directions. Either they tried to train an object detector on bounding boxes and the model kept including background as “part of the object” — fashion AI is the textbook case. Or they spec'd full pixel segmentation, paid double, and realised six weeks in that polygon-level fidelity would've been enough.
This guide is the version we wish had landed in those briefs on day one: what a polygon actually is, when it's the right tool, the vertex-count discipline that decides whether your project comes in on budget, and how polygon QA actually works. Honest, opinionated, no vendor-deck energy.
What a Polygon Actually Is
A polygon is an ordered list of (x, y) vertices that, joined in sequence, trace the outline of an object. A 12-vertex polygon around a car is twelve points the annotator placed along the car's silhouette. The model learns “everything inside this closed shape is the object” — which, unlike a bounding box, doesn't drag a halo of background into the training signal.
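To make the "ordered list of vertices" idea concrete, here is a minimal Python sketch. The function names and the L-shaped example are ours, chosen for illustration. It shows two things the closed shape gives you: the enclosed area (shoelace formula) and an inside/outside test (even-odd ray casting).

```python
from typing import List, Tuple

Point = Tuple[float, float]


def shoelace_area(vertices: List[Point]) -> float:
    """Area enclosed by an ordered vertex list (shoelace formula)."""
    n = len(vertices)
    if n < 3:
        raise ValueError("a polygon needs at least 3 vertices")
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the shape
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def contains(vertices: List[Point], point: Point) -> bool:
    """Even-odd ray casting: count edge crossings of a ray going right."""
    x, y = point
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical L-shaped object: its bounding box is 4 x 4,
# but the shape itself only covers part of that box.
l_shape = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
area = shoelace_area(l_shape)
inside_shape = contains(l_shape, (1, 1))
inside_box_only = contains(l_shape, (3, 3))
```

The point (3, 3) sits inside the bounding box but outside the polygon, which is exactly the background a box would have labelled as object.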
The trade-off is straightforward. A box is four numbers and takes seconds. A polygon with N vertices is 2N numbers and takes minutes. The model benefit is real, but only when the object actually has an irregular shape: putting a polygon around a rectangular shipping container is paying for accuracy you don't need.
The 30% Rule: When To Reach For a Polygon
The honest test we apply on every incoming brief: if a bounding box around the object would be more than 30% background, a polygon will train a noticeably better model. At less than 30%, the box is fine and you save the money.
- Box is fine — vehicles head-on, parcels in a warehouse, books on a shelf, anything roughly rectangular.
- Polygon wins — garments on a model, fruit on a tree, irregular logos, vehicles at 3/4 view, animals in motion, retail products with curved silhouettes, drone shots of irregular fields and crops.
- Segmentation wins — anything that needs pixel-exact area (lane surface, organ outlines, fashion try-on masks).
For the box-vs-polygon end of this decision, we go deep in the bounding box guide. For the polygon-vs-segmentation end, see the image segmentation guide.
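One way to apply the 30% test is to trace a small sample of objects and measure how much of each bounding box is background. The sketch below assumes you already have a traced polygon per sampled object. The function names and the default threshold argument are illustrative, not a fixed method.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def shoelace_area(vertices: List[Point]) -> float:
    """Area enclosed by an ordered vertex list."""
    n = len(vertices)
    total = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def background_fraction(vertices: List[Point]) -> float:
    """Share of the axis-aligned bounding box NOT covered by the polygon."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    box_area = (max(xs) - min(xs)) * (max(ys) - min(ys))
    if box_area == 0:
        raise ValueError("degenerate polygon: bounding box has zero area")
    return 1.0 - shoelace_area(vertices) / box_area


def recommend(vertices: List[Point], threshold: float = 0.30) -> str:
    """Apply the 30% rule: polygon if the box would be mostly background."""
    return "polygon" if background_fraction(vertices) > threshold else "box"


# Hypothetical samples: a rectangular parcel and a triangular object.
parcel = [(0, 0), (10, 0), (10, 5), (0, 5)]
triangle = [(0, 0), (10, 0), (0, 10)]
parcel_choice = recommend(parcel)      # box covers the parcel exactly
triangle_choice = recommend(triangle)  # half the box is background
```

In practice you would average the background fraction over a few dozen sampled objects per class rather than decide from a single instance.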
The Vertex-Count Discipline (Where Polygon Projects Live or Die)
The single biggest determinant of polygon project cost and quality is vertex discipline. Two failure modes show up on almost every incoming dataset audit:
- Under-vertexing. Annotators draw 4–6 vertex polygons around objects with curved silhouettes. The result is effectively a tilted box that misses 20% of the actual shape. The model trains barely better than on bounding boxes — and you paid polygon rates.
- Over-vertexing. Annotators put 60–80 vertices on a banana. Looks beautiful on the QA dashboard. Doubles annotation time. Model accuracy is statistically identical to the same shape with 12 vertices. Pure cost burn.
The fix is a vertex budget written into the annotation spec, per class. Cars at 3/4 view — 12–20 vertices. Pedestrians — 16–24. Bananas — 8–12. Crisp-edge logos — corner-driven, no smoothing fluff. Polygons that fall outside the band get flagged in QA and re-traced. Boring, effective, the difference between a project that ships and one that runs over.
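A vertex budget is easy to enforce mechanically. The sketch below uses the bands quoted above. The class keys, the annotation dictionary layout, and the function names are hypothetical, so adapt them to your own label schema.

```python
from typing import Dict, List, Tuple

# Per-class (min, max) vertex bands taken from the spec above.
# The class keys are illustrative names, not a standard taxonomy.
VERTEX_BUDGET: Dict[str, Tuple[int, int]] = {
    "car_three_quarter": (12, 20),
    "pedestrian": (16, 24),
    "banana": (8, 12),
}


def check_vertex_budget(class_name: str, vertex_count: int) -> str:
    """Return 'under', 'over', 'ok', or 'unbudgeted' for unknown classes."""
    if class_name not in VERTEX_BUDGET:
        return "unbudgeted"
    low, high = VERTEX_BUDGET[class_name]
    if vertex_count < low:
        return "under"
    if vertex_count > high:
        return "over"
    return "ok"


def flag_for_retrace(annotations: List[dict]) -> List[Tuple[int, str]]:
    """List (annotation id, status) for every polygon outside its band."""
    flagged = []
    for ann in annotations:
        status = check_vertex_budget(ann["class"], len(ann["vertices"]))
        if status != "ok":
            flagged.append((ann["id"], status))
    return flagged


# Hypothetical batch: placeholder vertices, only the counts matter here.
batch = [
    {"id": 1, "class": "banana", "vertices": [(0, 0)] * 10},
    {"id": 2, "class": "banana", "vertices": [(0, 0)] * 70},
    {"id": 3, "class": "pedestrian", "vertices": [(0, 0)] * 5},
]
flagged = flag_for_retrace(batch)
```

Running a check like this at submission time, instead of at final QA, keeps re-tracing cheap because the annotator still has the image open.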
Polygon vs Box vs Segmentation: The Honest Cost Comparison
Rough per-image cost ratios we see on production projects (varies with class density and complexity):
- Bounding box — baseline (1x).
- Tight bounding box with high IoU discipline — about 1.3x.
- Polygon (12–20 vertex range) — typically 3–5x.
- 3D cuboid — 6–10x (3D, depth, orientation work).
- Semantic / instance segmentation — 8–15x (pixel-level masks).
The point isn't the exact ratios — they shift with class density and image complexity. The point is that polygon is the right tool for a specific middle band, and projects that drift into polygon territory when boxes were fine, or that drift into segmentation when polygons were enough, are the two cheapest ways to over-pay. Match the tool to the question.
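For budgeting, the ratios above turn into a quick range estimate. This is a rough sketch only: the per-image box price is a made-up placeholder, and the ratio table simply restates the list above.

```python
from typing import Dict, Tuple

# (low, high) cost multipliers relative to a plain bounding box,
# restating the rough ratios listed above.
COST_RATIO: Dict[str, Tuple[float, float]] = {
    "bounding_box": (1.0, 1.0),
    "tight_bounding_box": (1.3, 1.3),
    "polygon": (3.0, 5.0),
    "cuboid_3d": (6.0, 10.0),
    "segmentation": (8.0, 15.0),
}


def estimate_cost(annotation_type: str, n_images: int,
                  box_cost_per_image: float) -> Tuple[float, float]:
    """Return a (low, high) project cost range in the same currency unit."""
    low, high = COST_RATIO[annotation_type]
    baseline = n_images * box_cost_per_image
    return baseline * low, baseline * high


# Hypothetical project: 10,000 images at an assumed 0.50 per boxed image.
polygon_range = estimate_cost("polygon", 10_000, 0.5)
segmentation_range = estimate_cost("segmentation", 10_000, 0.5)
```

Comparing the two ranges side by side makes the cost of drifting from polygons into segmentation visible before the brief is signed.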
Formats and Tools
COCO JSON is the dominant format: each polygon is stored as a flat [x1, y1, x2, y2, ...] coordinate list in the annotation's segmentation field, linked to its image and class by image_id and category_id. GeoJSON is the geospatial cousin, common for satellite and drone work. Pascal VOC XML with polygon extensions exists in older pipelines.
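As a rough illustration of the two main formats, the sketch below converts an ordered vertex list into a COCO-style annotation record and a GeoJSON Feature. The helper names and example IDs are ours. The GeoJSON helper assumes the vertices are already geographic (longitude, latitude) coordinates, since georeferencing pixel coordinates is a separate step not shown here.

```python
import json
from typing import List, Tuple

Point = Tuple[float, float]


def to_coco_annotation(vertices: List[Point], annotation_id: int,
                       image_id: int, category_id: int) -> dict:
    """Build a COCO-style annotation dict from pixel-space vertices."""
    flat = [float(c) for xy in vertices for c in xy]  # [x1, y1, x2, y2, ...]
    xs, ys = flat[0::2], flat[1::2]
    n = len(vertices)
    twice_area = sum(xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i]
                     for i in range(n))
    return {
        "id": annotation_id,
        "image_id": image_id,
        "category_id": category_id,
        "segmentation": [flat],  # a list of polygons, each a flat list
        "bbox": [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)],
        "area": abs(twice_area) / 2.0,
        "iscrowd": 0,
    }


def to_geojson_feature(lonlat_vertices: List[Point],
                       properties: dict) -> dict:
    """Build a GeoJSON Feature; the ring is closed by repeating vertex 0."""
    ring = [list(p) for p in lonlat_vertices]
    if ring[0] != ring[-1]:
        ring.append(list(ring[0]))
    return {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": [ring]},
        "properties": properties,
    }


# Hypothetical examples: a garment outline in pixels and a field outline.
coco_ann = to_coco_annotation([(10, 20), (30, 20), (30, 50), (10, 50)],
                              annotation_id=1, image_id=7, category_id=3)
field = to_geojson_feature([(0.0, 0.0), (0.01, 0.0), (0.01, 0.01), (0.0, 0.01)],
                           {"label": "field"})
coco_json = json.dumps(coco_ann)
```

Note the two conventions that trip up conversions most often: COCO bounding boxes are [x, y, width, height] rather than two corners, and GeoJSON rings repeat the first position at the end.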
On the tooling side, polygon-capable tools (CVAT, Label Studio, the commercial suites) all support basic draw-and-edit. The features that actually matter for production speed vary widely between tools: magnetic edge snapping, AI-assisted boundary suggestion, vertex-budget warnings, and bulk class re-assignment. Pick on those, not on the polished demo.
Quality: IoU, mAP, and the Vertex-Distance Metric Most People Skip
Standard polygon QA tracks IoU against gold-standard polygons (threshold 0.7–0.8 for production work), per-class mAP on a held-out set, and inter-annotator agreement on the gold set. The metric worth adding, and the one most vendors skip, is mean vertex-distance error from the gold polygon. IoU alone can mask sloppy tracing, where the polygon encloses roughly the right area but the boundary wanders. Vertex distance catches that. The general framework lives in the annotation QA playbook.
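Both metrics can be sketched in plain Python. Two assumptions to flag: vertex-distance error is defined here as the mean distance from each annotated vertex to the nearest point on the gold boundary, which is one reasonable reading rather than a standard definition, and IoU is approximated by sampling pixel centres instead of exact polygon clipping.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def point_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Shortest distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def mean_vertex_distance(candidate: List[Point], gold: List[Point]) -> float:
    """Mean distance from each candidate vertex to the gold boundary."""
    edges = list(zip(gold, gold[1:] + gold[:1]))
    dists = [min(point_segment_distance(v, a, b) for a, b in edges)
             for v in candidate]
    return sum(dists) / len(dists)


def point_in_polygon(vertices: List[Point], point: Point) -> bool:
    """Even-odd ray casting."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        if (y1 > y) != (y2 > y):
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside


def raster_iou(poly_a: List[Point], poly_b: List[Point],
               width: int, height: int) -> float:
    """Approximate IoU by testing every pixel centre in a width x height grid."""
    inter = union = 0
    for row in range(height):
        for col in range(width):
            centre = (col + 0.5, row + 0.5)
            in_a = point_in_polygon(poly_a, centre)
            in_b = point_in_polygon(poly_b, centre)
            inter += in_a and in_b
            union += in_a or in_b
    return inter / union if union else 0.0


# Hypothetical pair: a gold square and the same square shifted 2 px right.
gold = [(0, 0), (10, 0), (10, 10), (0, 10)]
traced = [(2, 0), (12, 0), (12, 10), (2, 10)]
iou = raster_iou(gold, traced, width=12, height=10)
vertex_error = mean_vertex_distance(traced, gold)
```

The per-pixel loop is fine for spot checks on small crops. For full datasets a geometry library or a vectorised rasteriser would be the more practical choice.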
Need polygons done properly?
Free 50-image polygon pilot — vertex-budgeted, per-class accuracy reporting, COCO / GeoJSON output. 48-hour turnaround.
See our polygon annotation service
Where Polygons Get Used in Production
- Retail and e-commerce — product detection on cluttered shelves, garment outlines on models, logo recognition. Boxes lose too much background to be useful.
- Agriculture — fruit and crop outlines on drone imagery — see the agriculture annotation guide.
- Aerial and geospatial — buildings, fields, infrastructure assets from drone or satellite imagery. GeoJSON polygons feed straight into GIS pipelines.
- Manufacturing — irregular defects, component outlines for quality inspection.
- Wildlife and conservation — animal outlines on camera-trap imagery, where boxes drag in too much background.
Related Reading
- → Polygon annotation service
- → Bounding box annotation guide
- → Image segmentation annotation guide
- → Instance segmentation service
- → Semantic segmentation service
- → Annotation QA playbook
Get a 50-image polygon pilot in 48 hours
Send a representative sample — we'll deliver vertex-budgeted polygons in COCO or GeoJSON with per-class accuracy on the gold set.
Neel Bennett
AI Annotation Specialist at AI Taggers
Neel has over 8 years of experience in AI training data and machine learning operations. He specializes in helping enterprises build high-quality datasets for computer vision and NLP applications across healthcare, automotive, and retail industries.