What is polygon annotation?

Polygon annotation is the labelling of an object in an image by tracing its outline with a sequence of vertices, producing a closed polygon that follows the actual shape of the object. Unlike a bounding box (always a rectangle), a polygon expresses irregular shapes — garments, logos, fruit, vehicles at angles — without including the background that a box would. It is the right tool when shape matters and pixel-level segmentation would be overkill.

When does polygon annotation beat a bounding box?

Whenever the bounding box would include a lot of background. Garments on a model, irregular logos, fruit and animals with non-rectangular silhouettes, vehicles at oblique angles, retail products in cluttered scenes. The honest test — if a box around the object would be more than 30% background, a polygon trains a better model. If it would be less than 30%, the box is fine and cheaper.

When does polygon annotation beat segmentation?

When pixel-perfect masks aren't needed and a vertex-traced outline is accurate enough. Polygons are cheaper than segmentation per image (typically 30-60% less) and faster to annotate. The dividing line: if your model needs exact area or boundary pixels (lane surface, organ outlines, fashion try-on), use segmentation. If approximate shape is enough (product detection, logo recognition, fruit counting), polygons are the right call.

How many vertices should a polygon have?

Enough to capture the object's actual silhouette, no more. A car at 3/4 view takes 12-20 vertices to trace properly; a banana takes 8-12; a logo with sharp corners takes one vertex per corner plus a few smoothing points on curves. Annotators left without a vertex budget either under-spec (boxy 4-vertex polygons that train no better than a bounding box) or over-spec (60+ vertices on simple shapes, doubling cost for no model gain). Lock the vertex discipline in the spec.

What formats are used for polygon annotation?

COCO JSON is the dominant format: each polygon is stored as a flat list of alternating x and y coordinates ([x1, y1, x2, y2, ...]) in the annotation's segmentation field, alongside its category and image IDs. GeoJSON is common for geospatial polygons. Some pipelines use Pascal VOC XML with polygon extensions, though VOC was designed around bounding boxes. Whatever format you pick, lock it before annotation starts: converting polygon coordinates across formats is reliable but tedious, and coordinate rounding during round-trips shifts vertices, which adds up on long vertex lists.

How is polygon annotation priced?

Two common models — per-polygon (more common, simpler to scope) or per-vertex (more accurate for complex shapes, more annoying to forecast). Most enterprise contracts settle on per-polygon with a vertex-count band — e.g. one rate for 4-15 vertex polygons, a higher rate for 16-40, and a custom rate above 40. The cost drivers are vertex count, occlusion density, and how tight the tracing tolerance is.
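The banded per-polygon model can be sketched in a few lines. The rates below are hypothetical placeholders (in cents) chosen only to make the arithmetic visible; they are not real market prices, and the function names are illustrative.

```python
# Hypothetical rate bands: (min_vertices, max_vertices, rate_in_cents).
# The band edges follow the example in the text; the rates are invented.
RATE_BANDS = [
    (4, 15, 10),
    (16, 40, 25),
]


def rate_for_polygon(vertex_count, bands=RATE_BANDS):
    """Return the per-polygon rate in cents, or None if a custom quote is needed."""
    for low, high, rate in bands:
        if low <= vertex_count <= high:
            return rate
    return None  # outside every band: priced case by case


def estimate_batch(vertex_counts, bands=RATE_BANDS):
    """Return (total_cents_for_banded_polygons, number_needing_custom_quote)."""
    total_cents = 0
    custom = 0
    for count in vertex_counts:
        rate = rate_for_polygon(count, bands)
        if rate is None:
            custom += 1
        else:
            total_cents += rate
    return total_cents, custom


batch_total, batch_custom = estimate_batch([8, 12, 20, 55])
print(batch_total, batch_custom)
```

Keeping money in integer cents avoids floating-point drift when a batch runs to hundreds of thousands of polygons.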

How is polygon annotation quality measured?

Intersection over Union (IoU) against a gold-standard polygon, typically thresholded at 0.7-0.8 for production work. Mean Average Precision (mAP) per class on a held-out set. Inter-annotator agreement on the gold set. A polygon-specific metric worth tracking — vertex-distance error against the gold (mean distance of annotator vertices from the nearest gold vertex), which catches sloppy tracing that IoU alone misses.

Polygon Annotation: When Polygons Beat Bounding Boxes (2026 Guide)

Most teams who land on a polygon brief came to it from one of two directions. Either they tried to train an object detector on bounding boxes and the model kept including background as “part of the object” — fashion AI is the textbook case. Or they spec'd full pixel segmentation, paid double, and realised six weeks in that polygon-level fidelity would've been enough.

This guide is the version we wish had landed in those briefs on day one: what a polygon actually is, when it's the right tool, the vertex-count discipline that decides whether your project comes in on budget, and how polygon QA actually works. Honest, opinionated, no vendor-deck energy.

What a Polygon Actually Is

A polygon is an ordered list of (x, y) vertices that, joined in sequence, trace the outline of an object. A 12-vertex polygon around a car is twelve points the annotator placed along the car's silhouette. The model learns “everything inside this closed shape is the object” — which, unlike a bounding box, doesn't drag a halo of background into the training signal.

The trade-off is straightforward. A box is four numbers and takes seconds. A polygon with N vertices is 2N numbers and takes minutes. The model benefit is real, but only when the object actually has an irregular shape. Putting a polygon around a rectangular shipping container is paying for accuracy you don't need.

The 30% Rule: When To Reach For a Polygon

Honest test we apply on every incoming brief — if a bounding box around the object would be more than 30% background, a polygon will train a noticeably better model. Less than 30%, the box is fine and you save the money.

Box is fine — vehicles head-on, parcels in a warehouse, books on a shelf, anything roughly rectangular.
Polygon wins — garments on a model, fruit on a tree, irregular logos, vehicles at 3/4 view, animals in motion, retail products with curved silhouettes, drone shots of irregular fields and crops.
Segmentation wins — anything that needs pixel-exact area (lane surface, organ outlines, fashion try-on masks).
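The 30% test can be run on a handful of sample outlines before committing to a spec. The sketch below assumes simple (non-self-intersecting) polygons in pixel coordinates and an axis-aligned bounding box; the function names are illustrative, and the threshold is the 30% figure from the rule above.

```python
def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula.

    vertices: ordered list of (x, y) pairs; the closing edge is implied.
    """
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def box_background_fraction(vertices):
    """Share of the axis-aligned bounding box that the polygon does NOT cover."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    box_area = (max(xs) - min(xs)) * (max(ys) - min(ys))
    if box_area == 0:
        raise ValueError("degenerate polygon: bounding box has zero area")
    return 1.0 - polygon_area(vertices) / box_area


def recommend_tool(vertices, threshold=0.30):
    """Apply the 30% rule: 'polygon' if the box would be mostly background."""
    return "polygon" if box_background_fraction(vertices) > threshold else "box"


# A right triangle fills half of its bounding box.
triangle = [(0, 0), (100, 0), (0, 100)]
# A rectangle fills its bounding box completely.
rectangle = [(10, 10), (90, 10), (90, 50), (10, 50)]

print(recommend_tool(triangle), recommend_tool(rectangle))
```

Running this over a few dozen representative objects per class gives a rough per-class answer rather than one decision for the whole dataset.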

For the box-vs-polygon end of this decision, we go deep in the bounding box guide. For the polygon-vs-segmentation end, see the image segmentation guide.

The Vertex-Count Discipline (Where Polygon Projects Live or Die)

The single biggest determinant of polygon project cost and quality is vertex discipline. Two failure modes show up on almost every incoming dataset audit:

Under-vertexing. Annotators draw 4–6 vertex polygons around objects with curved silhouettes. The result is effectively a tilted box that misses 20% of the actual shape. The model trains barely better than on bounding boxes — and you paid polygon rates.
Over-vertexing. Annotators put 60–80 vertices on a banana. Looks beautiful on the QA dashboard. Doubles annotation time. Model accuracy is statistically identical to the same shape with 12 vertices. Pure cost burn.

The fix is a vertex budget written into the annotation spec, per class. Cars at 3/4 view — 12–20 vertices. Pedestrians — 16–24. Bananas — 8–12. Crisp-edge logos — corner-driven, no smoothing fluff. Polygons that fall outside the band get flagged in QA and re-traced. Boring, effective, the difference between a project that ships and one that runs over.
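A vertex budget is easy to enforce mechanically. The following is a minimal sketch, assuming annotations arrive as dictionaries with hypothetical `id`, `label`, and `vertices` keys; the class names are made up, and the bands are the ones quoted above.

```python
# Per-class vertex bands (min, max). Class names are illustrative.
VERTEX_BUDGET = {
    "car_three_quarter": (12, 20),
    "pedestrian": (16, 24),
    "banana": (8, 12),
}


def check_vertex_budget(annotations, budget=VERTEX_BUDGET):
    """Return (annotation_id, 'under' | 'over', vertex_count) for out-of-band polygons."""
    flagged = []
    for ann in annotations:
        band = budget.get(ann["label"])
        if band is None:
            continue  # no budget defined for this class, so nothing to check
        low, high = band
        count = len(ann["vertices"])
        if count < low:
            flagged.append((ann["id"], "under", count))
        elif count > high:
            flagged.append((ann["id"], "over", count))
    return flagged


def dummy_vertices(n):
    """Placeholder coordinates; only the vertex count matters for this check."""
    return [(i, i) for i in range(n)]


sample = [
    {"id": 1, "label": "banana", "vertices": dummy_vertices(4)},
    {"id": 2, "label": "banana", "vertices": dummy_vertices(10)},
    {"id": 3, "label": "pedestrian", "vertices": dummy_vertices(60)},
]
flagged = check_vertex_budget(sample)
print(flagged)
```

The flagged list is what goes back to the re-trace queue; in-band polygons still need the usual accuracy checks, since a correct vertex count says nothing about where the vertices sit.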

Polygon vs Box vs Segmentation: The Honest Cost Comparison

Rough per-image cost ratios we see on production projects (varies with class density and complexity):

Bounding box — baseline (1x).
Tight bounding box with high IoU discipline — about 1.3x.
Polygon (12–20 vertex range) — typically 3–5x.
3D cuboid — 6–10x (3D, depth, orientation work).
Semantic / instance segmentation — 8–15x (pixel-level masks).

The exact ratios matter less than the pattern. Polygon is the right tool for a specific middle band, and drifting into polygon territory when boxes were fine, or into segmentation when polygons were enough, are the two easiest ways to overpay. Match the tool to the question.

Formats and Tools

COCO JSON is the dominant format: each polygon is stored as a flat list of alternating x and y coordinates in the annotation's segmentation field, indexed against image IDs and category labels. GeoJSON is the geospatial cousin, common for satellite and drone work. Pascal VOC XML with polygon extensions exists in older pipelines.
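For reference, here is a minimal sketch of turning an ordered vertex list into a COCO-style annotation record. The field names (`segmentation`, `bbox`, `area`, `iscrowd`) follow the COCO object-detection layout; the helper name and the sample IDs are illustrative, and the area is computed from the polygon geometry, which only approximates the mask pixel count that some tools store.

```python
import json


def to_coco_annotation(ann_id, image_id, category_id, vertices):
    """Build a COCO-style annotation dict from ordered (x, y) vertices."""
    # COCO stores each polygon as one flat list: [x1, y1, x2, y2, ...].
    flat = [float(coord) for point in vertices for coord in point]
    xs, ys = flat[0::2], flat[1::2]
    n = len(vertices)
    # Shoelace formula for the geometric area of a simple polygon.
    signed = sum(xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i] for i in range(n))
    x_min, y_min = min(xs), min(ys)
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        "segmentation": [flat],  # a list of polygons; one object may have several parts
        "bbox": [x_min, y_min, max(xs) - x_min, max(ys) - y_min],  # [x, y, width, height]
        "area": abs(signed) / 2.0,
        "iscrowd": 0,
    }


annotation = to_coco_annotation(
    ann_id=1, image_id=42, category_id=3,
    vertices=[(10, 20), (110, 20), (110, 70), (10, 70)],
)
print(json.dumps(annotation, indent=2))
```

The outer list in `segmentation` exists because an occluded object can be split into several disjoint polygons under a single annotation.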

On the tooling side, polygon-native annotation tools (CVAT, Label Studio, the commercial suites) all support the basic draw-and-edit workflow. The features that actually matter for production speed (magnetic edge snapping, AI-assisted boundary suggestion, vertex-budget warnings, and bulk class re-assignment) vary widely between tools. Pick on those, not on the polished demo.

Quality: IoU, mAP, and the Vertex-Distance Metric Most People Skip

Standard polygon QA tracks IoU against gold-standard polygons (threshold 0.7–0.8 for production work), per-class mAP on a held-out set, and inter-annotator agreement on the gold set. The metric worth adding that most vendors skip is mean vertex-distance error: the mean distance from each annotator vertex to the nearest gold vertex. IoU alone can mask sloppy tracing where the polygon happens to enclose roughly the right area but the boundary wanders. Vertex-distance catches that. The general framework lives in the annotation QA playbook.
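Both metrics can be approximated with the standard library alone. The sketch below is an assumption-laden illustration, not a production implementation: IoU is estimated by sampling pixel centers with a ray-casting point-in-polygon test (a geometry library would give exact areas), and vertex-distance is the mean distance from each annotator vertex to the nearest gold vertex, as described above. All function names are illustrative.

```python
import math


def point_in_polygon(x, y, vertices):
    """Ray-casting test: True if (x, y) lies inside the simple polygon."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def raster_iou(poly_a, poly_b):
    """Approximate IoU by testing the center of every pixel in the joint extent."""
    xs = [x for x, _ in poly_a + poly_b]
    ys = [y for _, y in poly_a + poly_b]
    inter = union = 0
    for py in range(math.floor(min(ys)), math.ceil(max(ys))):
        for px in range(math.floor(min(xs)), math.ceil(max(xs))):
            in_a = point_in_polygon(px + 0.5, py + 0.5, poly_a)
            in_b = point_in_polygon(px + 0.5, py + 0.5, poly_b)
            inter += in_a and in_b
            union += in_a or in_b
    return inter / union if union else 0.0


def mean_vertex_distance(annotated, gold):
    """Mean distance from each annotated vertex to its nearest gold vertex."""
    return sum(min(math.dist(p, g) for g in gold) for p in annotated) / len(annotated)


gold = [(0, 0), (10, 0), (10, 10), (0, 10)]
shifted = [(5, 0), (15, 0), (15, 10), (5, 10)]  # same square, moved 5 px right

iou = raster_iou(shifted, gold)
vertex_error = mean_vertex_distance(shifted, gold)
print(round(iou, 3), vertex_error)
```

One caveat on the design: nearest-vertex distance also penalizes an annotator who places a vertex exactly on the gold boundary but between two gold vertices. Measuring distance to the nearest gold edge instead is a reasonable variant if that matters for your spec.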


Where Polygons Get Used in Production

Retail and e-commerce — product detection on cluttered shelves, garment outlines on models, logo recognition. Boxes lose too much background to be useful.
Agriculture — fruit and crop outlines on drone imagery — see the agriculture annotation guide.
Aerial and geospatial — buildings, fields, infrastructure assets from drone or satellite imagery. GeoJSON polygons feed straight into GIS pipelines.
Manufacturing — irregular defects, component outlines for quality inspection.
Wildlife and conservation — animal outlines on camera-trap imagery, where boxes drag in too much background.
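For the aerial and geospatial case, a pixel-space polygon needs two adjustments before it is valid GeoJSON: coordinates must be mapped to longitude and latitude, and the ring must be explicitly closed (first position repeated at the end), with exterior rings wound counterclockwise as RFC 7946 recommends. The sketch below assumes a six-coefficient affine geotransform in the GDAL-style ordering that maps pixel (column, row) straight to lon/lat degrees; the coefficients and the `label` property are made-up example values.

```python
import json


def pixel_to_geo(col, row, gt):
    """Apply a six-coefficient affine geotransform to one pixel coordinate."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return [x, y]


def signed_area(ring):
    """Shoelace signed area: positive for counterclockwise rings (y axis up)."""
    n = len(ring)
    return sum(
        ring[i][0] * ring[(i + 1) % n][1] - ring[(i + 1) % n][0] * ring[i][1]
        for i in range(n)
    ) / 2.0


def polygon_to_geojson(vertices, geotransform, properties=None):
    """Convert ordered pixel vertices into a GeoJSON Polygon feature."""
    ring = [pixel_to_geo(col, row, geotransform) for col, row in vertices]
    if signed_area(ring) < 0:
        ring.reverse()  # make the exterior ring counterclockwise
    ring.append(list(ring[0]))  # GeoJSON rings repeat the first position at the end
    return {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": [ring]},
        "properties": properties or {},
    }


# Hypothetical north-up image: origin at lon -1.5, lat 52.0, 1e-5 degrees per pixel.
example_gt = (-1.5, 1e-5, 0.0, 52.0, 0.0, -1e-5)
field = [(0, 0), (100, 0), (100, 100), (0, 100)]  # pixel (col, row) vertices

feature = polygon_to_geojson(field, example_gt, {"label": "field"})
print(json.dumps(feature, indent=2))
```

Because image rows grow downward while latitude grows upward, a polygon traced in one direction on screen flips orientation after the transform, which is why the winding check runs on the converted coordinates rather than the pixel ones.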
