LiDAR is the sensor that gives autonomous vehicles and robots a true 3D view of the world: a spinning or solid-state laser fires millions of pulses a second and measures how long each takes to return, producing a point cloud — a dense set of 3D points with precise distances. That point cloud is gold for perception, but a raw point cloud teaches a model nothing. It has to be annotated first.
Point cloud annotation is one of the most technically demanding label types in AI, and one of the easiest to get quietly wrong. This guide covers what gets annotated, how single-frame and sequence (4D) labelling differ, the formats and tools in use in 2026, the quality metrics that matter, and what to look for when you're choosing a point cloud annotation company.
LiDAR Annotation vs Point Cloud Annotation
First, a terminology note, because the two terms get used interchangeably. Point cloud annotation is the general term for labelling any 3D point set. LiDAR annotation specifies that the points came from a LiDAR sensor, by far the most common source. Point clouds can also come from radar, depth cameras, or photogrammetry, so all LiDAR annotation is point cloud annotation, but not the reverse. Everything in this guide applies to both.
The Four Annotation Types on Point Clouds
- 3D cuboids: a 7-DOF bounding box per object — position, dimensions, and heading. The default for object detection and tracking. (We cover this in depth in the 3D cuboid annotation guide.)
- Semantic segmentation: every point assigned a class, such as road, sidewalk, vehicle, vegetation, building, or pole. It is the more granular cousin of 3D box annotation, used for driveable-surface and scene understanding.
- Instance segmentation: per-point class and object identity, so two adjacent cars are distinct instances, not one blob of “vehicle” points.
- 3D polylines: lane lines, road edges, and curbs traced in 3D for HD-map building and lane-keeping.
Most AV datasets use cuboids for dynamic objects and segmentation for static scene structure. Which you need is driven by the model: a detector wants cuboids; an occupancy or driveable-area network wants per-point segmentation.
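To make the 7-DOF cuboid concrete, here is a minimal Python sketch of one as a data structure. The class name, field names, and the yaw convention (counter-clockwise about the up axis, in the sensor frame) are assumptions for illustration; real formats each define their own.

```python
import math
from dataclasses import dataclass


@dataclass
class Cuboid:
    """7-DOF box: centre (x, y, z), size (length, width, height), and yaw."""
    x: float
    y: float
    z: float
    length: float
    width: float
    height: float
    yaw: float  # radians, counter-clockwise about +z (assumed convention)

    def corners(self):
        """Return the 8 corners as (x, y, z) tuples in the sensor frame."""
        c, s = math.cos(self.yaw), math.sin(self.yaw)
        out = []
        for dx in (self.length / 2, -self.length / 2):
            for dy in (self.width / 2, -self.width / 2):
                for dz in (self.height / 2, -self.height / 2):
                    # Rotate the local offset by yaw, then translate to the centre.
                    out.append((self.x + c * dx - s * dy,
                                self.y + s * dx + c * dy,
                                self.z + dz))
        return out

    def contains(self, px, py, pz):
        """True if a point lies inside the box (rotate the point into the box frame)."""
        c, s = math.cos(self.yaw), math.sin(self.yaw)
        rx, ry = px - self.x, py - self.y
        local_x = c * rx + s * ry
        local_y = -s * rx + c * ry
        return (abs(local_x) <= self.length / 2
                and abs(local_y) <= self.width / 2
                and abs(pz - self.z) <= self.height / 2)


# A car-sized box whose long axis points along +y.
car = Cuboid(10.0, 2.0, 0.8, 4.5, 1.8, 1.6, math.pi / 2)
points_inside = [p for p in [(10.0, 4.0, 0.8), (12.0, 2.0, 0.8)] if car.contains(*p)]
```

The `contains` check is also the bridge between the two main label types: the points that fall inside a cuboid are the ones an instance-segmentation label would assign to that object.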
Single-Frame vs Sequence (4D) Annotation
A single LiDAR sweep is one frame. But objects move, and perception runs on sequences — so most real datasets are 4D: 3D space plus time.
In 4D annotation, each object keeps a stable track ID across every frame, so the same pedestrian is “object 23” from frame 1 to frame 200, and boxes are interpolated between keyframes. This is where cost and difficulty concentrate: a single frame is straightforward; maintaining identity and geometry through 200 frames of a busy intersection, with objects appearing, occluding each other, and leaving, is the real work. It's also where cheap vendors cut corners: ID switches and drifting boxes between keyframes are the classic failure signature.
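The interpolation step can be sketched roughly as follows. This is a simplified illustration, not any particular tool's implementation: it assumes keyframes are plain dicts with hypothetical keys (`frame`, `track_id`, `center`, `size`, `yaw`), moves the centre linearly, holds the size fixed, and turns the heading along the shortest arc so a box crossing the ±π boundary does not spin the long way round.

```python
import math


def wrap_angle(a):
    """Wrap an angle in radians to the range [-pi, pi)."""
    return (a + math.pi) % (2 * math.pi) - math.pi


def interpolate_box(kf_a, kf_b, frame):
    """Interpolate one tracked box between two keyframes of the same track."""
    if kf_a["track_id"] != kf_b["track_id"]:
        raise ValueError("keyframes belong to different tracks")
    t = (frame - kf_a["frame"]) / (kf_b["frame"] - kf_a["frame"])
    center = tuple(a + t * (b - a) for a, b in zip(kf_a["center"], kf_b["center"]))
    # Shortest signed rotation from the first heading to the second.
    dyaw = wrap_angle(kf_b["yaw"] - kf_a["yaw"])
    return {
        "frame": frame,
        "track_id": kf_a["track_id"],   # identity is carried through unchanged
        "center": center,
        "size": kf_a["size"],           # assumed rigid: size held constant
        "yaw": wrap_angle(kf_a["yaw"] + t * dyaw),
    }


# Hypothetical keyframes for "object 23", ten frames apart.
kf_start = {"frame": 0, "track_id": 23, "center": (0.0, 0.0, 0.0),
            "size": (0.6, 0.6, 1.7), "yaw": 3.0}
kf_end = {"frame": 10, "track_id": 23, "center": (10.0, 0.0, 0.0),
          "size": (0.6, 0.6, 1.7), "yaw": -3.1}
mid = interpolate_box(kf_start, kf_end, 5)
```

Linear interpolation is only as good as the keyframe spacing: an object that brakes or turns between keyframes will drift off the straight line, which is why interpolated frames still need review.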
Formats: KITTI, nuScenes, Waymo — and LAS for Geospatial
- KITTI: the original AV benchmark, with per-frame labels giving 3D dimensions, location, and yaw. Labels are expressed in the camera frame rather than the LiDAR frame, which catches people out.
- nuScenes: relational JSON built for 360° multi-sensor scenes with proper cross-time tracking. The modern default for AV.
- Waymo Open Dataset: protocol buffers with 7-DOF labels and tracking IDs; the most rigorous and heaviest to tool.
- PCD / .bin: common containers for the raw point cloud itself.
- LAS / LAZ: the standard for geospatial and surveying point clouds — relevant if your work overlaps geospatial annotation.
Lock the target format before labelling. Yaw sign and coordinate frame differ between these standards, and naive conversion can flip every heading — a bug that's invisible until your model trains on it.
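To make the heading problem concrete, here is a minimal Python sketch that parses one KITTI-style label line and converts its `rotation_y` into a LiDAR-frame yaw. It assumes the idealised KITTI axis layout (camera: x right, y down, z forward; LiDAR: x forward, y left, z up) and ignores the small extra rotation in a real calibration file, so treat it as an illustration rather than a production converter. The function names and the sample label values are made up for the example.

```python
import math


def parse_kitti_label(line):
    """Parse the 3D fields of one KITTI object label line."""
    f = line.split()
    return {
        "type": f[0],
        # KITTI stores dimensions in the order height, width, length.
        "height": float(f[8]),
        "width": float(f[9]),
        "length": float(f[10]),
        # Location is given in camera coordinates, not LiDAR coordinates.
        "location_cam": (float(f[11]), float(f[12]), float(f[13])),
        "rotation_y": float(f[14]),
    }


def kitti_rotation_y_to_lidar_yaw(rotation_y):
    """Convert KITTI rotation_y (about the camera's y axis) to a yaw about LiDAR +z."""
    # Heading direction in the camera frame's x-z (ground) plane.
    hx_cam, hz_cam = math.cos(rotation_y), -math.sin(rotation_y)
    # Idealised axis mapping: LiDAR x = camera z, LiDAR y = -camera x.
    hx_lidar, hy_lidar = hz_cam, -hx_cam
    return math.atan2(hy_lidar, hx_lidar)


# Illustrative label line with made-up values.
sample = "Car 0.00 0 -1.58 587.01 173.33 614.12 200.12 1.65 1.67 3.64 -0.65 1.71 46.70 -1.59"
label = parse_kitti_label(sample)
yaw_lidar = kitti_rotation_y_to_lidar_yaw(label["rotation_y"])
```

Under these assumptions a car driving straight ahead has `rotation_y` near -π/2 but a LiDAR yaw near 0, so copying the number across unchanged would point most boxes in the wrong direction. Box positions need the calibration transform as well, which is omitted here.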
Tools and the Sensor-Fusion Advantage
Point cloud tooling lives or dies on a few features: ground-plane fitting, one-click cuboid snapping to clusters, brush-based per-point segmentation, and cross-frame interpolation. Open-source options like SUSTechPOINTS and CVAT's 3D mode cover the basics; commercial suites add scale and fusion.
The biggest quality lever is sensor fusion: showing the LiDAR point cloud and the synchronised camera image side by side. The point cloud gives measured geometry; the camera tells the annotator whether that cluster is a parked car or a bin. Annotating in fused views catches errors that neither sensor reveals alone, provided the calibration (extrinsics) is correct; otherwise annotators will “fix” good boxes to match a misaligned image.
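Under the hood, a fused view depends on projecting LiDAR points into the camera image. A minimal Python sketch of that projection, assuming a simple pinhole camera with no lens distortion, is shown here. The function name and the example matrices are hypothetical; in practice the extrinsic and intrinsic matrices come from your rig's calibration.

```python
def project_to_image(points_lidar, extrinsic, intrinsic):
    """Project LiDAR points to pixel coordinates.

    extrinsic: 4x4 nested list, LiDAR frame -> camera frame.
    intrinsic: 3x3 nested list (pinhole camera matrix K).
    Returns one (u, v) tuple per point, or None for points behind the camera.
    """
    pixels = []
    for x, y, z in points_lidar:
        p = (x, y, z, 1.0)
        # Rigid transform into the camera frame (first three rows of the 4x4).
        cam = [sum(extrinsic[r][k] * p[k] for k in range(4)) for r in range(3)]
        if cam[2] <= 0:
            pixels.append(None)  # behind the image plane, not visible
            continue
        # Pinhole projection: multiply by K, then divide by depth.
        uvw = [sum(intrinsic[r][k] * cam[k] for k in range(3)) for r in range(3)]
        pixels.append((uvw[0] / uvw[2], uvw[1] / uvw[2]))
    return pixels


# Hypothetical calibration: LiDAR (x forward, y left, z up) to
# camera (x right, y down, z forward), with both sensors at the same origin.
EXTRINSIC = [[0, -1, 0, 0],
             [0, 0, -1, 0],
             [1, 0, 0, 0],
             [0, 0, 0, 1]]
INTRINSIC = [[700, 0, 640],
             [0, 700, 360],
             [0, 0, 1]]

pixels = project_to_image([(10.0, 0.0, 0.0), (10.0, -1.0, 0.0), (-5.0, 0.0, 0.0)],
                          EXTRINSIC, INTRINSIC)
```

Any error in the extrinsic matrix shifts every projected point, which is how a bad calibration makes correct boxes look wrong against the image.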
Quality Metrics That Matter
- 3D IoU for cuboids: volumetric overlap with gold-standard boxes, checked at 0.5 / 0.7 thresholds.
- mAP per class, computed separately because cars, pedestrians, and cyclists behave very differently.
- Orientation accuracy (AOS in KITTI, AOE in nuScenes): the heading metric amateurs fail.
- Per-point accuracy / mIoU for segmentation tasks.
- Track continuity — ID-switch rate across sequences, the 4D-specific metric.
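Three of these checks are simple enough to sketch in a few lines of Python. The versions below are simplified illustrations, not the official benchmark implementations: the orientation error is a per-box yaw difference in the spirit of AOE, mIoU is computed over flat lists of per-point class IDs, and ID switches are counted from an assumed per-frame mapping of gold track ID to labelled track ID. All function names are hypothetical.

```python
import math


def orientation_error(yaw_label, yaw_gold):
    """Smallest absolute yaw difference in radians, in the range [0, pi]."""
    d = (yaw_label - yaw_gold + math.pi) % (2 * math.pi) - math.pi
    return abs(d)


def mean_iou(labels, gold, num_classes):
    """Mean per-class IoU over per-point class IDs, skipping absent classes."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for l, g in zip(labels, gold) if l == c and g == c)
        union = sum(1 for l, g in zip(labels, gold) if l == c or g == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


def count_id_switches(matches_per_frame):
    """Count how often a gold track's matched label ID changes between frames.

    matches_per_frame: one dict per frame, gold_track_id -> labelled_track_id.
    """
    last_seen = {}
    switches = 0
    for frame in matches_per_frame:
        for gold_id, label_id in frame.items():
            if gold_id in last_seen and last_seen[gold_id] != label_id:
                switches += 1
            last_seen[gold_id] = label_id
    return switches


# Toy inputs: headings either side of the +/-pi boundary, four labelled
# points, and one track whose ID changes once.
err = orientation_error(3.1, -3.1)
miou = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
switches = count_id_switches([{1: 23}, {1: 23}, {1: 41}, {1: 41}])
```

Note that the orientation check wraps the angle difference: headings of 3.1 and -3.1 radians are nearly identical, and a naive subtraction would report them as almost opposite.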
How to Choose a Point Cloud Annotation Company
If you're evaluating vendors for 3D LiDAR work, judge them on more than a per-object rate:
- 3D-specific tooling — cluster snapping, ground-plane fitting, and interpolation, not a 2D tool bolted onto point clouds.
- Your target format — proven KITTI / nuScenes / Waymo / LAS delivery, with correct coordinate and yaw conventions.
- Sensor fusion — if you have camera data, they should use it.
- Transparent QA — per-batch 3D IoU, orientation, and track-continuity reporting, plus a real inter-annotator-agreement process.
- A pilot on your hardest data — night, rain, dense traffic. A firm quote sight-unseen is a guess.
Our broader checklist on vendor selection — pricing models, governance, and red flags — is in how to choose a data annotation company.
Where Point Cloud Annotation Gets Used
- Autonomous vehicles & ADAS: the dominant use case — detection, segmentation, and tracking for self-driving perception.
- Robotics: navigation, obstacle avoidance, and manipulation for warehouse and service robots.
- Geospatial & surveying: aerial and terrestrial LiDAR for terrain, infrastructure, and vegetation mapping.
- Smart cities & traffic: fixed-rig intersection monitoring and flow analysis.
- Construction & mining: site modelling, volume calculation, and equipment safety zones.
Neel Bennett
AI Annotation Specialist at AI Taggers
Neel has over 8 years of experience in AI training data and machine learning operations. He specializes in helping enterprises build high-quality datasets for computer vision and NLP applications across healthcare, automotive, and retail industries.