Category 01 / Annotation & Labeling
High-volume, enterprise-grade annotation across three tracks — computer vision, NLP & text, and LiDAR & 3D. Trained operators, written accuracy commitments, the same QA discipline on every batch.
Operating model
The three tracks
Each track runs with its own operator pool, schema conventions, and reviewer seniority — sharing the same QA discipline and written accuracy commitments.
Computer vision
Bounding boxes, polygons, keypoints, semantic and instance segmentation, video tracking. The highest-volume track — autonomy, retail analytics, medical imaging, agri-tech.
NLP & text
Entity tagging, span annotation, intent & sentiment, reranking, policy moderation, multi-turn conversation labels. Domain-trained reviewers for legal, medical, and financial text.
LiDAR & 3D
Point-cloud cuboids, sensor fusion with camera, 3D semantic segmentation, object tracking across frames. Reviewers trained on autonomous-driving and robotics schemas.
Techniques, end to end
The full technique surface across the three tracks. Every technique runs under the same review pipeline and the same written accuracy threshold.
Tight 2D boxes with class labels. Occlusion and truncation flags as part of the schema (a sketch follows this list).
Instance-level outlines — pixel-accurate boundaries for irregular shapes.
Per-pixel class masks. Drivable area, vegetation, person, vehicle, infrastructure.
Skeletal landmarks for body, face, hand. Fixed topologies or client-defined skeletons.
Multi-object ID persistence across frames, re-entry handling, interpolation with keyframes.
2D-projected 3D boxes for orientation-aware detection when LiDAR is unavailable.
Span-level entity tagging — domain ontologies including legal, medical, financial.
Utterance-level classification with multi-label support and schema-defined confidence bands.
Policy-trained labellers for toxicity, harm categories, platform-specific content rules.
Oriented 3D boxes in LiDAR space with class, heading, and size.
LiDAR + camera co-registered labels. Consistent identity across sensor modalities.
Per-point semantic labels. Ground, drivable, static, dynamic classes.
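For illustration only: a minimal sketch of what one box label with the occlusion and truncation flags above might serialize to. The field names, the pixel-coordinate convention, and the optional track ID are assumptions; in practice the schema follows the client's convention guide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BoxAnnotation:
    """One 2D box label. Fields are illustrative; the real schema is client-defined."""
    frame_id: str
    class_label: str                # e.g. "vehicle", "person"
    x: float                        # top-left corner, in pixels
    y: float
    width: float
    height: float
    occluded: bool = False          # partially hidden behind another object
    truncated: bool = False         # cut off at the image boundary
    track_id: Optional[int] = None  # persistent identity when used for video tracking


ann = BoxAnnotation(
    frame_id="frame_000231", class_label="vehicle",
    x=412.0, y=188.5, width=96.0, height=54.0,
    occluded=True, truncated=False, track_id=7,
)
```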
Tooling & pipeline fit
The operation is tool-agnostic. We work inside the platform your data team already uses — or inside a secured, air-gapped environment when required. No pressure to migrate to something proprietary.
CVAT, Label Studio, V7, Labelbox, Encord, Scale, SuperAnnotate, Roboflow, or internal tools. We train to the interface you already run.
For regulated data — medical, defense, financial — operators work inside a secured environment with audited access.
We adopt your edge-case book and convention guide as the source of truth. No inventive re-interpretation mid-batch.
Batch manifests, per-operator attribution, IAA reports, and exception logs travel with the data on every handoff.
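The manifest format itself isn't fixed here; as a minimal sketch, one per-item record carrying operator attribution, review status, and exception logging could look like the following. Every field name is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ManifestRecord:
    """One row of a hypothetical batch manifest that travels with a handoff."""
    item_id: str                  # asset identifier in the client's tooling
    batch_id: str                 # batch the item was labelled in
    operator_id: str              # per-operator attribution
    reviewer_id: str              # second-pass reviewer
    schema_version: str           # versioned convention guide the labels follow
    iaa_score: Optional[float] = None                     # agreement score if double-labelled
    exceptions: List[str] = field(default_factory=list)   # logged edge cases and escalations


record = ManifestRecord(
    item_id="frame_000231",
    batch_id="2025-W18-03",
    operator_id="op_117",
    reviewer_id="rev_004",
    schema_version="v3.2",
    iaa_score=0.94,
    exceptions=["object truncated at frame edge"],
)
```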
Engagement process
The sequence we’ve run on every annotation engagement. Pilot to steady-state in weeks, not quarters.
Workshop your schema, edge cases, and target accuracy threshold. Confirm tooling and security envelope.
Co-draft a gold-reference set with your data lead. Becomes the calibration and arbitration benchmark.
Calibrated operators run a pilot. IAA measured; schema tightened with you before scaling.
Scale headcount against agreed throughput. The same review pipeline as pilot — not a compressed version.
Weekly IAA, exception, and drift reports. Schema amendments versioned. Rework on us if we miss the agreed accuracy threshold.
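The reports above don't prescribe a specific agreement metric; Cohen's kappa is one common pairwise choice. A minimal sketch, assuming two operators label the same items with categorical classes; the function name and the example labels are illustrative, not the reporting pipeline itself.

```python
from collections import Counter
from typing import List


def cohens_kappa(labels_a: List[str], labels_b: List[str]) -> float:
    """Pairwise inter-annotator agreement between two operators on the same items."""
    assert len(labels_a) == len(labels_b) and labels_a, "operators must label the same items"
    n = len(labels_a)
    # Observed agreement: fraction of items where both operators chose the same class.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each operator's own class frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum((freq_a[c] / n) * (freq_b[c] / n) for c in set(labels_a) | set(labels_b))
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Example weekly input: per-item class labels from two operators on one batch.
ops_a = ["vehicle", "person", "vehicle", "vegetation", "vehicle"]
ops_b = ["vehicle", "person", "vegetation", "vegetation", "vehicle"]
print(f"IAA (Cohen's kappa): {cohens_kappa(ops_a, ops_b):.2f}")
```

Kappa discounts the agreement two operators would reach by chance, which is why it is generally preferred over raw percent agreement when class frequencies are imbalanced.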
What the record looks like
Operator teams across the three tracks — specialised by data type, not rotated mid-engagement.
Production throughput for our longest-running annotation engagement. Multi-year, multi-pipeline.
Inter-annotator agreement averaged across production accounts. Reported weekly.
Computer vision, NLP, and multimodal. Repeat engagements — not one-off pilots.
Scope with us
For any of the three tracks, we scope operators, schema work, and a written accuracy threshold together — and run a pilot before you commit to steady-state volume.