Category 01 / Annotation & Labeling

The labelled data your model actually trusts.

High-volume, enterprise-grade annotation across three tracks — computer vision, NLP & text, and LiDAR & 3D. Trained operators, written accuracy commitments, the same QA discipline on every batch.

ISO 27001 Certified · Cert No. 452AGI102121
Operators: 500+ (trained & schema-calibrated)
Tracks: three (CV · NLP · LiDAR / 3D)
Tooling: yours (we fit your stack, not ours)
Scale: 2M+ records on one account

Operating model

Annotation isn’t volume. It’s judgment at volume.

01 / Specialisation
Operators specialise by track — not swapped across CV and NLP mid-batch
02 / Calibration
Every operator passes a gold-set before production access
03 / Convention
Your schema, your taxonomy, your edge-case book — we adopt, not invent
04 / QA inline
Multi-tier review runs continuously, not at batch end

The three tracks

One operation. Three specialisations.

Each track runs with its own operator pool, schema conventions, and reviewer seniority — sharing the same QA discipline and written accuracy commitments.

Techniques, end to end

What we actually deliver.

The full technique surface across the three tracks. Every technique runs under the same review pipeline and the same written accuracy threshold.

CV / 01

Bounding box

Tight 2D boxes with class labels. Occlusion and truncation flags as part of the schema.

Use · detection, ADAS
CV / 02

Polygonal

Instance-level outlines — pixel-accurate boundaries for irregular shapes.

Use · retail, agri, medical
CV / 03

Semantic segmentation

Per-pixel class masks. Drivable area, vegetation, person, vehicle, infrastructure.

Use · AV, scene understanding
CV / 04

Keypoint & pose

Skeletal landmarks for body, face, hand. Fixed topologies or client-defined skeletons.

Use · fitness, AR, robotics
CV / 05

Video tracking

Multi-object ID persistence across frames, re-entry handling, interpolation with keyframes.

Use · surveillance, sports, AV
CV / 06

3D cuboid on 2D

2D-projected 3D boxes for orientation-aware detection when LiDAR is unavailable.

Use · mono-vision AV
NLP / 01

Named entity (NER)

Span-level entity tagging — domain ontologies including legal, medical, financial.

Use · search, extraction, RAG
NLP / 02

Intent & sentiment

Utterance-level classification with multi-label support and schema-defined confidence bands.

Use · CX, agent routing
NLP / 03

Content moderation

Policy-trained labellers for toxicity, harm categories, platform-specific content rules.

Use · trust & safety, LLM RLHF
3D / 01

Point-cloud cuboid

Oriented 3D boxes in LiDAR space with class, heading, and size.

Use · AV, robotics
3D / 02

Sensor fusion

LiDAR + camera co-registered labels. Consistent identity across sensor modalities.

Use · AV stacks
3D / 03

3D segmentation

Per-point semantic labels. Ground, drivable, static, dynamic classes.

Use · perception, mapping

Tooling & pipeline fit

We fit your stack. You keep your tooling, your schema, and your data.

The operation is tool-agnostic. We work inside the platform your data team already uses — or inside a secured, air-gapped environment when required. No pressure to migrate to something proprietary.

01 / Platforms

Your labelling tool of choice

CVAT, Label Studio, V7, Labelbox, Encord, Scale, SuperAnnotate, Roboflow, or internal tools. We train to the interface you already run.

02 / Environment

On-prem or air-gapped options

For regulated data — medical, defense, financial — operators work inside a secured environment with audited access.

03 / Data flow

Your schema, your taxonomy

We adopt your edge-case book and convention guide as the source of truth. No inventive re-interpretation mid-batch.

04 / Handover

Structured delivery, audit-ready

Batch manifests, per-operator attribution, IAA reports, and exception logs travel with the data on every handoff.
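To make the handover concrete, here is a minimal sketch of what such a delivery package could look like. The field names, IDs, and values below are invented for illustration, not the actual delivery format:

```python
import json

# Hypothetical shape of a per-batch handover manifest. Every field name
# and value here is illustrative; real engagements follow the client's
# agreed delivery schema.
manifest = {
    "batch_id": "cv-polygons-0042",
    "schema_version": "1.3.0",
    "records": 5000,
    "operators": [  # per-operator attribution
        {"id": "op-117", "records": 2600, "gold_accuracy": 0.97},
        {"id": "op-214", "records": 2400, "gold_accuracy": 0.96},
    ],
    "iaa": 0.94,  # inter-annotator agreement for the batch
    "exceptions": [  # exception log entries travel with the data
        {
            "record": "img_003512",
            "reason": "ambiguous occlusion",
            "resolution": "gold-set arbitration",
        },
    ],
}

print(json.dumps(manifest, indent=2))
```

Because the manifest rides alongside each batch, an auditor can trace any record back to the operator who labelled it and the review decisions applied to it.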

Engagement process

From pilot to production, measured at every step.

The sequence we’ve run on every annotation engagement. Pilot to steady-state in weeks, not quarters.

Step 01

Scope & schema

Workshop your schema, edge cases, and target accuracy threshold. Confirm tooling and security envelope.

Step 02

Gold-set build

Co-draft a gold-reference set with your data lead. It becomes the calibration and arbitration benchmark.
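The calibration gate in Step 02 can be sketched as a simple accuracy check against the gold set. The threshold, record IDs, and labels below are hypothetical; real engagements use the accuracy floor agreed in scoping:

```python
# Hypothetical gold-reference set: record ID -> agreed-correct label.
GOLD = {"r1": "person", "r2": "vehicle", "r3": "vegetation", "r4": "person"}

# Illustrative accuracy floor for production access; the real threshold
# is set per engagement.
THRESHOLD = 0.95


def calibration_score(operator_labels, gold_labels):
    """Fraction of gold-set records the operator labelled correctly."""
    hits = sum(op == gold for op, gold in zip(operator_labels, gold_labels))
    return hits / len(gold_labels)


def passes_gold_set(submission, gold=GOLD, threshold=THRESHOLD):
    """Grant production access only if the operator clears the floor."""
    score = calibration_score([submission[r] for r in gold], list(gold.values()))
    return score >= threshold


# A perfect submission passes; one wrong label out of four (0.75) does not.
print(passes_gold_set({"r1": "person", "r2": "vehicle",
                       "r3": "vegetation", "r4": "person"}))
```

The same gold set later serves as the arbitration benchmark: disputed labels are resolved by comparison against it rather than by ad-hoc judgment.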

Step 03

Pilot batch

Calibrated operators run a pilot. IAA measured; schema tightened with you before scaling.

Step 04

Production ramp

Scale headcount against agreed throughput. The same review pipeline as pilot — not a compressed version.

Step 05

Steady state

Weekly IAA, exception, and drift reports. Schema amendments versioned. Rework on us if we miss the floor.

What the record looks like

Built for scale. Measured at every batch.

500+
Trained operators

Across the three tracks — specialised by data type, not rotated mid-engagement.

2M+
Records on one account

Production throughput for our longest-running annotation engagement. Multi-year, multi-pipeline.

0.94
Average IAA

Inter-annotator agreement averaged across production accounts. Reported weekly.
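Inter-annotator agreement can be measured several ways; a minimal sketch using pairwise Cohen's kappa, which corrects raw agreement for chance. The annotators, utterances, and intent classes below are invented for illustration:

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators over the
    same records: kappa = (observed - expected) / (1 - expected)."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of records where both annotators match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected (chance) agreement: probability both independently pick
    # the same class, from each annotator's class frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Two annotators labelling the same six utterances with intent classes.
a = ["billing", "billing", "cancel", "cancel", "other", "billing"]
b = ["billing", "billing", "cancel", "other", "other", "billing"]
print(round(cohens_kappa(a, b), 2))  # → 0.74
```

Raw agreement here is 5/6, but kappa discounts the matches the two annotators would have produced by chance, which is why a chance-corrected score is the more honest weekly report.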

6
Production AI clients

Computer vision, NLP, and multimodal. Repeat engagements — not one-off pilots.

ISO 27001 Certified / Cert No. 452AGI102121 / 500+ operators · Nairobi

Scope with us

Send us a sample batch. We’ll return a pilot plan.

For any of the three tracks, we scope operators, schema work, and a written accuracy threshold together — and run a pilot before you commit to steady-state volume.

Not sure which track? Start with all three.