Track 02 / NLP & Text

Judgment-at-volume for language data.

Entities, spans, intent, sentiment, moderation, preference labels. Domain-trained reviewers for legal, medical, and financial text. Classical NLP and LLM-era workflows under the same discipline.

ISO 27001 Certified · Cert No. 452AGI102121
Sample: NER span tagging — named entity recognition over a news sentence, with coloured spans for ORG, LOC, DATE, PER, and WEAPON labels.
Techniques
Six: NER, intent, span, moderation, preference, rerank
Languages
Multi: English plus African & European partner languages
Domains
Four: legal, medical, financial, general
LLM workflows
Supported: RLHF preference, red-team, rerank

Where we’ve delivered

Reviewers who know the domain.

A legal span annotator is not labelling medical notes. Domain-trained reviewer pools keep judgment anchored.

Domain 01

Legal

Contract clause tagging, case-citation extraction, obligation & risk span labelling. Reviewer pool with legal training.

NER · span · relation
Domain 02

Medical

Clinical notes, drug mentions, condition / procedure tagging under reviewer oversight. PHI-safe environment.

NER · relation · intent
Domain 03

Financial

Earnings transcripts, instrument extraction, regulatory filings. Multi-language support for EMEA & African exchanges.

NER · sentiment · rerank
Domain 04

General / LLM

Preference labels, red-team prompts, instruction quality, content policy. For foundation-model labs and LLM-app teams.

Preference · moderation · rerank

Schema specifics

What lives in an NLP schema we adopt.

Text schemas fail in predictable places. These are the surfaces most often worth tightening before a pilot runs clean.

01 / Nested spans

Overlap conventions.

Rules for when entities can nest (PERSON inside ORG) vs. when they must be flat.
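A convention like this is cheap to make machine-checkable. A minimal sketch — the entity pair, field names, and whitelist are illustrative, not a production schema:

```python
from dataclasses import dataclass

# Which entity types may nest inside which; pairs not listed must stay flat.
# Illustrative whitelist: a PER span inside "Goldman Sachs" (ORG).
ALLOWED_NESTING = {("PER", "ORG")}

@dataclass(frozen=True)
class Span:
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive
    label: str

def nesting_ok(inner: Span, outer: Span) -> bool:
    """True if inner sits inside outer and the label pair is whitelisted."""
    contained = outer.start <= inner.start and inner.end <= outer.end
    return contained and (inner.label, outer.label) in ALLOWED_NESTING

def overlap_violations(spans: list[Span]) -> list[tuple[Span, Span]]:
    """Every overlapping pair that breaks the convention: neither flat
    nor a whitelisted nesting."""
    bad = []
    for i, a in enumerate(spans):
        for b in spans[i + 1:]:
            overlaps = a.start < b.end and b.start < a.end
            if overlaps and not (nesting_ok(a, b) or nesting_ok(b, a)):
                bad.append((a, b))
    return bad
```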

02 / Tokenisation

Where the boundary falls.

Punctuation, possessives, hyphenation. Without one consistent boundary rule, drift is inevitable.
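One way to pin the rule down: encode it once and derive every boundary from it. A sketch, with an assumed regex convention rather than a production tokeniser:

```python
import re

# One explicit rule, applied everywhere: possessive 's and hyphen
# boundaries split into their own tokens. Illustrative only.
TOKEN_RE = re.compile(r"'s\b|\w+|[^\w\s]")

def tokenise(text: str) -> list[tuple[int, int, str]]:
    """Return (start, end, token) triples under one fixed boundary rule."""
    return [(m.start(), m.end(), m.group()) for m in TOKEN_RE.finditer(text)]

print(tokenise("Smith's state-of-the-art filing"))
# [(0, 5, 'Smith'), (5, 7, "'s"), (8, 13, 'state'), (13, 14, '-'), ...]
```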

03 / Multi-label

Priority tiebreaks.

When multiple classes apply, which wins for single-label downstream tasks.
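The tiebreak can be as small as an ordered list. A sketch, with an assumed moderation taxonomy:

```python
# Assumed taxonomy; the ordered list is the whole tiebreak policy.
PRIORITY = ["THREAT", "HARASSMENT", "SPAM", "OFF_TOPIC"]

def collapse(labels: set[str]) -> str:
    """When several classes apply, the highest-ranked one wins for
    single-label downstream tasks."""
    for label in PRIORITY:
        if label in labels:
            return label
    raise ValueError(f"no known label in {labels!r}")

assert collapse({"SPAM", "HARASSMENT"}) == "HARASSMENT"
```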

04 / Abstain

"Don’t know" handled.

Explicit abstain class rather than forcing a decision — carries more signal.
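What that looks like in practice, sketched with assumed label names; a no-majority vote resolves to abstain instead of a coin flip:

```python
from collections import Counter
from enum import Enum

class Sentiment(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"
    ABSTAIN = "abstain"  # "don't know" is a first-class answer, not a forced guess

def resolve(votes: list[Sentiment]) -> Sentiment:
    """Majority vote across reviewers; no majority resolves to ABSTAIN,
    so uncertainty survives into the dataset instead of becoming noise."""
    top, count = Counter(votes).most_common(1)[0]
    return top if count > len(votes) / 2 else Sentiment.ABSTAIN
```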

05 / Policy severity

Dual-axis confidence.

Severity and reviewer confidence as separate axes for moderation workflows.
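A sketch of the two axes kept separate, with an illustrative escalation rule that a single fused score could not express:

```python
from dataclasses import dataclass

@dataclass
class ModerationLabel:
    policy: str      # assumed policy names, e.g. "violence", "self-harm"
    severity: int    # 0-3: how bad the content is, per the policy document
    confidence: int  # 0-3: how sure the reviewer is, scored independently

def needs_escalation(label: ModerationLabel) -> bool:
    """High severity always escalates; so does a severe-but-unsure call,
    which a fused score would hide. Thresholds are illustrative."""
    return label.severity == 3 or (label.severity >= 2 and label.confidence <= 1)
```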

06 / Rubric ties

Rationale vocabulary.

Constrained rationale codes so preference-label rationales cluster cleanly for analysis.
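Sketched below with invented codes; the point is that rationales are validated against the vocabulary, never free text:

```python
# Constrained rationale vocabulary; codes and wording are illustrative.
RATIONALE_CODES = {
    "R1": "more factually accurate",
    "R2": "better instruction-following",
    "R3": "safer / policy-compliant",
    "R4": "clearer or better organised",
}

def record_preference(chosen: str, rejected: str, codes: list[str]) -> dict:
    """Store a pairwise preference with rubric codes so rationales
    cluster cleanly for reward-model analysis."""
    unknown = [c for c in codes if c not in RATIONALE_CODES]
    if unknown:
        raise ValueError(f"codes outside the rubric: {unknown}")
    return {"chosen": chosen, "rejected": rejected, "rationale_codes": codes}
```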

Questions we get

NLP annotation, answered plainly.

Q 01

Do you support languages beyond English?

Yes — English plus selected African and European partner languages. The operator pool varies by language; we scope language coverage at kick-off.

Q 02

Can you do RLHF / preference labelling for foundation models?

Yes — pairwise and k-way preference, red-team prompts, instruction quality. Rubric-tied rationales rather than free-text so labels cluster for reward-model training.

Q 03

How are legal and medical reviewers qualified?

Domain-trained reviewer pools with specific qualifications per engagement — verified at kick-off. Medical work runs in a PHI-safe environment under supervision.

Q 04

What tooling do you use?

Yours. Label Studio, Prodigy, Brat, V7 text, Labelbox, Scale, or internal tools. We train to the platform you already run.

Q 05

Do you handle content moderation at platform scale?

Yes — policy-document calibration, dual-axis severity/confidence, explicit escalation path for high-severity calls. Scope is set per engagement.

Other annotation tracks

Text is one modality. Your dataset may be many.

NLP & text sits alongside two other tracks under the same QA discipline and the same operating model.

Scope with us

Send us a text sample. We’ll return a pilot plan.

We scope operators, rubric work, and a written agreement threshold together — and run a pilot against a co-drafted rubric before you commit to steady-state volume.