4TB. That's how much voice data was stolen from Mercor, a hiring platform used by AI labs to recruit contractors for model training work. Roughly 40,000 people who recorded their voices as part of paid AI jobs had that audio taken in the breach.
Mercor connects companies building AI systems with the human workers those systems need - data labelers, prompt writers, voice sample recorders, the people who do the manual work that makes AI models function. The platform claims a base of over 300,000 registered workers globally.
What 4TB of Voice Data Actually Means
At roughly 1MB per minute (about what 128 kbps compressed audio takes up), 4TB is roughly 66,000 hours of audio - about 1.7 hours per affected contractor on average. That is far more than a quick test clip per person. It is enough audio to capture pitch, cadence, accent, and speech patterns in detail.
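The arithmetic behind that estimate can be checked with a short sketch. It assumes decimal units (1TB = 10^12 bytes), the article's figure of about 1MB per minute, and an even split across all 40,000 contractors. The real encoding and the distribution per person have not been disclosed, and the function name here is illustrative only.

```python
# Back-of-envelope estimate. The inputs are the article's figures;
# the even split across contractors is an assumption, not a reported fact.
TOTAL_BYTES = 4 * 10**12       # 4TB, decimal units
BYTES_PER_MINUTE = 1 * 10**6   # ~1MB per minute of audio (assumed rate)
AFFECTED_CONTRACTORS = 40_000


def audio_estimate(total_bytes, bytes_per_minute, people):
    """Return (total hours of audio, average hours per person)."""
    total_minutes = total_bytes / bytes_per_minute
    total_hours = total_minutes / 60
    return total_hours, total_hours / people


total_hours, per_person = audio_estimate(
    TOTAL_BYTES, BYTES_PER_MINUTE, AFFECTED_CONTRACTORS
)
print(f"{total_hours:,.0f} hours total, {per_person:.2f} hours per contractor")
# prints "66,667 hours total, 1.67 hours per contractor"
```

The per-person figure scales inversely with the assumed bitrate: if the recordings were stored at a higher quality than 1MB per minute, the same 4TB would represent fewer hours of audio.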
Voice data is biometric. Unlike a compromised password, you cannot change your voice. Current AI voice cloning tools can produce convincing synthetic audio from as little as a few seconds of source material. With hours of audio per person, anyone with access to these stolen recordings has what they need to impersonate 40,000 individuals in phone calls, audio messages, or automated scam operations.
The FTC has flagged AI voice cloning as a growing fraud vector. Banks, insurers, and call centers still use voice authentication as an identity check. The 40,000 affected contractors now carry a vulnerability that cannot be reset.
Who's Exposed
These aren't employees at AI companies with enterprise security teams. They're gig workers and freelancers who signed up to earn money doing voice recording tasks - often through platforms where providing audio is simply part of the application process. They had no particular reason to scrutinize the security posture of the platform storing their recordings.
According to a breach report published by Oravys, the stolen data includes contractor profiles alongside voice samples, meaning recordings can be tied to real identities.
Mercor has not published a detailed post-mortem as of this writing. How 4TB of data left the platform, whether the exfiltration happened gradually or in a single event, and whether affected workers have received direct notification remain unanswered.
The Contractor Data Gap
The AI training economy runs on contractors handling sensitive tasks at scale: transcription, image labeling, voice recording, preference ranking. This workforce generates the data that makes AI systems work, but it's rarely afforded the data protections of full-time employment. This breach is a direct consequence of that gap.
Contractors who recorded voice samples through Mercor should disable voice-based account recovery on banking and other sensitive accounts where possible. Voice authentication is not a safe second factor for anyone whose audio has been compromised in a breach of this scale.