AI Safety Expert - Red Team
$20–$22 per hour
Role Responsibilities
• Red team conversational AI models and agents to identify jailbreaks, prompt injections, misuse cases, and bias exploitation.
• Generate high-quality human data by annotating failures, classifying vulnerabilities, and flagging systemic risks.
• Apply structure by following taxonomies, benchmarks, and playbooks to maintain consistent testing.
• Document reproducibly by producing reports, datasets, and attack cases that customers can act on.
• Review AI outputs on sensitive topics like bias, misinformation, or harmful behaviors, with optional participation in higher-sensitivity projects.
Qualifications
Must-Have
• Fluent Language Skills Required: English & Assamese
• Prior red teaming experience in AI adversarial work, cybersecurity, or socio-technical probing.
• Ability to push systems to breaking points with a curious and adversarial mindset.
• Structured approach using frameworks or benchmarks.
• Strong communication skills to explain risks to technical and non-technical stakeholders.
• Adaptability to move across projects and customers.
Preferred
• Experience with adversarial ML, cybersecurity, and socio-technical risk.
• A background in psychology, acting, or writing that supports creative probing and unconventional adversarial thinking.
Application Process (Takes 20–30 mins to complete)
• Upload resume
• AI interview based on your resume
• Submit form
Resources & Support
• For details about the interview process and platform information, please check: https://talent.docs.mercor.com/welcome
• For any help or support, reach out to: support@mercor.com
PS: Our team reviews applications daily. Please complete your AI interview and application steps to be considered for this opportunity.