Job Description

  • Language: en_NZ (the resource must be from New Zealand, with expert-level English)
  • The task scope has two parts:
  • Creating adversarial prompts to test the LLM
  • Adversarial Prompt Engineering
  • Design targeted attack prompts across risk categories: harmful content, bias, misinformation, privacy violations, policy circumvention, etc.
  • Apply diverse attack strategies, including role-playing, multi-turn manipulation, chain-of-thought techniques, and more.
  • Conduct systematic adversarial attacks to identify security vulnerabilities, safety gaps, and operational risks in GenAI systems
  • Execute multi-vector attack scenarios including prompt injection, jailbreaking, context manipulation, and edge case exploitation
  • Native linguists need to review system responses and verify fixes.
  • Review system responses to identify harmful content, safety violations, and policy breaches
  • Classify...

Apply for this Position

Ready to join OneForma? Submit your application.