Mindrift

Evaluation Scenario Writer – AI Agent Testing Specialist

Job Information

Location

Saudi Arabia

Career Level

Mid-Senior Level

Employee Type

Part-time

Job Category

AI/Machine Learning

Experience

2-5 years

Salary

$41/hour

Job Description

Evaluation Scenario Writer at Mindrift - AI Agent Testing

Mindrift is looking for an Evaluation Scenario Writer to join our team as an AI Agent Testing Specialist. In this role, you’ll design realistic, structured evaluation scenarios for LLM-based agents and help shape how AI is built ethically. If you’re passionate about AI and have a strong analytical mindset, this is an excellent opportunity to put your skills to use.

Crafting Effective AI Agent Testing Scenarios

As an Evaluation Scenario Writer, your primary responsibility will be creating test cases that simulate human-performed tasks. You’ll define gold-standard behavior and make sure each scenario is clearly defined, soundly scored, and easy to execute and reuse. You will need a sharp analytical mindset, attention to detail, and an interest in how AI agents make decisions.

Key Responsibilities:

  • Designing structured test scenarios based on real-world tasks for AI Agent Testing.
  • Defining the golden path and acceptable agent behavior.
  • Annotating task steps, expected outputs, and edge cases.
  • Working with developers to test your scenarios and improve their clarity.
  • Reviewing agent outputs and adapting tests accordingly.

Ensuring Quality in AI Agent Testing

Your expertise as an Evaluation Scenario Writer will ensure the quality and reliability of AI agents. You’ll be responsible for defining the golden path, which includes acceptable agent behavior, and annotating task steps to clarify expected outputs and edge cases. Your efforts will contribute significantly to refining model responses and improving overall AI performance.

Qualifications for the Evaluation Scenario Writer Role

  • Bachelor’s and/or Master’s Degree in Computer Science, Software Engineering, Data Science / Data Analytics, Artificial Intelligence / Machine Learning, Computational Linguistics / Natural Language Processing (NLP), Information Systems or other related fields.
  • Background in QA, software testing, data analysis, or NLP annotation.
  • Good understanding of test design principles (e.g., reproducibility, coverage, edge cases).
  • Strong written communication skills in English.
  • Comfortable with structured formats like JSON/YAML for scenario description.
  • Ability to define expected agent behaviors (gold paths) and scoring logic.
  • Basic experience with Python and JavaScript.
  • Curiosity about, and openness to working with, AI-generated content, agent logs, and prompt-based behavior.
  • Readiness to learn new methods, switch quickly between tasks and topics, and sometimes work with challenging, complex guidelines.
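
To illustrate the structured scenario formats, gold paths, and scoring logic mentioned in the qualifications, here is a minimal Python sketch. It is not Mindrift's actual scenario schema or scoring method: the field names (id, gold_path, forbidden_actions, expected_output), the action names, and the 50/50 weighting are all hypothetical choices made for this example.

```python
import json

# Hypothetical scenario description; every field name and value is an
# illustrative assumption, not a real Mindrift format.
SCENARIO_JSON = """
{
  "id": "refund-request-001",
  "task": "Process a customer refund request",
  "gold_path": ["look_up_order", "check_refund_policy",
                "issue_refund", "notify_customer"],
  "forbidden_actions": ["delete_order"],
  "expected_output": {"status": "refunded"}
}
"""


def is_subsequence(expected, actual):
    """Return True if `expected` occurs in `actual` in order (gaps allowed)."""
    remaining = iter(actual)
    # `step in remaining` consumes the iterator up to the first match,
    # so later steps can only match later positions.
    return all(step in remaining for step in expected)


def score_run(scenario, actions, output):
    """Score one agent run against a scenario; returns 0.0, 0.5, or 1.0."""
    # Any forbidden action fails the run outright.
    if any(action in scenario["forbidden_actions"] for action in actions):
        return 0.0
    # Half the score: the gold path was followed in order.
    path_ok = is_subsequence(scenario["gold_path"], actions)
    # Half the score: every expected output field matches.
    output_ok = all(
        output.get(key) == value
        for key, value in scenario["expected_output"].items()
    )
    return 0.5 * path_ok + 0.5 * output_ok


scenario = json.loads(SCENARIO_JSON)
```

Under these assumptions, a run that follows the gold path with harmless extra steps still earns full credit, a run that reaches the right output in the wrong order earns half, and any forbidden action scores zero. Whether extra steps should be tolerated is exactly the kind of edge case a scenario writer would need to decide and document.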

Mindrift provides a flexible, remote, freelance project that fits around your primary professional or academic commitments. This Evaluation Scenario Writer position lets you take part in an advanced AI project and gain valuable experience to enhance your portfolio. You’ll influence how future AI models understand and communicate in your field of expertise.

Additional Information

Years of Experience

2-5 years

Qualification

Bachelor's/Master's Degree

Educational Background

Computer Science, Software Engineering, Data Science, AI/ML, NLP, or related

Company Overview

Mindrift, powered by Toloka, connects domain experts with cutting-edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real-world expertise from across the globe.

Additional Company Information

Company Size

51-200 employees

Industry

Artificial Intelligence

Specialties

AI, GenAI, Machine Learning, Data Science
