Mindrift

Freelance Agent Evaluation Engineer

Job Type: Contract
Salary: USD 24k-80k (estimated, based on $40/hour)
Posted: 2/9/2026
Career Level: Mid-Senior Level
Qualification: Software Engineer
Remote
3+ years of software development experience

Job Description

What this opportunity involves

  • Create structured test cases that simulate complex human workflows
  • Define gold-standard behavior and scoring logic to evaluate agent actions
  • Analyze agent logs, failure modes, and decision paths
  • Work with code repositories and test frameworks to validate your scenarios
  • Iterate on prompts, instructions, and test cases to improve clarity and difficulty
  • Ensure that scenarios are production-ready, easy to run, and reusable

What we look for

  • 3+ years of software development experience with a strong Python focus
  • Experience with Git and code repositories
  • Comfort with structured formats such as JSON/YAML for scenario descriptions
  • Understanding of core LLM limitations (hallucinations, bias, context limits) and how they affect evaluation design
  • Familiarity with Docker
  • English proficiency: B2

How it works

  • Apply → Pass qualification(s) → Join a project → Complete tasks → Get paid

Project time expectations

Tasks for this project are estimated to take 6-10 hours to complete, depending on complexity. This is an estimate and not a schedule requirement; you choose when and how to work. Tasks must be submitted by the deadline and meet the listed acceptance criteria to be accepted.

Payment

Paid contributions, with rates up to $40/hour*

Fixed project rate or individual rates, depending on the project

Some projects include incentive payments

*Note: Rates vary based on expertise, skills assessment, location, project needs, and other factors. Higher rates may be offered to highly specialized experts. Lower rates may apply during onboarding or non-core project phases. Payment details are shared per project.
