Mindrift

Freelance AI Evaluation Engineer (Python/Full-Stack)

Location: Remote
Job Type: Contract
Salary: $40 per hour
Posted: 3/23/2026
Career Level: Mid-Senior Level
Qualification: Degree in Computer Science, Software Engineering or related fields
Experience: 5+ years in software development

Job Description

What this opportunity involves

  • Create challenging coding test cases that push AI coding systems to their limits
  • Review and refine coding tasks based on provided production codebases, with realistic scope, requirements, and information sources
  • Write comprehensive functional tests that validate actual end-to-end behavior and edge cases, not just superficial checks
  • Craft “fair but hard” challenges where the AI has all the context it needs, but has to work for it (information scattered across files and external sources, complex reasoning required)
  • Analyze AI failures to understand what the model struggles with vs. what it masters
  • Iterate based on feedback from expert QA reviewers who score your work on 7 quality criteria

What we look for

  • Degree in Computer Science, Software Engineering or related fields
  • 5+ years in software development, primarily Python (pytest, async/await, subprocess, file operations)
  • Background in Full-Stack development, with an equal focus on building React-based interfaces and robust Back-end systems
  • Experience writing functional and integration tests, not just running them
  • Familiarity with Docker (running evaluations locally in containers)
  • CI/CD understanding (GitHub Actions as a user: triggers, labels, reading results)
  • English proficiency at B2 level

How it works

  • Apply → Pass qualification(s) → Join a project → Complete tasks → Get paid

Effort estimate

Tasks for this project are estimated to take about 20 hours to complete, depending on complexity.

Compensation

On this project, contributors can earn up to $40 per hour equivalent, depending on their level and pace of contribution.
