Freelance AI Evaluation Engineer (Python/Full-Stack)

Mindrift

Location

Kuwait, Kuwait

Job Type

Contract

Salary

$40 per hour

Posted

4/9/2026

Career Level

Mid-Senior Level

Qualification

Degree in Computer Science, Software Engineering or related fields

Remote

5+ years in software development

Job Description

What this opportunity involves

You’ll create challenging coding test cases that push AI coding systems to their limits:
Review and refine realistic coding tasks based on provided production codebases with realistic scope, requirements and information sources
Write comprehensive functional tests that validate actual end-to-end behavior and edge cases, not just superficial checks
Craft “fair but hard” challenges where the AI has all the context it needs, but has to work for it (information scattered across files and external sources, complex reasoning required)
Analyze AI failures to understand what the model struggles with vs. what it masters
Iterate based on feedback from expert QA reviewers who score your work on 7 quality criteria

What we look for

Degree in Computer Science, Software Engineering or related fields
5+ years in software development, primarily Python (pytest, async/await, subprocess, file operations)
Background in Full-Stack development, with an equal focus on building React-based interfaces and robust Back-end systems
Experience writing functional and integration tests, not just running them
Experience with Docker containers (running evaluations locally in containers)
CI/CD understanding (GitHub Actions as a user: triggers, labels, reading results)
English proficiency: B2

How it works

Apply → Pass qualification(s) → Join a project → Complete tasks → Get paid

Effort estimate

Tasks for this project are estimated to take 20 hours to complete, depending on complexity. This is an estimate and not a schedule requirement; you choose when and how to work. Tasks must be submitted by the deadline and meet the listed acceptance criteria to be accepted.

Compensation

On this project, contributors can earn up to $40 per hour equivalent, depending on their level and pace of contribution. Compensation varies across projects depending on scope, complexity, and required expertise. Please note that other projects on the platform may offer different earning levels based on their requirements.

Related Jobs You Might Like

Freelance Full-Stack Web App Developer

Mindrift

Kuwait, Remote

Part-time

Not specified

About Mindrift

Mindrift is a platform connecting specialists with AI projects from major tech innovators. Our mission is to unlock the potential of Generative AI by tapping into real-world expertise from across the globe.

About the Role

This is a freelance role for the Tendem project. As a Full-Stack Web App Developer (AI Pilot), you will design, build, and refine browser-based applications with real logic, state, persistence, and user input, ranging from habit trackers and budgeting tools to internal dashboards, mini-SaaS tools, and AI-powered apps. You may also work on standalone Python applications and data-processing scripts.

Key Responsibilities

Build interactive web applications with frontend (React, Next.js, Vue, or similar) and a backend API (Python/FastAPI/Flask or Node/Express).
Design and implement data models, schemas, and persistence layers using SQL (PostgreSQL, SQLite) or NoSQL stores.
Implement authentication, sessions, and basic role-based access where needed.
Integrate third-party APIs and AI/LLM services (OpenAI, Anthropic, or similar) into product features.
Handle state management, user input validation, error states, and loading states cleanly.
Build standalone Python tools and scripts where required (data processing, API clients, lightweight backend utilities).
Evaluate AI-generated full-stack code and refactor it for correctness, security, performance, and maintainability.
Write clear, testable code and debug end-to-end issues across frontend, backend, and database.

Requirements

At least 3 years of relevant experience in full-stack web development or shipping interactive web applications (required).
Bachelor's or Master's Degree in Computer Science, Engineering, Information Technology, or related technical fields is a plus.
Strong command of JavaScript/TypeScript and at least one modern frontend framework (React, Next.js, Vue, Svelte, or similar).
Solid backend experience in Python (FastAPI, Flask, Django) and/or Node.js (Express, NestJS).
Hands-on experience with relational databases (PostgreSQL, MySQL, SQLite) and basic schema design.
Experience implementing REST APIs, request validation, error handling, and authentication flows.
Familiarity with deployment platforms (Vercel, Netlify, Render, Fly.io, Railway, or similar).
Experience integrating LLM APIs or other AI services into product features is a strong plus.
Comfortable with version control (Git) and basic testing practices.
Portfolio of shipped web applications (required).
Strong attention to detail and commitment to building working, robust products.
Self-directed work ethic with the ability to architect, build, and ship features independently.
Strong English communication skills.

Benefits

Part-time remote flexibility.
Work on cutting-edge AI projects with major tech innovators.
Global collaboration with specialists across the world.
Professional growth in AI-assisted software engineering.

Freelance Mobile App Developer (iOS / Android)

Mindrift

Kuwait, Remote

Part-time

Not specified in posting

About Mindrift

Mindrift is looking for skilled Mobile App Developers (React Native, Flutter, Swift, or Kotlin) to join the Tendem project (tendem.ai) and build native and cross-platform mobile applications within our hybrid AI + human environment. In this role, as an AI Pilot, you'll collaborate with Tendem Agents that handle repetitive tasks, while you provide mobile engineering expertise, platform-specific judgment, and quality control to ensure apps are stable, performant, and ready for real users on real devices.

This part-time remote opportunity is ideal for professionals with hands-on experience shipping iOS and/or Android apps, working with platform APIs, and handling the full mobile development lifecycle.

What We Do

The Mindrift platform connects specialists with AI projects from major tech innovators. Our mission is to unlock the potential of Generative AI by tapping into real-world expertise from across the globe.

About the Role

This is a freelance role for a Tendem project. As a Mobile App Developer, you'll design, build, and refine mobile applications across categories such as utilities, fitness/wellness, games, productivity, delivery, and content apps, for iOS, Android, or both, using native or cross-platform frameworks.

Key Responsibilities

Build mobile applications using React Native, Flutter, Swift (iOS), or Kotlin (Android)
Implement responsive mobile UIs that follow platform conventions (iOS HIG, Material Design)
Integrate native device features (camera, push notifications, location, storage, biometrics)
Connect apps to backend APIs, handle offline state, caching, and synchronization
Implement monetization features where required (in-app purchases, ads, subscriptions)
Evaluate AI-generated mobile code and refactor it for correctness, performance, battery use, and maintainability
Debug platform-specific issues and prepare builds for distribution (TestFlight, Play Console)

Requirements

At least 3 years of relevant experience in mobile app development (required)
Bachelor's or Master's Degree in Computer Science, Engineering, Information Technology, or related technical fields is a plus
Hands-on experience with at least one of: React Native, Flutter, Swift/SwiftUI (iOS), or Kotlin/Jetpack Compose (Android)
Solid understanding of mobile UI patterns, navigation, state management, and platform guidelines
Experience integrating REST APIs, handling async data, and managing local storage
Familiarity with native device APIs (notifications, camera, location, storage, biometrics)
Experience with mobile build tools, code signing, and submission to App Store / Google Play
Strong attention to detail and commitment to performance, stability, and platform polish
Self-directed work ethic with the ability to ship complete mobile features independently
Portfolio of shipped mobile apps (required, with App Store / Google Play links preferred)
English proficiency: Upper-intermediate (B2) or above (required)

Nice to Have

Experience implementing in-app purchases, ads, or subscriptions
Familiarity with backend services such as Firebase, Supabase, or similar

Project Time Expectations

Tasks are estimated to require around 10–20 hours per week during active phases, based on project requirements. This is an estimate, not a guaranteed workload, and applies only while the project is active.

Chatbot Developer (WhatsApp, Telegram, Discord) - Freelance

Mindrift

Kuwait, Remote

Contract

2,500-5,500 USD per month (Estimated)

Mindrift is looking for skilled Bot Developers (WhatsApp Business API, Telegram Bot API, Discord API) to join the Tendem project (https://tendem.ai/) and build conversational bots and messaging-platform integrations within our hybrid AI + human environment. In this role, as an AI Pilot (that's how we refer to this position at Mindrift), you'll collaborate with Tendem Agents that handle repetitive tasks, while you provide bot engineering expertise, conversational design judgment, and quality control to ensure bots are reliable, useful, and ready for real users.

This part-time remote opportunity is ideal for professionals with hands-on experience building messaging bots, working with platform APIs and webhooks, and implementing conversational logic.

What We Do

The Mindrift platform connects specialists with AI projects from major tech innovators. Our mission is to unlock the potential of Generative AI by tapping into real-world expertise from across the globe.

About the Role

This is a freelance role for a Tendem project. As a Bot Developer, you'll design, build, and refine messaging bots for one or more messaging platforms, including WhatsApp, Telegram, Discord, Slack, and similar platforms, for use cases such as customer service, appointment booking, order taking, content delivery, moderation, and automated notifications.

Key Responsibilities

Build bots for one or more messaging platforms, such as WhatsApp (Business API / Cloud API), Telegram (Bot API), Discord, Slack, and similar messaging platforms.
Design and implement conversational flows, dialogue state, and fallback handling.
Integrate bots with LLMs (OpenAI, Anthropic, or similar) for natural language responses where appropriate.
Connect bots to backend services, databases, CRMs, and third-party APIs (booking systems, payment, content sources).
Handle webhooks, rate limits, and platform-specific message formats (interactive messages, buttons, media, templates).
Evaluate AI-generated bot code and refactor it for correctness, reliability, and graceful error handling.
Implement logging, monitoring, and recovery so bots stay healthy in production.

Requirements and Benefits

Educational qualifications

At least 3 years of relevant backend, integration, automation, or bot development experience (required).
Bachelor's or Master's Degree in Computer Science, Engineering, Information Technology, or related technical fields is a plus.

Academic and/or Professional Experience

Candidates should have a strong foundation in bot development, messaging platform integrations, and building reliable conversational workflows. We are looking for specialists who can design and maintain production-ready bots, work confidently with APIs, webhooks, and backend services, and refine AI-assisted output into stable, user-friendly experiences. Strong problem-solving skills, attention to detail, and the ability to work independently are essential.

Technical Skills (Essential)

At least 1 year of hands-on experience building bots for at least one major messaging platform (WhatsApp, Telegram, Discord, Slack, or similar) is required.
Strong command of Python or Node.js for backend bot logic.
Solid experience with REST APIs, webhooks, OAuth, and async request handling.
Experience with relational or NoSQL databases for storing conversation state and user data.
Familiarity with LLM APIs (OpenAI, Anthropic) and prompt design for conversational use is a strong plus.
Understanding of platform-specific limits, message templates, and approval flows (e.g., WhatsApp template messages).
Experience with hosting and deployment (Docker, serverless, VPS, or PaaS).

Additional requirements

Strong attention to detail and commitment to bot reliability: no silent failures, no broken flows.
Self-directed work ethic with the ability to design and ship complete bots independently.
Portfolio or examples of bots you've built (required).
English proficiency: Upper-intermediate (B2) or above (required).
