Location restricted: Candidates must reside in the specified country.
Overview
At Mindrift, we connect domain experts with cutting‑edge AI projects through the Mindrift platform powered by Toloka.
We are seeking curious, intellectually proactive contributors who love challenging assumptions and evaluating AI systems.
About the Project
This is a short‑term, flexible QA role focused on autonomous AI agents.
Your work will balance quality assurance, research, and logical problem‑solving to validate and improve complex task structures, policy logic, and agent evaluation frameworks.
Who We're Looking For
This opportunity is well‑suited for:
Analysts, researchers, or consultants with strong critical‑thinking skills
Students (senior undergraduates or graduate students) seeking an intellectually engaging gig
Individuals open to part‑time, non‑permanent work
Responsibilities
Review evaluation tasks and scenarios for logic, completeness, and realism.
Identify inconsistencies, missing assumptions, or unclear decision points.
Define clear expected behaviors (gold standards) for AI agents.
Annotate cause–effect relationships, reasoning paths, and plausible alternatives.
Think through complex systems and policies to ensure agents are tested properly.
Collaborate with QA, writers, or developers to suggest refinements or edge‑case coverage.
Requirements
Excellent analytical thinking: reason about complex systems and logical implications.
Strong attention to detail: spot contradictions, ambiguities, and vague requirements.
Familiarity with structured data formats (ability to read JSON/YAML).
Ability to assess scenarios holistically: identify missing or unrealistic elements.
Strong communication and clear written English to document findings.
Preferred qualifications:
Experience with policy evaluation, logic puzzles, case studies, or structured scenario design.
Background in consulting, academia, olympiads (logic/math/informatics), or research.
Exposure to LLMs, prompt engineering, or AI‑generated content.
Familiarity with QA or test‑case thinking (edge cases, failure modes).
Understanding of scoring/evaluation metrics in agent testing (precision, coverage).
Benefits
Competitive rates of up to $50 per hour, depending on skills and project needs.
Flexible, remote, freelance work that fits around your schedule.
Opportunity to participate in an advanced AI project and enhance your portfolio.
Influence how future AI models understand and communicate in your area of expertise.
How to Apply
Apply to this post, complete the qualification process, and contribute on your own schedule.