Job Details

Remote | Expert Professors - Professional Domains - $70-$95/hour

2026-04-14 | 24-MAG LLC | San Francisco, CA
Description:

About the job

We are sharing a specialised part-time consulting opportunity for current or retired professors in finance, accounting, law, and other professional services domains. Candidates should combine strong domain expertise and structured reasoning with the ability to design and evaluate challenging real-world tasks for frontier AI systems.

This role supports an exciting collaboration with a leading AI lab focused on improving frontier models through high-quality benchmark task design, golden solution development, model evaluation, and analysis of reasoning and problem-solving gaps in coding and agentic workflows.

Selected professionals will design domain-specific benchmark tasks, prepare detailed specifications and golden solutions in an agentic development environment, evaluate cross-model performance, identify reasoning failures, analyze agent trajectories, and help improve overall model quality. This opportunity is especially well-suited to highly analytical academic experts who are comfortable translating professional domain knowledge into structured evaluation tasks that reflect real-world complexity and executable testing standards.

Key Responsibilities

Professionals in this role may contribute to:

Task Design & Development
Design challenging, real-world domain-specific problems that serve as the foundation for agentic tasks
Construct problems to target specific capability and reasoning failures in frontier AI models
Help ensure that tasks are robust, realistic, and suitable for rigorous evaluation workflows

Specification & Golden Solution Generation
Integrate problems into an agentic development environment using Python
Prepare detailed task instructions, overviews, and golden solutions
Contribute domain-specific consultation and feedback to support high-quality task development

Evaluation & Model Analysis
Evaluate model performance across designed tasks
Identify tasks where the target model fails to pass all tests, particularly where failures reflect logical reasoning gaps
Analyze agent trajectories to identify recurring patterns of capability loss and support model improvement

Ideal Profile

Strong candidates may have:
Experience as a current or retired professor in finance, accounting, law, or other professional services domains
A degree in finance, accounting, law, or a closely related field
Ability to engage reliably for at least 30 hours per week during weekdays
Basic ability to work independently and manage time effectively
Strong verbal and written communication, problem-solving, and interpersonal skills

Preferred qualifications

Past experience in AI training, model evaluation, or data annotation
Ability to translate domain expertise into structured benchmark and evaluation tasks
Comfort working with Python in an agentic development environment
Strong consistency and precision in evaluating reasoning and problem-solving workflows

Why This Opportunity

Contribute specialised academic and professional domain expertise to a cutting-edge AI collaboration
Help shape the next generation of frontier AI tools through benchmark design and reasoning evaluation
Work on high-impact tasks with strong real-world and research relevance
Structured remote work with competitive hourly compensation

Contract Details

W2 employment position with Cincinnatus LLC
Contingent remote role
Hourly compensation of $70-$95 per hour
Open to candidates located in the United States
Expected commitment of at least 30 hours per week, with at least 6 hours per day on weekdays
Opportunity to be placed at a leading AI lab as part of its extended workforce
Role-based position with structured collaboration and integration into standard enterprise workflows
Employment, onboarding, payroll, benefits, and compliance are administered by Cincinnatus LLC
Start date: Immediate

About the Platform

This opportunity is available through a leading AI-driven work platform that connects domain experts with frontier AI research projects.
Experts contribute to improving advanced AI systems by providing specialised expertise across real-world workflows, structured evaluation, model training support, and domain-specific content validation.
By submitting this application, you acknowledge that your information may be processed by 24-MAG LLC for recruitment and opportunity matching in accordance with our Privacy Policy.


Apply for this Job

Please use the APPLY HERE link below to view additional details and application instructions.

Apply Here
