Technical Program Manager, Frontier Evals

OpenAI · Remote (San Francisco)
Project Management

WFA Digital Insight

Demand for AI research professionals continues to grow, with a notable 25% increase in 2025, and roles like Technical Program Manager at OpenAI stand out. This position requires a blend of technical program management, research operations, and data analysis skills. With the remote job market booming, professionals with strong digital skills are in high demand. OpenAI's commitment to ensuring AI benefits humanity makes this role particularly interesting. Candidates should be prepared to demonstrate that they can work in a fast-paced, research-driven environment and that they have a strong understanding of large language models. Before applying, consider that the role requires relocation to San Francisco and 3 days of in-office work per week.

Job Description

About the Role

The Technical Program Manager position on OpenAI's Frontier Evals team is a hybrid role that combines technical program management with hands-on research and development. This team is responsible for designing and building evaluations that measure the capabilities and limitations of OpenAI's most advanced models. The successful candidate will drive high-priority evaluation and research programs from concept to execution and analysis, working closely with researchers, engineers, and data teams.

As a key member of the Frontier Evals team, you will be responsible for managing projects from initial research questions to delivered benchmarks. This will involve partnering with researchers and engineers to translate ambiguous model capability questions into concrete evaluation designs, success metrics, timelines, and execution plans. Your ability to ramp quickly on unfamiliar topics and turn open-ended research questions into concrete plans will be crucial to success in this role.

The role is based in San Francisco, CA, and requires 3 days of in-office work per week. OpenAI offers relocation assistance to new employees, making this an attractive opportunity for those looking to join a leading AI research and deployment company.

What You Will Do

  • Manage frontier evaluation projects from initial research questions to delivered benchmarks
  • Partner with researchers and engineers to translate ambiguous model capability questions into concrete evaluation designs, success metrics, timelines, and execution plans
  • Design and manage human data campaigns, including task design, trainer or expert instructions, and quality control workflows
  • Perform hands-on technical work where needed, including prompt iteration, model-based evaluation workflows, data analysis, lightweight scripting, dashboarding, and debugging evaluation pipelines
  • Build roadmaps and operating rhythms that keep fast-moving research efforts aligned and unblocked
  • Coordinate across research, engineering, human data, product, safety, legal, external vendors, and domain experts to deliver high-quality evaluations under tight timelines
  • Ramp quickly on new domains and project areas, identifying what needs to be learned, who needs to be involved, and what is required to complete the project
  • Communicate clearly with technical and non-technical stakeholders, especially when explaining tradeoffs, uncertainty, quality risks, and research findings
  • Learn new technical domains quickly and enjoy context switching across multiple high-priority projects

What We Are Looking For

  • Experience in technical program management, research operations, data operations, evaluation, or a similarly ambiguous technical execution role
  • Proficiency in Python, SQL, or similar tools to analyze datasets, inspect model outputs, automate workflows, and unblock yourself without waiting on engineering support
  • Strong understanding of how large language models work, including prompting, model evaluation, grading, and common failure modes
  • Ability to quickly turn vague research goals into clear plans, crisp milestones, owners, risks, and decision points
  • Relentless resourcefulness, finding partial, scrappy, technically sound ways to make progress while helping teams build more scalable systems over time
  • Excellent communication skills, both technical and non-technical, with the ability to explain complex concepts simply
  • Ability to work in a fast-paced, research-driven environment with a high degree of autonomy
  • Strong analytical and problem-solving skills, with the ability to learn new technical domains quickly

Nice to Have

  • Experience working with AI models, particularly large language models
  • Knowledge of data analysis and machine learning techniques
  • Familiarity with Agile development methodologies and version control systems like Git
  • Experience with cloud-based infrastructure and containerization technologies like Docker

Benefits and Perks

  • Competitive compensation package
  • Relocation assistance to San Francisco, CA
  • Hybrid schedule with 3 in-office days per week and flexible remote work options
  • Access to cutting-edge AI research and development
  • Opportunity to work with a talented team of researchers, engineers, and data scientists
  • Comprehensive health, dental, and vision insurance
  • 401(k) matching program
  • Generous paid time off and holiday schedule
  • Professional development opportunities, including conferences, workshops, and training programs

How to Stand Out

  • Develop a strong understanding of large language models, including their capabilities, limitations, and applications.
  • Showcase your ability to work in a fast-paced, research-driven environment with a high degree of autonomy.
  • Highlight your experience with technical program management, research operations, and data analysis.
  • Be prepared to demonstrate your proficiency in Python, SQL, or similar tools, and your ability to learn new technical domains quickly.
  • Emphasize your excellent communication skills, both technical and non-technical, and your ability to explain complex concepts simply.
  • Consider creating a portfolio that showcases your experience with AI models, data analysis, and machine learning techniques.
  • Be prepared to discuss your approach to problem-solving, and your ability to work in a collaborative, cross-functional team environment.

This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.