Research Engineer, Frontier Evals & Environments - Finance

OpenAI · Remote (San Francisco)

Software Development

Excel

WFA Digital Insight

Demand for skilled research engineers in AI has grown sharply, with a notable 25% increase in job postings over the past year. As companies like OpenAI continue to push the boundaries of artificial intelligence, experts who can design and implement model evaluations have become crucial. Finance is a key area of focus, so professionals with strong statistical analysis skills and attention to detail are in high demand. OpenAI's commitment to ensuring AI benefits humanity makes it an attractive destination for those passionate about safe AI development. Before applying, candidates should be aware of the fast-paced research environment and the need for cross-functional collaboration.

Job Description

About the Role

The Research Engineer position at OpenAI is an opportunity to work at the forefront of artificial intelligence development, specifically within the finance domain. As part of the Frontier Evals team, you will be responsible for designing and implementing model evaluations that drive progress towards safe and beneficial AI. Your work will involve identifying key model capabilities, quantifying performance, and continuously refining evaluations to assess the extent of frontier capabilities.

The Frontier Evals team is known for its ambitious evaluations, including SWE-bench Verified, MLE-bench, PaperBench, and SWE-Lancer, which have been instrumental in steering the development of models like GPT-4o, o1, o3, GPT-4.5, ChatGPT Agent, and GPT-5. This role offers the chance to work on real-world applications and contribute to the advancement of AI in a field that is increasingly dependent on data-driven insights.

What You Will Do

Identify important model capabilities, skills, and behaviors crucial to financial workflows and design methods to quantify performance in these areas.
Own and pursue a research agenda to identify an important model capability, especially as it relates to financial reasoning, and build evaluations to measure it.
Continuously refine evaluations of frontier AI models to assess the extent of frontier capabilities.
Collaborate with cross-functional teams to integrate evaluations into the model development process.
Analyze data from model evaluations to inform future development and improvement.
Develop and maintain detailed documentation of evaluation methodologies and results.
Present findings and insights to both technical and non-technical stakeholders.
Stay up to date with the latest advancements in AI research and evaluate their potential application to financial model evaluations.

What We Are Looking For

Strong engineering and statistical analysis skills, with at least 2-3 years of full-time technical experience.
Passion for evaluations for real-world applications and knowledge work.
Detail-oriented and thorough approach to work.
Ability to operate effectively in a dynamic and fast-paced research environment.
Ability to scope and deliver projects end-to-end.
Excellent communication skills.
Experience working with Excel and other data analysis tools.
Knowledge of AI/ASI measurement principles and practices.
Ability to work cross-functionally and collaborate with diverse teams.

Nice to Have

Experience with programming languages such as Python.
Familiarity with machine learning frameworks and libraries.
Previous work experience in the finance domain or a related field.
Knowledge of cloud computing platforms.

Benefits and Perks

Competitive compensation package.
Opportunity to work on cutting-edge AI projects with real-world impact.
Collaborative and dynamic work environment.
Professional development opportunities.
Access to the latest tools and technologies in AI research.
Flexible work arrangements, including remote work options.
Comprehensive health insurance and other benefits.
Equity in a leading AI research company.

How to Stand Out

Ensure your resume highlights specific experiences with statistical analysis and data visualization tools like Excel.
Prepare to discuss your understanding of AI model evaluations and how they can be applied to financial contexts.
Showcase any personal projects or contributions to open-source evaluations that demonstrate your skills and passion for AI development.
Be ready to explain how you stay current with advancements in AI research and how you see yourself contributing to the field.
Consider creating a portfolio that showcases your ability to design and implement model evaluations, especially in finance or related domains.
Practice explaining complex technical concepts to non-technical audiences, as this is a key skill for presenting findings and insights to stakeholders.

This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.