Researcher, Agent Post-Training, API & Power-Users

OpenAI · Remote (San Francisco)

WFA Digital Insight

Demand for specialized AI researchers has surged in recent years as companies work to build more sophisticated and reliable models. As the remote job market continues to evolve, companies like OpenAI are at the forefront of innovation. With the AI market expected to keep growing rapidly, professionals with expertise in machine learning and natural language processing are in high demand. OpenAI's commitment to pushing the boundaries of AI capabilities makes this role an exciting opportunity for those looking to make a real impact. Candidates should be prepared to showcase their skills in data analysis, model training, and collaboration.

Job Description

About the Role

The Agent Post-Training team at OpenAI is responsible for creating the next generation of agents that can operate computers, collaborate with people and other agents, and expand the capabilities of individuals and organizations. As a researcher on this team, you will play a crucial role in improving the capabilities, reliability, and product fit of OpenAI's agentic models for power users and API developers. Your day-to-day responsibilities will involve designing and running experiments, building evals and training environments, and partnering with API developers and power users to identify high-leverage behavior gaps.

The team's work spans coding, tool use, computer use, multi-agent coordination, long-horizon execution, factuality, instruction following, calibrated reasoning, and taste. You will work closely with researchers, engineers, data scientists, and product teams to decide which behaviors matter, how to measure them, and how to train them.

What You Will Do

Design and run experiments to improve model behavior in API and power-user workflows, including function calling, tool use, coding, planning, long-horizon execution, factuality, instruction following, error recovery, and calibrated reasoning.
Build evals, graders, and environments from real developer and power-user workflows, and turn observed failures into training data, model-behavior hypotheses, and shipped improvements.
Partner with API developers and power users to identify high-leverage behavior gaps and convert product signals into post-training interventions.
Improve how models behave when composed into systems, including using tools reliably, respecting developer intent, handling partial failures, asking for clarification when appropriate, and maintaining coherence across multi-step tasks.
Own end-to-end model behavior projects, from qualitative failure analysis through data generation, training experiments, eval design, integration into major runs, and launch readiness.
Develop feedback loops that use power-user traces, API usage patterns, and production-like environments to discover the next frontier of agentic model failures and gaps.
Help decide which agentic capabilities, behavioral fixes, and partner-team integrations are ready for inclusion in major model runs.
Debug hard failures in shipped or near-shipped models by moving between traces, evals, training data, model outputs, and product context.
Work on early-training and alignment interventions, including data mixtures, objectives, synthetic data, and eval loops that shape downstream agent behavior.
Improve the machinery for large-scale training and launch, including experiment velocity, reliability, observability, reproducibility, and cost-effectiveness.

What We Are Looking For

A strong background in computer science, machine learning, or a related field, with a focus on natural language processing, reinforcement learning, or multi-agent systems.
Experience with Python, TensorFlow, or PyTorch, and a strong understanding of software development principles and engineering practices.
Excellent problem-solving skills, with the ability to analyze complex systems, identify key issues, and develop creative solutions.
Strong communication and collaboration skills, with the ability to work effectively with cross-functional teams, including researchers, engineers, and product managers.
Experience with data analysis, model training, and evaluation, with a strong understanding of statistical concepts and machine learning algorithms.
A passion for AI research and a desire to contribute to the development of next-generation AI models.
Experience with cloud computing platforms, such as AWS or Google Cloud, and a strong understanding of distributed systems and scalability.
Familiarity with Agile development methodologies and version control systems, such as Git.

Nice to Have

Experience with other programming languages, such as Java or C++.
Familiarity with containerization tools, such as Docker, and orchestration tools, such as Kubernetes.
Experience with data visualization tools, such as Tableau or Matplotlib, and a strong understanding of data storytelling principles.
A strong background in mathematics, with a focus on linear algebra, calculus, and probability theory.
Experience with open-source software development and a strong understanding of open-source principles and practices.

Benefits and Perks

Competitive salary and equity package.
Comprehensive health insurance, including medical, dental, and vision coverage.
Flexible PTO policy, with a minimum of 4 weeks of paid time off per year.
Remote work stipend, with a budget for home office setup and equipment.
Access to cutting-edge technology and tools, including cloud computing platforms and machine learning frameworks.
Opportunities for professional development and growth, including training programs and conference sponsorships.
Collaborative and dynamic work environment, with a team of experienced researchers and engineers.

How to Stand Out

Be prepared to showcase your skills in data analysis, model training, and collaboration, along with a strong understanding of statistical concepts and machine learning algorithms.
Develop a portfolio of projects that demonstrate your expertise in AI research, including experience with Python, TensorFlow, or PyTorch, and a strong understanding of software development principles and engineering practices.
When applying, highlight your ability to work effectively with cross-functional teams, including researchers, engineers, and product managers, and demonstrate your passion for AI research and your desire to contribute to the development of next-generation AI models.
Be prepared to discuss your experience with data visualization tools and your understanding of data storytelling principles, as well as your familiarity with Agile development methodologies and version control systems, such as Git.
When negotiating salary, research the market rate for AI researchers in your area and be prepared to discuss your skills and experience in relation to the company's needs and goals.
Look out for red flags, such as a lack of transparency about the company's goals or values, or a lack of opportunities for professional development and growth.

This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.