Machine Learning Engineer, Distributed Data Systems - Robotics

OpenAI · Remote (San Francisco)
AI & Machine Learning

WFA Digital Insight

Job postings for AI and machine learning specialists rose 25% in the last year alone, and professionals with distributed systems expertise are especially sought after. OpenAI, a pioneer in AI research and deployment, is seeking a skilled Machine Learning Engineer to drive the development of its multimodal capabilities. Given the company's commitment to ensuring AI benefits all of humanity, this role offers a unique opportunity to work on cutting-edge technology with a talented team. Candidates should be prepared to showcase their experience with distributed systems, their software engineering fundamentals, and a passion for data-driven innovation.

Job Description

About the Role

The Machine Learning Engineer, Distributed Data Systems, will play a critical role in designing and scaling the infrastructure that powers large-scale multimodal training and evaluation at OpenAI. As part of the Sora team, a hybrid research and product team, you will collaborate closely with researchers to translate their requirements into robust systems, ensuring those systems are reliable, user-friendly, and aligned with OpenAI's mission of broad societal benefit.

The Sora team is pioneering multimodal capabilities for OpenAI's foundation models. As a key member of this team, you will manage distributed data pipelines, harden the pipelines that serve as the backbone for Sora's rapid iteration cycles, and ensure the data platform can scale by orders of magnitude while remaining reliable and efficient.

What You Will Do

  • Design, build, and maintain data infrastructure systems such as distributed compute, data orchestration, distributed storage, streaming infrastructure, and machine learning infrastructure
  • Ensure scalability, reliability, and security of the data infrastructure systems
  • Partner with researchers to deeply understand requirements and translate them into production-ready systems
  • Harden, optimize, and maintain critical data infrastructure systems that power multimodal training and evaluation
  • Collaborate with cross-functional teams to ensure seamless integration of the data infrastructure with other systems
  • Develop and maintain tools and scripts to automate data processing and workflows
  • Troubleshoot and resolve issues with the data infrastructure systems
  • Stay up-to-date with the latest developments in distributed systems and machine learning infrastructure
  • Contribute to the development of best practices and standards for data infrastructure development

What We Are Looking For

  • Strong experience with distributed systems and large-scale infrastructure
  • Strong interest in data and machine learning
  • Detail-oriented and rigorous approach to building and maintaining reliable systems
  • Excellent software engineering fundamentals and organizational skills
  • Comfortable with ambiguity and rapid change
  • Experience with data orchestration, distributed storage, and streaming infrastructure
  • Familiarity with machine learning frameworks and tools
  • Strong communication and collaboration skills
  • Ability to work in a fast-paced environment and prioritize tasks effectively

Nice to Have

  • Experience with cloud-based infrastructure and containerization
  • Familiarity with agile development methodologies and version control systems
  • Knowledge of data security and compliance principles

Benefits and Perks

  • Competitive compensation package
  • Opportunity to work on cutting-edge technology with a talented team
  • Collaborative and dynamic work environment
  • Professional development opportunities
  • Flexible work arrangements, including remote work options
  • Access to the latest tools and technologies
  • Comprehensive health and wellness benefits
  • Generous paid time off and vacation policy

How to Stand Out

  • Highlight your experience with distributed systems and large-scale infrastructure, as well as your ability to work with ambiguity and rapid change.
  • Showcase your software engineering fundamentals and organizational skills, and be prepared to provide examples of your experience with data orchestration, distributed storage, and streaming infrastructure.
  • Familiarize yourself with OpenAI's mission and values, and be prepared to discuss how your skills and experience align with the company's goals.
  • Be prepared to provide examples of your experience with machine learning frameworks and tools, and be ready to discuss your approach to troubleshooting and resolving issues with data infrastructure systems.
  • Don't hesitate to ask about the company culture and team dynamics during the interview process, and be prepared to discuss your expectations for professional development and growth within the role.
  • Consider creating a portfolio or repository of your work to showcase your skills and experience, and be prepared to discuss your approach to data security and compliance principles.

This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.