Workload Porting & Performance Engineer
WFA Digital Insight
As the demand for AI and machine learning specialists continues to soar, with a 25% increase in job postings in the last year alone, the role of Workload Porting & Performance Engineer has become a critical component in ensuring the seamless integration of new hardware platforms. With the rapid evolution of technology, companies like Openai are at the forefront, pushing the boundaries of AI capabilities. This role stands out due to its focus on optimizing workloads for maximum performance, a skill set that is increasingly valuable in the current remote job market. Candidates should be aware of the complex interplay between hardware and software and have a deep understanding of system architecture. Before applying, it's essential to consider the unique challenges and opportunities presented by working in a hybrid remote environment and the importance of collaborative work in driving innovation.
Job Description
About the Role
The Workload Porting & Performance Engineer role at Openai is a unique opportunity to work at the intersection of hardware and software, ensuring that new platforms deliver real-world performance aligned with workload needs. This role is part of the Infrastructure organization, which builds and evaluates the systems that power advanced AI workloads. The team works closely with hardware, modeling, and architecture teams to bridge the gap between theoretical capability and observed system performance.Day-to-day, the successful candidate will be evaluating new hardware platforms by porting benchmarks and real-world workloads, analyzing performance, and identifying system bottlenecks. This critical role requires strong hands-on experience with performance analysis, workload optimization, and system-level debugging across hardware and software boundaries.
The Infrastructure organization at Openai is dedicated to creating the systems that will drive the next generation of AI technology. By joining this team, the Workload Porting & Performance Engineer will play a critical role in shaping the future of AI and ensuring that Openai's products continue to push the boundaries of what is possible.
What You Will Do
- Port and enable benchmarks and real-world workloads on new hardware platforms to evaluate system performance.
- Evaluate system performance across compute, memory, storage, and networking subsystems to identify bottlenecks.
- Develop and run performance experiments and profiling workflows to understand workload behavior.
- Compare expected vs. observed performance and provide feedback to hardware architecture teams, performance modeling teams, and system and software engineers.
- Debug issues across the stack, including software, runtime, and hardware interactions.
- Identify root causes of performance issues across hardware/software boundaries.
- Adapt and optimize workloads to better utilize hardware capabilities.
- Develop actionable insights to guide platform readiness and deployment decisions.
- Collaborate with cross-functional teams to ensure that performance aligns with expectations.
What We Are Looking For
- Experience with performance analysis, benchmarking, or workload optimization.
- Strong understanding of system architecture, including CPU/GPU, memory, and I/O subsystems.
- Experience porting or adapting workloads across different hardware platforms.
- Familiarity with profiling tools and performance debugging techniques.
- Ability to identify root causes of performance issues across hardware/software boundaries.
- Experience working in large-scale or distributed system environments.
- Strong problem-solving skills and attention to detail.
- Excellent communication and collaboration skills.
- Ability to work in a fast-paced, dynamic environment.
Nice to Have
- Experience with AI/ML workloads, including training or inference systems.
- Familiarity with GPU or accelerator-based systems.
- Experience working with low-level performance tools (profilers, tracing, microbenchmarks).
- Background in systems software, compilers, or runtime optimization.
- Experience collaborating with hardware and architecture teams on performance validation.
Benefits and Perks
- Opportunity to work on cutting-edge AI technology and contribute to the development of new hardware platforms.
- Collaborative and dynamic work environment with a team of experts in their fields.
- Hybrid remote work model with 3 days in the office per week.
- Relocation assistance for candidates moving to San Francisco.
- Access to the latest tools and technologies.
- Professional development opportunities to continue growing your skills and career.
- Comprehensive health insurance and other benefits.
- Generous PTO policy to ensure a healthy work-life balance.
How to Stand Out
- To stand out, highlight specific experiences where you have successfully optimized workloads for performance on new hardware platforms. Include metrics or benchmarks that demonstrate the impact of your work.
- Familiarize yourself with Openai's current projects and initiatives, and be prepared to discuss how your skills and experience align with their goals.
- Practice your system architecture knowledge, as this will be a critical component of the interview process.
- Prepare examples of times when you had to debug complex performance issues across hardware and software boundaries, and walk the interviewer through your thought process.
- Consider creating a portfolio or repository of your work that demonstrates your proficiency in performance analysis and workload optimization.
- Be prepared to discuss your experience with remote work and how you stay productive in a hybrid environment.
- When negotiating salary, consider the value of the relocation assistance and other benefits offered by the company.
This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere. Browse more remote jobs across all categories.