Tech Lead, Deployment & Operations — Custom Infrastructure

OpenAI · Remote (San Francisco)

Software Development

Excel

WFA Digital Insight

Demand for technical leaders in AI infrastructure has surged, with 25% growth in 2025. OpenAI's custom silicon and systems team is at the forefront of this wave. With the rise of AI-native silicon, professionals with deployment and operations expertise are in high demand. OpenAI stands out for its commitment to co-designing hardware and software, which gives tech leads an opportunity to drive innovation. Before applying, candidates should know that this role requires strong technical leadership and cross-functional collaboration.

Job Description

About the Role

The Tech Lead, Deployment & Operations, will play a critical role in bringing OpenAI's custom silicon and systems into production data center environments. This involves leading the team responsible for deploying and operating these systems and ensuring successful bring-up, validation, and operational readiness. The role sits at the intersection of silicon, systems, infrastructure, data center operations, and software, and requires a deep understanding of each of these domains.

As a tech lead, you will be responsible for building the operational processes, technical workflows, tooling, and cross-functional alignment required to deploy and operate custom AI hardware reliably. This includes defining deployment processes, operational playbooks, and technical readiness criteria, as well as driving cross-functional execution across lab bring-up, rack/system integration, and data center deployment.

The role is part of OpenAI's Hardware organization, which develops silicon and system-level solutions for advanced AI workloads. The team is building the next generation of AI-native silicon while working closely with software and research partners to co-design hardware that is tightly integrated with AI models.

What You Will Do

Lead a team responsible for deployment and operations of custom silicon and systems in data center environments
Own the path from hardware bring-up and validation through production deployment, operational readiness, and sustained fleet support
Partner closely with silicon, systems, software, infrastructure, networking, data center, supply chain, and external partner teams to ensure successful deployment at scale
Define deployment processes, operational playbooks, technical readiness criteria, escalation paths, and reliability practices for new hardware platforms
Drive cross-functional execution across lab bring-up, rack/system integration, data center deployment, fleet monitoring, debugging, and issue resolution
Stay hands-on technically through architecture reviews, deployment planning, failure analysis, operational debugging, and critical system-level decision-making
Identify gaps in tooling, observability, automation, validation coverage, and operational processes, and build plans to close them
Establish clear metrics for deployment readiness, reliability, performance, maintainability, and operational health
Build a strong engineering culture grounded in ownership, technical rigor, operational excellence, and high-velocity execution

What We Are Looking For

Strong technical leadership and experience in deployment and operations of custom hardware systems
Expertise in silicon, systems, infrastructure, data center operations, and software
Ability to drive cross-functional collaboration and execution
Experience with Excel and other relevant tools
Strong judgment around deployment sequencing, technical risk, operational readiness, and escalation
Excellent communication and technical skills
Ability to operate in ambiguous, fast-moving environments

Nice to Have

Experience with AI-native silicon and systems
Knowledge of co-design principles and practices
Familiarity with data center operations and infrastructure

Benefits and Perks

Opportunity to work on cutting-edge AI hardware and systems
Collaborative and dynamic work environment
Professional development and growth opportunities
Competitive compensation and benefits package
Remote work options and flexible scheduling
Access to cutting-edge technologies and tools
Opportunity to contribute to the development of AI-native silicon and systems

How to Stand Out

Highlight your experience with custom hardware systems and deployment operations in your application.
Be prepared to discuss your technical expertise in silicon, systems, and infrastructure during the interview process.
Show a willingness to learn and adapt to new technologies and processes, as the field of AI-native silicon is rapidly evolving.
Emphasize your ability to drive cross-functional collaboration and execution, as this is critical to success in this role.
Prepare to discuss your experience with Excel and other relevant tools, and be ready to provide examples of how you have used them in previous roles.
Research OpenAI's culture and values, and be prepared to discuss how you align with them.

This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.