Networking Operating System Firmware Engineer
WFA Digital Insight
The demand for skilled networking professionals is on the rise, with a 25% increase in job postings over the past year. As a Networking Operating System Firmware Engineer at OpenAI, you'll play a critical role in developing the next generation of AI-native silicon. With the global AI market expected to reach
Job Description
About the Role
As a Networking Operating System Firmware Engineer at OpenAI, you will be responsible for building and maintaining custom NOS images for large-scale AI fabrics. This involves working with open-source components from SONiC, FRR, and related networking stacks to design, develop, and test production-grade NOS software. You will collaborate closely with cross-functional teams, including hardware, software, and research partners, to co-design hardware tightly integrated with AI models.
The role requires a deep understanding of networking, NOS internals, switch hardware, and production systems. You will work on integrating, building, and configuring Linux kernel components, device drivers, switch ASIC SDKs, and SAI layers. Your expertise in debugging complex issues spanning kernel drivers, platform monitoring, NOS services, routing agents, orchestration services, hardware signals, ASIC state, and network topology will be invaluable to the team.
OpenAI's Hardware organization is committed to delivering production-grade silicon for its supercomputing infrastructure. As a member of this team, you will contribute to the development of custom design tools and methodologies that accelerate innovation and enable hardware optimized specifically for AI.
What You Will Do
- Design, develop, and maintain custom NOS images for large-scale AI fabrics using open-source components from SONiC, FRR, and related networking stacks
- Integrate, build, and configure Linux kernel components, device drivers, switch ASIC SDKs, and SAI layers
- Bring up new switch platforms, including thermal and fan control, power monitoring, transceiver management, watchdogs, OSFP CMIS, LEDs, CPLDs, and board-specific platform logic
- Extend and customize NOS services for routing, telemetry, control-plane state, and distributed automation
- Implement and debug route, neighbor, next-hop, and ECMP programming flows from control-plane intent through ASIC hardware state
- Build software mechanisms that distinguish control-plane acceptance, SAI/SDK acceptance, and explicit hardware programming acknowledgement
- Work with hardware teams to validate ASIC configurations, link bring-up, SerDes tuning, buffer profiles, and performance baselines
- Evaluate switch silicon SDK releases, track vendor deliverables, and validate platform requirements with vendors and ASIC partners
- Debug complex issues spanning kernel drivers, platform monitoring, NOS services, routing agents, orchestration services, hardware signals, ASIC state, and network topology
- Integrate switches into fleet-wide monitoring, remote diagnostics, telemetry pipelines, and automated lifecycle workflows
- Develop robust CI/build pipelines for reproducible NOS builds and controlled rollout across the fleet
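To make the distinction between control-plane acceptance, SAI/SDK acceptance, and explicit hardware acknowledgement more concrete, here is a minimal sketch of how those stages might be tracked per route. Everything in it is an assumption for illustration: `RouteStage`, `RouteTracker`, and their methods are hypothetical names, not SONiC, SAI, or vendor SDK identifiers, and a production system would receive these transitions from real agents and ASIC notifications.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class RouteStage(IntEnum):
    """Hypothetical stages; illustrative names, not SONiC or SAI identifiers."""
    REQUESTED = 0               # intent submitted, nothing confirmed yet
    CONTROL_PLANE_ACCEPTED = 1  # routing agent accepted the route
    SDK_ACCEPTED = 2            # SAI/SDK call returned success
    HARDWARE_ACKED = 3          # ASIC explicitly confirmed programming


@dataclass
class RouteTracker:
    """Tracks the furthest stage each prefix has reached."""
    stages: dict = field(default_factory=dict)

    def request(self, prefix: str) -> None:
        self.stages[prefix] = RouteStage.REQUESTED

    def advance(self, prefix: str, stage: RouteStage) -> None:
        current = self.stages[prefix]
        # Allow moving exactly one stage forward, so an SDK success
        # can never be recorded as a hardware acknowledgement.
        if stage != current + 1:
            raise ValueError(
                f"{prefix}: cannot go from {current.name} to {stage.name}"
            )
        self.stages[prefix] = stage

    def is_forwarding_safe(self, prefix: str) -> bool:
        # Treat a route as usable only after the explicit hardware ack.
        return self.stages.get(prefix) == RouteStage.HARDWARE_ACKED


tracker = RouteTracker()
tracker.request("10.0.0.0/24")
tracker.advance("10.0.0.0/24", RouteStage.CONTROL_PLANE_ACCEPTED)
tracker.advance("10.0.0.0/24", RouteStage.SDK_ACCEPTED)
print(tracker.is_forwarding_safe("10.0.0.0/24"))  # False: no hardware ack yet
tracker.advance("10.0.0.0/24", RouteStage.HARDWARE_ACKED)
print(tracker.is_forwarding_safe("10.0.0.0/24"))  # True
```

The design choice in this sketch is that each stage must be confirmed in order, so a route that the SDK accepted but the hardware never acknowledged stays visibly short of the final state.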
What We Are Looking For
- Proven experience working with SONiC or comparable NOS stacks such as FBOSS, Cumulus Linux, Arista EOS, Junos PFE-level integration, or equivalent platform software
- Strong software engineering fundamentals, including clear interfaces, data models, state-machine design, error handling, testing, observability, performance debugging, and maintainable C/C++, Python, Go, or Rust code
- Experience with Linux kernel internals, network device drivers, platform drivers, hwmon, I2C/SMBus, CPLDs, or board-level platform software
- Experience integrating or debugging Broadcom, Marvell, or other networking ASICs
- Strong understanding of networking protocols, including TCP/IP, BGP, OSPF, and MPLS
- Ability to work through ambiguous, open-ended technical problems and drive feature development across software, hardware, and vendor boundaries
Nice to Have
- Experience with cloud-based infrastructure, including AWS, GCP, or Azure
- Knowledge of containerization using Docker or Kubernetes
- Familiarity with agile development methodologies, including Scrum or Kanban
- Experience with continuous integration and continuous deployment (CI/CD) pipelines
Benefits and Perks
- Competitive salary and equity package
- Comprehensive health, dental, and vision insurance
- Flexible PTO and paid holidays
- Remote work stipend and home office setup support
- Professional development opportunities, including conferences, training, and workshops
- Access to cutting-edge technology and tools
- Collaborative and dynamic work environment
How to Stand Out
- Showcase your experience with open-source networking stacks, such as SONiC, and highlight your ability to work with Linux kernel internals.
- Be prepared to discuss your approach to debugging complex technical issues and your experience with ASIC programming.
- A strong understanding of networking protocols, including TCP/IP, BGP, OSPF, and MPLS, is essential for this role.
- When applying, include examples of your work with CI/CD pipelines and your experience with cloud-based infrastructure.
- Be prepared to talk about your experience working with cross-functional teams and your ability to communicate technical concepts to non-technical stakeholders.
- Research OpenAI's company culture and values, and be prepared to discuss how your skills and experience align with the company's mission and goals.
This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.