Staff Software Engineer, Inference Infrastructure

Cohere · Remote (San Francisco)
Software Development

WFA Digital Insight

The demand for skilled software engineers in AI infrastructure has grown significantly, with a 25% increase in job postings over the past year. As companies like Cohere continue to push the boundaries of AI capabilities, the need for experts in distributed systems, Kubernetes, and GPU workloads has become crucial. With the rise of remote work, candidates now have more opportunities to join innovative companies like Cohere, which boasts a diverse range of perspectives and a culture of collaboration. Before applying, candidates should be aware of the high expectations for technical expertise and the need for adaptability in a rapidly evolving field.

Job Description

About the Role

As a Staff Software Engineer for Inference Infrastructure at Cohere, you will play a critical role in developing and deploying the company's large language models through easy-to-use API endpoints. You will work closely with various teams to ensure the deployment of optimized NLP models to production in low-latency, high-throughput, and high-availability environments. The Model Serving team, which you will be a part of, is responsible for building and maintaining the AI platform that powers Cohere's cutting-edge NLP applications.

The role requires a deep understanding of distributed systems, Kubernetes, and GPU workloads, as well as the ability to collaborate effectively with cross-functional teams. You will have the opportunity to interface with customers and create customized deployments to meet their specific needs, making this role both technically challenging and customer-facing.

Cohere's mission to scale intelligence to serve humanity drives everything the company does. The company is committed to building a diverse and inclusive work environment, where every team member is empowered to contribute to the development of innovative AI solutions.

What You Will Do

  • Design, develop, and deploy large-scale distributed systems with a focus on high availability and scalability
  • Collaborate with the Model Serving team to build and maintain the AI platform delivering Cohere's large language models
  • Work closely with customers to create customized deployments that meet their specific needs
  • Ensure the smooth operation of Cohere's AI platform, troubleshooting issues and optimizing performance as needed
  • Develop and maintain tools and scripts to automate deployment, monitoring, and maintenance tasks
  • Participate in code reviews and contribute to the improvement of the overall code quality
  • Stay up to date with the latest advancements in AI, NLP, and distributed systems, applying this knowledge to improve Cohere's products and services
  • Collaborate with the research team to integrate new models and techniques into the production environment
  • Develop and maintain documentation for the AI platform, ensuring that knowledge is shared across teams

What We Are Looking For

  • 5+ years of experience in software engineering, with a focus on distributed systems, Kubernetes, and GPU workloads
  • Experience designing and deploying large-scale AI platforms, preferably with expertise in NLP models
  • Strong understanding of computational characteristics of accelerators (GPUs, TPUs, and/or custom accelerators)
  • Excellent collaboration and troubleshooting skills, with the ability to work effectively in a team environment
  • Experience with GCP, Azure, AWS, OCI, multi-cloud on-prem/hybrid serving, and containerization using Docker
  • Strong programming skills in languages such as Golang, C++, or other languages designed for high-performance scalable servers
  • Experience with agile development methodologies and version control systems like Git
  • Strong understanding of compute, storage, and network resource and cost management

Nice to Have

  • Experience with Kubernetes development, production coding, and support
  • Familiarity with Linux-based computing environments and experience with designing, deploying, supporting, and troubleshooting in such environments
  • Knowledge of cloud-native technologies and experience with serverless computing
  • Experience with continuous integration and continuous deployment (CI/CD) pipelines

Benefits and Perks

  • Competitive salary and benefits package
  • Opportunities for career growth and professional development in a rapidly evolving field
  • Collaborative and dynamic work environment with a team of highly skilled professionals
  • Flexible working hours and remote work options
  • Access to cutting-edge technologies and tools
  • Comprehensive health and dental benefits, including mental health support
  • Generous parental leave policy and family-friendly benefits
  • Professional enrichment benefits, including support for conference attendance and continuing education
  • Co-working stipend and support for remote work setup

How to Stand Out

  • Make sure your resume and cover letter highlight your experience with distributed systems, Kubernetes, and GPU workloads, as these are key requirements for the role.
  • Familiarize yourself with Cohere's products and services, and be prepared to discuss how your skills and experience align with the company's mission and goals.
  • Showcase your ability to collaborate effectively with cross-functional teams, and provide examples of times when you have worked with customers to deliver customized solutions.
  • Be prepared to discuss your experience with agile development methodologies and version control systems like Git, and how you have applied these in previous roles.
  • Consider creating a portfolio or GitHub repository that demonstrates your programming skills and experience with languages like Golang or C++.
  • During the interview process, ask questions about the company culture, team dynamics, and opportunities for growth and professional development to demonstrate your interest in the role and the company.
  • Be honest and transparent about your experience and skills, and don't be afraid to ask for clarification or more information about the role or the company.

This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.