Staff Machine Learning Engineer, AI Serving
WFA Digital Insight
The remote job market for machine learning engineers is booming, with demand growing by over 25% in the past year alone. Reddit's commitment to innovation and its community-driven approach make this role particularly appealing. As a staff machine learning engineer, you'll have the opportunity to work on high-impact projects, leveraging skills in cloud-based technologies, model serving, and inference pipelines. With the rise of AI and ML, professionals with expertise in these areas are in high demand, and companies like Reddit are looking for top talent to drive their growth. Before applying, candidates should be prepared to showcase their experience in ML engineering and cloud deployment, along with strong communication skills.
Job Description
About the Role
As a staff machine learning engineer at Reddit, you will lead the development of a large-scale ML inference platform, working closely with cross-functional teams to drive innovation and growth. Your expertise in machine learning, cloud computing, and Kubernetes will be crucial in designing and implementing a highly available, low-latency GPU-based model serving system. Reddit's machine learning platform team is a high-impact team that owns the infrastructure powering recommendations, content discovery, and user quantification, directly impacting teams such as Growth, Ads, Feeds, and Core Machine Learning.
The role entails working on complex projects, leading the end-to-end design, implementation, and maintenance of ML systems, and collaborating with other teams to ensure seamless integration. Your day-to-day responsibilities will include designing and developing ML and generative AI systems in cloud-based production environments, rapidly developing prototypes, and leading a unified GPU model export framework.
What You Will Do
- Lead the development of a large-scale ML inference platform at Reddit
- Design and implement a highly available, low-latency GPU-based model serving system for search, ranking, and LLMs supporting millions of QPS
- Develop ML and generative AI systems in cloud-based production environments on Kubernetes at scale
- Rapidly develop prototypes and build a high-performance feature hydration and processing system as part of the inference stack
- Lead a unified GPU model export framework to support converting trained models into optimized GPU inference models
- Collaborate with other teams to ensure seamless integration of ML systems
- Participate in the design and development of ML and generative AI systems
- Work closely with the engineering team to identify and prioritize project requirements
- Develop and maintain technical documentation for ML systems
- Participate in code reviews and ensure high-quality code
What We Are Looking For
- 7+ years of experience in ML engineering, AI platform engineering, or cloud AI deployment roles
- Experience operating orchestration systems such as Kubernetes at scale
- Deep experience with cloud-based technologies for supporting an ML platform, including tools like AWS, Google Cloud Storage, infrastructure-as-code (Terraform), and more
- Proficiency in programming languages commonly used for ML, such as Go and Python
- Excellent communication skills with the ability to articulate technical AI concepts to non-technical stakeholders
- Strong focus on scalability, reliability, performance, and ease of use
- Strong knowledge of model serving, inference pipelines, monitoring, and observability for AI systems
- Strong proficiency in Python and deep experience with modern AI/ML frameworks (Triton, Dynamo, vLLM, PyTorch)
Nice to Have
- Experience with LLM serving online at scale
- Experience building an end-to-end (E2E) inference performance benchmarking framework
- Deep understanding of multi-cluster compute environments and network topology specific to ML inference use cases
- Experience with Excel and data analysis
Benefits and Perks
- Comprehensive healthcare benefits and income replacement programs
- 401k with employer match
- Global benefit programs that fit your lifestyle, from workspace to professional development to caregiving support
- Family planning support
- Gender-affirming benefits
- Remote work stipend and flexible working hours
- Opportunities for professional growth and development
- Access to cutting-edge technologies and tools
- Collaborative and dynamic work environment
How to Stand Out
- Highlight your experience in ML engineering, cloud deployment, and Kubernetes in your resume and cover letter.
- Prepare to showcase your proficiency in programming languages such as Python and Go, and experience with modern AI/ML frameworks.
- Be prepared to discuss your experience with model serving, inference pipelines, and monitoring, and how you've applied these skills in previous roles.
- Prepare examples that show your ability to communicate complex technical concepts to non-technical stakeholders, since this skill is crucial for the role.
- Research Reddit's company culture and values, and be prepared to discuss how your skills and experience align with these.
- When negotiating salary, consider the company's overall compensation package, including benefits and perks, and be prepared to discuss your expectations.
This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.