Staff Machine Learning Engineer, AI Serving

Reddit · Remote (Remote - United States)

AI & Machine Learning

WFA Digital Insight

The remote job market for machine learning engineers is booming, with demand growing by over 25% in the past year alone. Reddit's commitment to innovation and its community-driven approach make this role particularly appealing. As a staff machine learning engineer, you'll work on high-impact projects that draw on skills in cloud-based technologies, model serving, and inference pipelines. Professionals with expertise in these areas are in high demand, and companies like Reddit are looking for top talent to drive their growth. Before applying, candidates should be prepared to showcase their experience in ML engineering and cloud deployment, along with strong communication skills.

Job Description

About the Role

As a staff machine learning engineer at Reddit, you will lead the development of a large-scale ML inference platform, working closely with cross-functional teams to drive innovation and growth. Your expertise in machine learning, cloud computing, and Kubernetes will be crucial in designing and implementing a highly available, low-latency GPU-based model serving system. Reddit's machine learning platform team is a high-impact group that owns the infrastructure powering recommendations, content discovery, and user quantification, directly impacting teams such as Growth, Ads, Feeds, and Core Machine Learning.

The role entails working on complex projects, leading the end-to-end design, implementation, and maintenance of ML systems, and collaborating with other teams to ensure seamless integration. Your day-to-day responsibilities will include designing and developing ML and generative AI systems in cloud-based production environments, rapidly developing prototypes, and leading work on a unified GPU model export framework.

What You Will Do

Lead the development of a large-scale ML inference platform at Reddit
Design and implement a highly available, low-latency GPU-based model serving system for search, ranking, and LLMs supporting millions of QPS
Develop ML and generative AI systems in cloud-based production environments on Kubernetes at scale
Rapidly prototype and build a high-performance feature hydration and processing system as part of the inference stack
Lead work on a unified GPU model export framework that converts trained models into optimized GPU inference models
Collaborate with other teams to ensure seamless integration of ML systems
Participate in the design and development of ML and generative AI systems
Work closely with the engineering team to identify and prioritize project requirements
Develop and maintain technical documentation for ML systems
Participate in code reviews and ensure high-quality code

What We Are Looking For

7+ years of experience in ML engineering, AI platform engineering, or cloud AI deployment roles
Experience operating orchestration systems such as Kubernetes at scale
Deep experience with cloud-based technologies for supporting an ML platform, including AWS, Google Cloud Storage, and infrastructure-as-code tools such as Terraform
Proficiency with programming languages and frameworks common in ML, such as Go and Python
Excellent communication skills with the ability to articulate technical AI concepts to non-technical stakeholders
Strong focus on scalability, reliability, performance, and ease of use
Strong knowledge of model serving, inference pipelines, monitoring, and observability for AI systems
Strong proficiency in Python and deep experience with modern AI/ML frameworks (Triton, Dynamo, vLLM, PyTorch)

Nice to Have

Experience serving LLMs online at scale
Experience building an end-to-end (E2E) inference performance benchmarking framework
Deep understanding of multi-cluster compute environments and network topologies specific to ML inference use cases
Experience with Excel and data analysis

Benefits and Perks

Comprehensive healthcare benefits and income replacement programs
401k with employer match
Global benefit programs that fit your lifestyle, from workspace to professional development to caregiving support
Family planning support
Gender-affirming benefits
Remote work stipend and flexible working hours
Opportunities for professional growth and development
Access to cutting-edge technologies and tools
Collaborative and dynamic work environment

How to Stand Out

Highlight your experience in ML engineering, cloud deployment, and Kubernetes in your resume and cover letter.
Prepare to showcase your proficiency in programming languages such as Python and Go, and experience with modern AI/ML frameworks.
Be prepared to discuss your experience with model serving, inference pipelines, and monitoring, and how you've applied these skills in previous roles.
Be ready with concrete examples that show you can communicate complex technical concepts to non-technical stakeholders.
Research Reddit's company culture and values, and be prepared to discuss how your skills and experience align with these.
When negotiating salary, consider the company's overall compensation package, including benefits and perks, and be prepared to discuss your expectations.

This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.