Senior Software Engineer, AI Evals

Sentry · Remote (San Francisco, California)

Software Development

WFA Digital Insight

As demand for AI-powered products grows, companies like Sentry are looking for engineers who can build robust evaluation frameworks. With more teams working remotely, digital skills and remote work experience are increasingly valuable. Sentry's focus on performance and error monitoring makes this role stand out in a job market where expertise in machine learning and software development is in high demand.

Job Description

About the Role

Sentry is seeking a Senior Software Engineer to join their AI/ML team in building the evaluation infrastructure for their AI systems. This role is crucial for ensuring the accuracy, reliability, and real-world performance of their debugging agents and AI-powered features.

Responsibilities

Design and build robust evaluation frameworks to measure accuracy, reliability, regressions, and edge cases in AI systems
Create and curate high-quality datasets, golden test cases, and benchmarks grounded in real production data
Build automated test harnesses and metrics pipelines to continuously evaluate models, prompts, and agentic workflows
Partner closely with applied AI engineers and product leaders to define what “good” looks like and translate it into measurable criteria

Requirements

5+ years of professional experience with a Bachelor’s degree in computer science, machine learning, or a related field
Experience building testing, evaluation, and measurement frameworks for AI systems

How to Stand Out

Be prepared to showcase your experience with machine learning and software development, highlighting specific projects where you've built evaluation frameworks.
Develop a strong understanding of Sentry's products and how your skills align with their mission.
Practice explaining complex technical concepts in simple terms, as this is a key aspect of partnering with cross-functional teams.
Tailor your resume and cover letter to emphasize your expertise in AI system evaluation and your ability to work effectively in a remote environment.

This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.