Senior Software Engineer, AI Evals
Software Development
WFA Digital Insight
As demand for AI-powered solutions grows, companies like Sentry are on the hunt for skilled engineers who can build robust evaluation frameworks. With the rise of hybrid work models, having digital skills and remote work experience is more valuable than ever. Sentry's commitment to performance and error monitoring makes this role stand out in the current job market, where expertise in machine learning and software development is highly sought after.
Job Description
About the Role
Sentry is seeking a Senior Software Engineer to join their AI/ML team in building the evaluation infrastructure for their AI systems. This role is crucial for ensuring the accuracy, reliability, and real-world performance of their debugging agents and AI-powered features.
Responsibilities
- Design and build robust evaluation frameworks to measure accuracy, reliability, regressions, and edge cases in AI systems
- Create and curate high-quality datasets, golden test cases, and benchmarks grounded in real production data
- Build automated test harnesses and metrics pipelines to continuously evaluate models, prompts, and agentic workflows
- Partner closely with applied AI engineers and product leaders to define what “good” looks like and translate it into measurable criteria
Requirements
- 5+ years of professional experience with a Bachelor’s degree in computer science, machine learning, or a related field
- Experience building testing, evaluation, and measurement frameworks for AI systems
How to Stand Out
- Be prepared to showcase your experience with machine learning and software development, highlighting specific projects where you've built evaluation frameworks.
- Develop a strong understanding of Sentry's products and how your skills align with their mission.
- Practice explaining complex technical concepts in simple terms, as this is a key aspect of partnering with cross-functional teams.
- Tailor your resume and cover letter to emphasize your expertise in AI system evaluation and your ability to work in a hybrid environment.
This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.