Senior Site Reliability Engineer

Airbyte · Remote (San Francisco)
Software Development

WFA Digital Insight

The demand for skilled site reliability engineers has skyrocketed, with the industry experiencing a 27% growth in the last year alone. Airbyte, a pioneer in open-source data movement, is at the forefront of this trend. With its innovative approach to data integration, Airbyte is poised to revolutionize the way companies handle data. As a senior site reliability engineer, you'll play a crucial role in shaping the future of data infrastructure. With the rise of AI and machine learning, the ability to manage and optimize complex systems is more critical than ever. Before applying, candidates should be prepared to demonstrate their expertise in Kubernetes, Terraform, and AI-powered tooling.

Job Description

About the Role

The Senior Site Reliability Engineer will be responsible for building and maintaining the infrastructure underpinning Airbyte's Data Replication platform. This is a critical role that requires a deep understanding of cloud-based infrastructure, Kubernetes, and Terraform. As a member of the Data Replication team, you will work closely with product engineers to ensure seamless integration of product features with infrastructure.

The successful candidate will have a strong background in infrastructure engineering, with a focus on reliability, scalability, and security. You will be responsible for setting the infrastructure bar for the team, building self-serve tooling, and coaching engineers to own more of their stack.

Airbyte is committed to innovation and experimentation, and our culture values trust, directness, and craftsmanship. We're looking for someone who shares this vision and embodies these principles.

What You Will Do

  • Own the infrastructure underpinning the Data Replication platform, including Kubernetes clusters, CI/CD pipelines, secrets management, networking, and cloud resource configuration across AWS and GCP
  • Partner with product engineers to reliably integrate product features with infrastructure
  • Maintain and enhance observability, alerting, and anomaly detection with an eye towards LLM automation
  • Maintain and enhance AI-augmented release and internal tooling, including canary deployments, progressive rollouts, automated release qualification, and rollback automation
  • Set the infrastructure bar for the team, building self-serve tooling, writing runbooks, and coaching engineers to own more of their stack
  • Collaborate with cross-functional teams to identify and prioritize infrastructure needs
  • Develop and maintain technical documentation for infrastructure components
  • Participate in on-call rotations and respond to incidents as needed
  • Identify areas for improvement and implement changes to increase efficiency and reliability

What We Are Looking For

  • 7+ years of experience in infrastructure, platform engineering, SRE, or DevOps
  • Hands-on ownership of Kubernetes, Helm, and Terraform in production environments
  • Deep experience with observability stacks, including Prometheus, Grafana, and Datadog
  • Experience with CI/CD pipeline ownership and developer tooling
  • Ability and willingness to read backend code to understand how systems break and instrument them correctly
  • Fluency with AI tools, including LLMs and agentic frameworks
  • A startup-ready mindset, comfortable with ambiguity, moving fast, and owning problems end-to-end
  • Strong communication and collaboration skills
  • Experience with cloud-based infrastructure, including AWS and GCP
  • Strong understanding of security and compliance principles

Nice to Have

  • Experience with data pipelines, replication systems, or ETL/ELT platforms
  • Experience with control plane/data plane architectures or internal developer platforms
  • Experience with Airbyte, CDKs, or connector-based architectures
  • Experience with agile development methodologies

Benefits and Perks

  • Competitive salary and equity package
  • Comprehensive health insurance, including medical, dental, and vision
  • Flexible PTO policy and paid holidays
  • Remote work stipend and equipment budget
  • Professional development opportunities, including conferences and training
  • Access to cutting-edge technology and tools
  • Collaborative and dynamic work environment

How to Stand Out

  • Tip: Make sure to highlight your experience with Kubernetes, Terraform, and AI-powered tooling in your resume and cover letter.
  • Tip: Be prepared to discuss your approach to infrastructure engineering, including your experience with cloud-based infrastructure and CI/CD pipelines.
  • Tip: Showcase your ability to write clean, efficient code and your experience with observability stacks; both will be a major plus.
  • Tip: Demonstrate your understanding of security and compliance principles, and be prepared to discuss your experience with security audits and compliance frameworks.
  • Tip: Airbyte values innovation and experimentation, so be prepared to discuss your experience with new technologies and your willingness to take calculated risks.
  • Tip: Show enthusiasm for Airbyte's mission and values, and be prepared to discuss how your skills and experience align with the company's goals.
  • Tip: Be prepared to provide specific examples of your experience with infrastructure engineering, and be ready to discuss your approach to problem-solving and collaboration.

This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.