Senior Site Reliability Engineer

Affirm · Remote (Remote Poland)
Software Development

WFA Digital Insight

As demand for cloud computing specialists surges, Affirm's Senior Site Reliability Engineer role stands out in the remote job market. The position centers on automation and Kubernetes and calls for both technical expertise and collaboration skills. Because the digital payments landscape keeps evolving, candidates should be prepared to drive initiatives that improve observability and reliability.

Job Description

About the Role

Affirm is seeking a Senior Site Reliability Engineer to help keep the foundation of its platform robust and scalable. The Cloud Compute team manages all of Affirm's Kubernetes clusters, providing a highly reliable and available cloud environment in which engineering teams can build and deploy their solutions.

Responsibilities

  • Execute the team's technical strategy over a year-long time scale and help tie it to critical, business-impacting projects.
  • Collaborate across teams in the product development lifecycle to ensure technical sustainability, risks, and trade-offs are well understood and managed.
  • Act as a force-multiplier for the team through the definition and advocacy of technical solutions and operational processes.
  • Take ownership of the team’s operations and availability by ensuring the right monitoring, triage rotations, playbooks, policies, testing, and alerting are in place to support “keep the lights on” and on-call efforts.

How to Stand Out

  • Familiarize yourself with Kubernetes and cloud infrastructure management before applying.
  • Highlight your experience with automation tools and technologies to demonstrate your ability to drive operational excellence.
  • Prepare to discuss your approach to collaboration and communication in a remote team setting.
  • Showcase your problem-solving skills through specific examples of technical challenges you've overcome in previous roles.
  • Be ready to discuss your experience with monitoring, triage, and incident response to demonstrate your ability to manage complex systems.

This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.