Manager, Software Engineering (Reliability Platform)

Affirm · Remote (Remote US)
Software Development
Excel

WFA Digital Insight

Demand for skilled software engineering leaders in the US remote job market has surged, with a focus on reliability and scalability. As companies like Affirm pioneer new financial technologies, the need for experts who can navigate complex systems and drive innovation has grown. With the remote job market expected to continue its upward trend, having the right skills, such as proficiency in Excel and experience with observability tools, is crucial. Candidates interested in this role should be prepared to showcase their technical leadership, their ability to drive reliability practices, and their experience building scalable operational tooling. Before applying, take time to understand Affirm's commitment to making credit more honest and friendly and how this role contributes to that mission.

Job Description

About the Role

The Manager, Software Engineering (Reliability Platform) at Affirm is a critical role that focuses on ensuring the safety and reliability of production systems. This involves building products and capabilities that drive reliability practices at scale, leading a team of engineers to develop systems that allow for the understanding, prioritization, and reduction of systemic reliability risks. The successful candidate will sit at the intersection of Platform Engineering and Site Reliability Engineering, partnering closely with Infrastructure and Product Engineering teams.

As a leader in this space, you will be responsible for defining reliability standards, building scalable operational tooling, and driving the adoption of engineering best practices across the company. Your team will be pivotal in developing foundational operational intelligence capabilities for Affirm, including the next-generation Reliability and Risk Management platform. This platform combines observability, AI, and operational workflows to help engineering teams proactively manage availability, resiliency, and operational excellence at scale.

The role requires a deep understanding of software engineering principles, a strong background in reliability and scalability, and the ability to lead and manage high-performing teams. You will need to leverage AI-assisted engineering workflows to accelerate platform development, operational automation, and engineering productivity. Your ability to translate ambiguous operational and organizational challenges into clear technical requirements, execution plans, and measurable outcomes will be essential to success.

What You Will Do

  • Build and lead a high-performance product engineering team focused on innovation, accountability, reliability, and execution.
  • Leverage AI-assisted engineering workflows to accelerate platform development, operational automation, and engineering productivity.
  • Translate ambiguous operational and organizational challenges into clear technical requirements, execution plans, and measurable outcomes.
  • Develop scalable reliability, risk management, and operational governance capabilities for Affirm’s production systems using observability tooling, automation, and AI.
  • Build foundational systems that help engineers understand service ownership, dependencies, and operational topology across Affirm’s infrastructure and applications.
  • Drive alignment across Platform Engineering, SRE, Infrastructure, and product engineering teams to define requirements, prioritize investments, and deliver long-term technical roadmap outcomes.
  • Drive execution and delivery for critical cross-functional reliability initiatives with high organizational visibility.
  • Collaborate with cross-functional teams to identify and mitigate reliability risks.
  • Develop and maintain technical roadmaps that align with the company’s overall strategy.
  • Foster a culture of continuous learning and improvement within the team.

What We Are Looking For

  • 7+ years of experience in backend or full stack engineering, with 2+ years of engineering leadership experience.
  • Experience working within or alongside Production Engineering or Site Reliability Engineering teams.
  • Expertise with observability tools and building software or managing programs that drive engineering culture and reliability practices.
  • Strong operational judgment and ability to drive clarity, prioritization, and execution in ambiguous problem spaces.
  • Experience building internal platforms, developer tooling, reliability products, or operational automation systems at scale.
  • Demonstrated ability to balance rapid iteration and experimentation with operational rigor and long-term maintainability.
  • Strong programming background (e.g., Python, Kotlin, Java, or similar).
  • Strong communication and organizational leadership skills, with a track record of aligning stakeholders, driving execution, and influencing engineering practices across teams.
  • Ability to work effectively in a remote environment and lead remote teams.

Nice to Have

  • Experience with cloud-based infrastructure and containerization (e.g., Docker, Kubernetes).
  • Knowledge of AI and machine learning principles and their application in software engineering.
  • Experience with agile development methodologies and version control systems (e.g., Git).
  • Certification in software engineering, computer science, or a related field.

Benefits and Perks

  • Competitive salary and equity package.
  • Comprehensive health, dental, and vision insurance.
  • Flexible paid time off and remote work options.
  • Access to professional development opportunities and continuous learning resources.
  • Collaborative and dynamic work environment with a team of experienced professionals.
  • Opportunity to work on complex and challenging problems that impact the financial technology industry.
  • Recognition and reward for outstanding performance and contributions to the company’s mission.

How to Stand Out

  • Emphasize your experience with reliability engineering, observability tools, and AI-assisted workflows in your application.
  • Make sure your resume and cover letter are tailored to the role, highlighting specific examples of how you’ve driven reliability practices and scalable operational tooling in previous positions.
  • Prepare to discuss your approach to leading high-performing teams and driving a culture of innovation and accountability.
  • Be ready to provide specific examples of how you’ve balanced rapid iteration with operational rigor and long-term maintainability in your previous roles.
  • Consider creating a portfolio that showcases your technical skills and experience, especially if you’re transitioning from a non-traditional background.
  • Don’t hesitate to ask about the company culture, team dynamics, and opportunities for growth and development during the interview process.
  • Research Affirm’s products and services to understand how the Manager, Software Engineering (Reliability Platform) role contributes to the company’s mission and goals.

This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.