Engineering Manager, HADR
WFA Digital Insight
The demand for skilled engineering managers in high-availability systems has surged, with a 25% increase in job postings over the past year. As companies like Stripe continue to grow and scale, the need for experts who can lead teams and build resilient infrastructure becomes increasingly important. With the rise of remote work, companies are looking for candidates who can manage distributed teams and drive results in a fast-paced environment. Stripe's commitment to increasing the GDP of the internet through innovative financial infrastructure makes this role particularly compelling for those passionate about making a global impact.
Job Description
About the Role
The Engineering Manager, HADR position at Stripe is a unique opportunity to lead a team of talented engineers in designing and building high-availability and disaster recovery systems. As a key member of the High Availability and Disaster Recovery team, you will be responsible for developing and implementing solutions that enable Stripe's products to survive any type of disaster. This team is creating greenfield solutions that will serve as the basis for Stripe's architecture for years to come.
The role requires a deep understanding of distributed systems, latency-critical applications, and data redundancy. You will work closely with cross-functional teams to drive the execution of projects, overseeing the entire development lifecycle from planning to delivery. Your technical proficiency will enable you to engage deeply with Staff-level engineers on system architecture, API design, and AI model integration.
What You Will Do
- Lead and manage a team of talented engineers on the High Availability and Disaster Recovery team
- Develop and implement solutions to enable latency-critical, stateful applications to survive any type of disaster
- Design and build distributed systems on top of unreliable architecture to provide highly available and resilient customer solutions
- Work on latency-critical solutions where every millisecond matters and data redundancy is a hard requirement
- Learn quickly and work on a broad range of problems, including investigating Mongo write concerns and minimizing cross-region TLS handshakes
- Develop new systems to automate disaster detection and failovers
- Drive the execution of projects, overseeing the entire development lifecycle from planning to delivery
- Maintain high standards for quality and on-time delivery
- Help influence peers and managers and build consensus while dealing with ambiguity
- Build your team by formalizing role definitions, defining charter and ownership boundaries, and turning a newly formed team into a high-functioning one
What We Are Looking For
- 5+ years of engineering management experience, with a proven track record of managing and growing teams of 10+ engineers
- Deep domain expertise in cloud development or a strong background in building sophisticated infrastructure systems
- Exceptional communication skills, with the ability to distill complex technical concepts into clear strategic frameworks for peers and executives
- Experience in rapid-growth environments, specifically a history of successful, high-volume hiring and team scaling
- Technical proficiency to engage deeply with Staff-level engineers on system architecture, API design, and AI model integration
- Understanding of distributed system concepts, such as leader election, voting, and quorum
- Background in high-availability systems, chaos engineering, or related fields
Nice to Have
- Experience with cloud-based infrastructure and distributed systems
- Knowledge of programming languages, such as Java, Python, or C++
- Familiarity with containerization tools, such as Docker
- Experience with agile development methodologies and version control systems, such as Git
Benefits and Perks
- Competitive salary and equity package
- Comprehensive health, dental, and vision insurance
- Flexible paid time off and holidays
- Remote work stipend and home office setup
- Professional development opportunities and conference sponsorships
- Access to cutting-edge technologies and innovative projects
- Collaborative and dynamic work environment with a team of talented engineers
How to Stand Out
- Develop a strong understanding of distributed systems and latency-critical applications before the interview process
- Showcase your experience in managing and growing high-performing teams, with a focus on technical proficiency and leadership skills
- Prepare to discuss your approach to building consensus and influencing peers and managers in a fast-paced environment
- Be ready to provide examples of your experience with cloud-based infrastructure and distributed systems, and how you have applied these skills in previous roles
- Highlight your ability to communicate complex technical concepts to non-technical stakeholders, and your experience with agile development methodologies and version control systems
- Research Stripe's company culture and values to demonstrate your passion for the company's mission and vision
This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.