Senior Site Reliability Engineer

Pavebank · Remote (Malaysia)
Software Development

WFA Digital Insight

As demand for reliable digital banking systems grows, the need for skilled Site Reliability Engineers increases. With a 25% rise in fintech investments in 2025, professionals with expertise in cloud infrastructure and distributed systems are in high demand. Pavebank, a pioneering programmable banking platform, stands out for its innovative approach. Before applying, candidates should understand the complexities of regulated fintech environments and the importance of collaboration between engineering and security teams.

Job Description

About the Role

The Senior Site Reliability Engineer will play a pivotal role in ensuring the scalability, performance, and reliability of Pavebank's core systems. This involves working closely with various teams to build robust infrastructure, automate operations, and maintain high standards of reliability across all services. The role directly affects the safety, performance, and scalability of Pavebank's banking platform and, in turn, customer trust in the brand.

In this position, the successful candidate will be part of a dynamic team that combines traditional banking with digital assets under a single, regulated platform. The environment is fast-paced and growth-oriented, requiring an individual who is not only technically skilled but also adaptable and proactive.

What You Will Do

  • Monitor, maintain, and improve the reliability, availability, and performance of production systems and services.
  • Build and maintain infrastructure as code (IaC), deployment pipelines, and automation to support continuous delivery, scalability, and disaster recovery.
  • Respond to incidents, perform root-cause analysis, and drive postmortems to ensure lessons learned are applied.
  • Implement and enforce operational best practices: observability, logging, metrics, alerting, capacity planning, failover strategies, and backups.
  • Collaborate with Engineering, Product, Compliance, and Operations teams to ensure infrastructure meets reliability, compliance, and security standards.
  • Support service scaling, database operations, cloud infrastructure (with a preference for Google Cloud Platform), networking, and microservices orchestration.
  • Document operational runbooks, on-call procedures, and system architecture to support maintenance, knowledge sharing, and compliance.

What We Are Looking For

  • Strong programming or scripting skills (e.g., Go, Python, Bash) for automation, tooling, and operational tasks.
  • Hands-on experience with cloud infrastructure, ideally Google Cloud Platform (GCP).
  • Familiarity with containerization and orchestration (Docker, Kubernetes, or equivalent).
  • Experience with infrastructure-as-code tools (Terraform, Cloud Deployment Manager, or similar).
  • Experience with either FluxCD or ArgoCD for GitOps-based delivery.
  • Solid understanding of distributed systems, microservices architecture, and reliability patterns.
  • Experience setting up monitoring, logging, alerting, and observability (e.g., Prometheus, Grafana, ELK, distributed tracing).
  • Strong troubleshooting skills and the ability to respond to incidents under pressure.
  • Knowledge of backup and disaster recovery strategies, database management, and secure operations.

Nice to Have

  • Prior experience in fintech, banking, or other highly regulated industries.
  • Familiarity with compliance, security, and data protection best practices.
  • Experience with high-availability, high-throughput systems or financial infrastructure.
  • Exposure to blockchain or crypto systems integrated with banking.

Benefits and Perks

  • Competitive salary and meaningful equity with room for growth.
  • Opportunity to work alongside a founding team from Monzo and BigPay, bringing top-tier fintech expertise.
  • Chance to tackle real-world reliability challenges in a regulated, fast-growing fintech environment.
  • Learn from and collaborate with experienced engineers while developing your SRE career.
  • Be part of a well-funded startup shaping the future of programmable banking.
  • Remote work arrangements for a better work-life balance.
  • Professional development opportunities through training and conference attendance.
  • Comprehensive health insurance and retirement plans.

How to Stand Out

  • Emphasize your experience with cloud infrastructure, especially if you have worked with Google Cloud Platform, as it is preferred by Pavebank.
  • Highlight any background you have in fintech or regulated industries, as these are valuable in the context of programmable banking.
  • Showcase your ability to work independently and as part of a team, given the collaborative nature of the role.
  • Prepare to discuss specific scenarios where you improved system reliability, scalability, or performance, and how you approached these challenges.
  • Be ready to talk about your understanding of compliance, security, and data protection best practices in a fintech environment.
  • Demonstrate your problem-solving skills, especially under pressure, as responding to incidents is a critical part of the job.
  • Consider creating a portfolio or preparing examples that demonstrate your technical skills, such as infrastructure as code projects or contributions to open-source projects related to reliability engineering.

This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.