Staff Software Engineer, Billing
WFA Digital Insight
As the demand for skilled software engineers in cloud computing and AI-assisted infrastructure continues to surge, with a notable 25% increase in job postings over the last year, Docker's commitment to innovation stands out. With over 20 million monthly users, Docker is at the forefront of the shift towards autonomous workflows, making this Staff Software Engineer, Billing role particularly intriguing. Given the current job market, where experience in AWS, Terraform, and observability systems is highly prized, candidates with a strong background in these areas are well-positioned. Before applying, it's essential for candidates to understand the evolving landscape of software development and the critical role AI agents play in it.
Job Description
## About the Role
The Staff Software Engineer, Billing, will be a pivotal member of Docker's team, responsible for designing, maintaining, and evolving the infrastructure that supports the company's billing platform. This includes compute, storage, networking, CI/CD, and observability, all of which are crucial for the smooth operation of Docker's services. The role is deeply intertwined with the company's shift towards AI-assisted infrastructure, where the candidate will be at the forefront of defining what safe, observable, and efficient operations look like.
Given Docker's user base of over 20 million monthly users and its position as a leader in developer tooling, the stability and reliability of its billing systems are of paramount importance. The chosen candidate will be responsible for ensuring that these systems are not only robust but also scalable, capable of handling the company's growing user base without compromising on performance.
The role will involve working closely with software engineers on service design, bringing an infrastructure perspective to the table to ensure that services are designed with operational efficiency and scalability in mind from the outset. Additionally, the candidate will be involved in mentoring other engineers, spreading best practices, and driving improvements across the team.
## What You Will Do
- Own and evolve the infrastructure supporting Billing Platform services, encompassing compute, storage, networking, CI/CD, and observability.
- Design and maintain IaC (Terraform) for billing system infrastructure on AWS, setting module patterns and standards for the team.
- Build and own observability systems — metrics, logging, alerting — with a focus on billing accuracy and payment reliability.
- Define deployment patterns and runbooks that work well in an AI-agent-assisted development workflow, including clear rollback procedures and safe promotion gates.
- Partner with software engineers on service design, bringing infrastructure constraints and operational requirements into the conversation early on.
- Identify systemic risks and drive improvements that span team or organizational boundaries.
- Lead incident response for billing system issues, owning the on-call rotation and postmortem process.
- Mentor engineers across the team, using your technical judgment to help set a high standard for everyone.
- Develop and maintain local environments, CI/CD pipelines, and deployment tooling to make the developer experience faster and more reliable.
- Ensure that all infrastructure and system designs embody a security-first mindset, including threat modeling, blast radius analysis, least privilege by default, and comprehensive audit trails.
## Requirements
- Deep expertise in AWS, including ECS or EKS, RDS (Postgres preferred), networking, IAM, and cost management.
- Expert-level proficiency in Terraform, with experience designing reusable module patterns and setting standards.
- Experience building and owning observability stacks (Datadog, Grafana, or similar) at an organizational level.
- Strong familiarity with CI/CD systems — Jenkins, GitHub Actions, or equivalent — including pipeline design and developer experience ownership.
- Experience with Kubernetes at an operational and architectural level.
- A track record of identifying systemic risks and driving improvements that span team or organizational boundaries.
- A security-first mindset with experience in threat modeling, blast radius analysis, least privilege by default, and audit trails as a design requirement.
- Strong written English communication skills, which are crucial for scaling influence across teams.
## Nice to Have
- Familiarity with AI-assisted development workflows and the challenges they pose to infrastructure design.
- Knowledge of Docker's suite of products, including Docker Desktop, Docker Hub, and Docker Scout.
- Participation in open-source projects or a strong GitHub presence.
- Certification in relevant technologies, such as AWS Certified Solutions Architect or similar.
## Benefits
- Opportunities for professional growth and development within a globally distributed team.
- Flexible working hours and a remote work arrangement, allowing for a better work-life balance.
- Access to the latest technologies and tools, ensuring you stay at the forefront of your field.
- Health insurance, retirement plans, and other benefits tailored to support your well-being.
- Generous PTO policy, recognizing the importance of rest and relaxation.
- A stipend for remote work setup and productivity tools.
- Participation in a dynamic community of professionals who are shaping the future of software development.
How to Stand Out
- Tailor Your Resume: Ensure your resume highlights your experience with AWS, Terraform, and observability systems, as these are key requirements for the role.
- Prepare for Technical Questions: Brush up on your knowledge of Kubernetes, CI/CD systems, and security best practices, as these topics are likely to come up during interviews.
- Showcase Your Problem-Solving Skills: During the interview, highlight instances where you identified systemic risks and drove improvements in previous roles.
- Demonstrate Your Understanding of AI-Assisted Workflows: Be prepared to discuss how you've worked with or adapted to AI-assisted development workflows and what you see as the future of infrastructure design in this context.
- Ask About the Team and Culture: Use your interviews as an opportunity to learn more about Docker's team dynamics, remote work culture, and how the company supports professional development and growth.
- Be Ready to Discuss Portfolio and Projects: Come prepared to talk about your past projects, especially those that demonstrate your skills in infrastructure design, observability, and security.
This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.