Senior Site Reliability Engineer
WFA Digital Insight
As businesses increasingly prioritize operational efficiency, the demand for Site Reliability Engineers (SREs) is surging. In fact, the need for skilled SREs has grown by over 30% in recent years, reflecting the pivotal role they play in tech-savvy organizations like Celonis. This remote role offers a compelling chance for seasoned professionals to apply their expertise in monitoring, automation, and incident management while contributing to innovative process intelligence solutions. Candidates seeking this position should bring deep experience in distributed systems, cloud platforms, and a collaborative spirit to navigate the dynamic tech landscape effectively.
Job Description
About the Role
As a Senior Site Reliability Engineer at Celonis, you will be responsible for ensuring the health and reliability of our product services, using critical engineering principles to maintain high standards of performance.
Responsibilities
- Improve monitoring and metrics for all Celonis services.
- Define and implement Service Level Objectives (SLOs).
- Develop processes and automations to prevent recurring issues.
- Drive knowledge sharing and amplify SRE culture across the organization.
- Own and enhance the incident management process, focusing on blameless post-mortems.
- Collaborate with various teams to engineer reliable and resilient services.
Requirements
- At least 8 years of experience in Site Reliability Engineering and software development.
- Proficiency in Java and Spring Boot, plus familiarity with Python or a similar scripting language in a Linux environment.
- Strong background in large-scale distributed systems.
- In-depth knowledge of Kubernetes and major cloud platforms such as AWS and Azure.
- Experience with monitoring solutions such as Datadog.
Nice to Have
- Excellent communication and collaboration skills.
- A proven track record of driving projects and a positive, can-do attitude.
Benefits
- Help pioneer innovative process mining technology.
- Clear career paths, dedicated learning programs, and mentorship opportunities.
- Comprehensive benefits including PTO, hybrid working options, and company equity.
How to Stand Out
- Familiarize yourself with Kubernetes and popular cloud services like AWS and Azure.
- Highlight your past experience with monitoring tools like Datadog in your application.
- Prepare examples of how you’ve driven improvements in reliability within previous roles for your interview.
- Consider obtaining relevant certifications to boost your qualifications.
- Be ready to discuss your approach to incident management and how you've implemented blameless post-mortems.
This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.