Senior Site Reliability Engineer
WFA Digital Insight
As demand for cloud-native solutions grows, so does the need for skilled Site Reliability Engineers. With over 70% of companies embracing AI-powered technologies, professionals with expertise in system reliability and scalability are in high demand. UJET, a leader in AI-driven contact center innovation, is looking for a technical leader to build and scale their SRE function. This role stands out for its focus on establishing best practices and influencing engineering decisions. Before applying, candidates should be aware that they will be working in a fast-paced environment, driving automation and efficiency, and partnering with cross-functional teams to deliver exceptional customer experiences.
Job Description
About the Role
The Senior Site Reliability Engineer position at UJET is a technical leadership role responsible for building and scaling the company's Site Reliability Engineering function. This is a high-impact role with direct influence on the company's ability to deliver exceptional customer experiences through its AI-powered contact center platform. The successful candidate will be a seasoned professional with a deep understanding of system reliability, scalability, and performance.
As a technical leader, the Senior Site Reliability Engineer will design and implement strategies to improve system reliability, reduce operational toil, and establish best practices across engineering teams. This involves collaborating with cross-functional teams, including product and platform teams, so that engineering decisions are informed by reliability and scalability considerations.
UJET's AI-powered contact center platform is built on a cloud-native architecture, which provides a unique opportunity for the Senior Site Reliability Engineer to work with cutting-edge technologies and influence the company's technical direction. The company's focus on delivering exceptional customer experiences means that the Senior Site Reliability Engineer will be working in a fast-paced environment, driving automation and efficiency, and partnering with teams to deliver high-quality solutions.
What You Will Do
- Lead efforts to improve system reliability, scalability, and performance across critical services
- Define and implement SLIs/SLOs and error budgets, and use them to guide engineering priorities
- Design and develop observability systems, including metrics, logging, tracing, and alerting, to produce actionable alerts and data with minimal noise
- Lead complex incident response, acting as incident commander when needed
- Conduct postmortems focused on systemic causes rather than individual fault, and ensure corrective actions from those reviews are completed
- Identify and eliminate toil through automation, tooling, and improved workflows
- Partner with product and platform teams on architecture decisions, production readiness, and deployment strategies
- Develop and maintain technical documentation and runbooks to support system reliability and scalability
- Collaborate with engineering teams to ensure that reliability and scalability considerations are integrated into the software development lifecycle
What We Are Looking For
- 5+ years of experience in a Site Reliability Engineering or similar role
- Strong understanding of system reliability, scalability, and performance
- Experience with cloud-native architectures on cloud platforms such as AWS or Azure
- Proficiency in programming languages, such as Python, Java, or C++
- Experience with observability tools, such as Prometheus, Grafana, or New Relic
- Strong understanding of DevOps practices and tools, such as Jenkins, Docker, or Kubernetes
- Experience with Agile development methodologies and version control systems, such as Git
- Strong communication and collaboration skills, with the ability to work with cross-functional teams
- Experience with incident response and postmortem analysis
Nice to Have
- Experience with AI-powered technologies and machine learning algorithms
- Knowledge of cloud security and compliance frameworks, such as HIPAA or PCI-DSS
- Experience with IT service management frameworks, such as ITIL
- Certification in Site Reliability Engineering or a related field
Benefits and Perks
- Competitive salary and benefits package
- Opportunity to work with cutting-edge technologies and influence the company's technical direction
- Collaborative and dynamic work environment
- Flexible working hours and remote work options
- Professional development opportunities, including training and certification programs
- Access to the latest tools and technologies
- Recognition and reward programs for outstanding performance
- Comprehensive health and wellness programs, including mental health support
- Generous PTO and holiday schedule
- Employee stock options and equity participation
- Remote work stipend and home office setup support
How to Stand Out
- Tip: Highlight your experience with cloud-native architectures on platforms such as AWS or Azure in your resume and cover letter.
- Tip: Be prepared to discuss your approach to system reliability, scalability, and performance in the interview, using specific examples from your experience.
- Tip: Showcase your understanding of DevOps practices and tools, such as Jenkins, Docker, or Kubernetes, and be prepared to explain how you have applied them in previous roles.
- Tip: Emphasize your strong communication and collaboration skills, and provide examples of how you have worked with cross-functional teams to deliver high-quality solutions.
- Tip: Consider obtaining certification in Site Reliability Engineering or a related field to demonstrate your expertise and commitment to the field.
- Tip: Research UJET's company culture and values, and be prepared to discuss how your skills and experience align with their mission and vision.
- Tip: Ask informed questions during the interview, such as 'What are the biggest challenges facing the SRE team, and how do you see this role contributing to the company's technical direction?' to demonstrate your interest and engagement.
This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.