Site Reliability Engineer (Pacific timezone)

Software Development

WFA Digital Insight

The demand for skilled site reliability engineers has grown significantly in recent years as organizations focus on high uptime and performance in complex systems. As companies like PostHog continue to innovate and expand their product offerings, the need for experts who can drive reliability and efficiency has become paramount. With the remote job market offering more flexibility than ever, candidates with a strong background in AWS, automation, and stateful infrastructure are in high demand. PostHog's commitment to transparency, autonomy, and shipping fast makes it an attractive destination for talented engineers looking to make a real impact.

Job Description

About the Role

As a Site Reliability Engineer at PostHog, you will play a critical role in ensuring the reliability and performance of our production systems. This involves working closely with our engineering teams to identify areas for improvement, implementing automation solutions, and driving initiatives that enhance our overall system efficiency. The role is based in the Pacific timezone and offers the opportunity to work with a talented team of engineers who are passionate about delivering high-quality products.

PostHog's platform is designed to act as a co-pilot for product development, autonomously handling tasks such as code analysis, bug diagnosis, and change rollouts. Our suite of products includes PostHog Code; a built-in data warehouse; and PostHog AI, an AI-powered analyst that answers product questions and writes custom SQL queries. As a key member of our team, you will contribute to the development and maintenance of these products, ensuring they meet the highest standards of reliability and performance.

What You Will Do

Collaborate with engineering teams to identify and prioritize reliability improvements in our production systems
Design and implement automation solutions to enhance system efficiency and reduce downtime
Work with stateful infrastructure and AWS to ensure high availability and scalability
Develop and maintain scripts and tools to automate routine tasks and improve overall system reliability
Participate in on-call rotations to ensure 24/7 coverage of our production systems
Analyze system performance and identify areas for optimization
Implement monitoring and logging solutions to improve visibility into system performance
Collaborate with cross-functional teams to drive initiatives that enhance overall system reliability and performance
Stay up to date with the latest technologies and trends in site reliability engineering, and apply this knowledge to continuously improve our systems

What We Are Looking For

Experience working with AWS, VMs, and automation tools such as Ansible or Terraform
Strong understanding of stateful infrastructure and how to ensure its reliability and performance
Excellent problem-solving skills, with the ability to analyze complex system issues and develop effective solutions
Experience working in a remote environment and collaborating with distributed teams
Strong communication and interpersonal skills, with the ability to work effectively with cross-functional teams
Experience with monitoring and logging tools such as Prometheus and Grafana
Knowledge of security best practices and how to apply them in a production environment
Experience with CI/CD pipelines and how to optimize them for reliability and performance

Nice to Have

Experience working with AI-powered tools and how to integrate them into production systems
Knowledge of containerization technologies such as Docker and Kubernetes
Experience working with agile development methodologies and how to apply them in a reliability engineering context
Certification in AWS or another relevant technology

Benefits and Perks

Competitive compensation package
Opportunity to work with a talented team of engineers who are passionate about delivering high-quality products
Flexible working hours and remote work options
Professional development opportunities, including training and conference attendance
Access to the latest technologies and tools, including AI-powered development platforms
Comprehensive health and wellness package, including mental health support
Generous PTO policy and paid holidays
Remote stipend to support your home office setup

How to Stand Out

Make sure your resume and cover letter are tailored to the specific requirements of the role, highlighting your experience with AWS, automation, and stateful infrastructure.
Prepare examples of how you have driven reliability and performance improvements in previous roles, and be ready to discuss your approach to problem-solving and system analysis.
Familiarize yourself with PostHog's products and technology stack, and be prepared to discuss how you can contribute to the company's mission and goals.
Practice your communication and interpersonal skills, as these are critical for success in a remote team environment.
Be prepared to ask questions about the company culture, team dynamics, and opportunities for growth and development.
Research the market rate for site reliability engineers in the Pacific timezone and be prepared to negotiate your salary based on your experience and qualifications.

This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.