Site Reliability Engineer (Pacific timezone)
WFA Digital Insight
The demand for skilled site reliability engineers has grown significantly in recent years, with a focus on ensuring high uptime and performance in complex systems. As companies like PostHog continue to innovate and expand their product offerings, the need for experts who can drive reliability and efficiency has become paramount. With the current remote job market offering more flexibility than ever, candidates with a strong background in AWS, automation, and stateful infrastructure are in high demand. PostHog's commitment to transparency, autonomy, and shipping fast makes it an attractive destination for talented engineers looking to make a real impact.
Job Description
About the Role
As a Site Reliability Engineer at PostHog, you will play a critical role in ensuring the reliability and performance of our production systems. This involves working closely with our engineering teams to identify areas for improvement, implementing automation solutions, and driving initiatives that enhance our overall system efficiency. The role is based in the Pacific timezone and offers the opportunity to work with a talented team of engineers who are passionate about delivering high-quality products.
PostHog's platform is designed to act as a co-pilot for product development, autonomously handling tasks such as code analysis, bug diagnosis, and change rollouts. Our suite of products includes PostHog Code, a built-in data warehouse, and PostHog AI, an AI-powered analyst that answers product questions and writes custom SQL queries. As a key member of our team, you will contribute to the development and maintenance of these products, ensuring they meet the highest standards of reliability and performance.
What You Will Do
- Collaborate with engineering teams to identify and prioritize reliability improvements in our production systems
- Design and implement automation solutions to enhance system efficiency and reduce downtime
- Work with stateful infrastructure and AWS to ensure high availability and scalability
- Develop and maintain scripts and tools to automate routine tasks and improve overall system reliability
- Participate in on-call rotations to ensure 24/7 coverage of our production systems
- Analyze system performance and identify areas for optimization
- Implement monitoring and logging solutions to improve visibility into system performance
- Collaborate with cross-functional teams to drive initiatives that enhance overall system reliability and performance
- Stay up to date with the latest technologies and trends in site reliability engineering and apply this knowledge to continuously improve our systems
What We Are Looking For
- Experience working with AWS, VMs, and automation tools such as Ansible or Terraform
- Strong understanding of stateful infrastructure and how to ensure its reliability and performance
- Excellent problem-solving skills, with the ability to analyze complex system issues and develop effective solutions
- Experience working in a remote environment and collaborating with distributed teams
- Strong communication and interpersonal skills, with the ability to work effectively with cross-functional teams
- Experience with monitoring and logging tools such as Prometheus and Grafana
- Knowledge of security best practices and how to apply them in a production environment
- Experience with CI/CD pipelines and how to optimize them for reliability and performance
Nice to Have
- Experience working with AI-powered tools and how to integrate them into production systems
- Knowledge of containerization technologies such as Docker and Kubernetes
- Experience working with agile development methodologies and how to apply them in a reliability engineering context
- Certification in AWS or another relevant technology
Benefits and Perks
- Competitive compensation package
- Opportunity to work with a talented team of engineers who are passionate about delivering high-quality products
- Flexible working hours and remote work options
- Professional development opportunities, including training and conference attendance
- Access to the latest technologies and tools, including AI-powered development platforms
- Comprehensive health and wellness package, including mental health support
- Generous PTO policy and paid holidays
- Remote stipend to support your home office setup
How to Stand Out
- Make sure your resume and cover letter are tailored to the specific requirements of the role, highlighting your experience with AWS, automation, and stateful infrastructure.
- Prepare examples of how you have driven reliability and performance improvements in previous roles, and be ready to discuss your approach to problem-solving and system analysis.
- Familiarize yourself with PostHog's products and technology stack, and be prepared to discuss how you can contribute to the company's mission and goals.
- Practice your communication and interpersonal skills, as these are critical for success in a remote team environment.
- Be prepared to ask questions about the company culture, team dynamics, and opportunities for growth and development.
- Research the market rate for site reliability engineers in the Pacific timezone and be prepared to negotiate your salary based on your experience and qualifications.
This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.