Senior Site Reliability Engineer, Infrastructure Foundations
WFA Digital Insight
The demand for skilled site reliability engineers has surged, with a 25% increase in job postings over the last year. In this role, you'll be part of a global team working on one of the world's most visited websites. With a focus on open-source technology and collaboration, this position stands out in the current remote job market. As a candidate, you should be prepared to showcase your expertise in infrastructure development, automation, and security. Before applying, consider the importance of working in a distributed team and the value of contributing to the world's knowledge.
Job Description
## About the Role
As a Senior Site Reliability Engineer at Wikimedia Foundation, you will be responsible for ensuring the reliability and performance of the platform that serves Wikipedia to millions of users worldwide. This role is critical to the organization's mission to share knowledge globally. You will work closely with the product teams to design and implement scalable solutions, leveraging open-source technologies and collaborating with a diverse team of engineers.
The Wikimedia Foundation is a unique organization that values transparency, collaboration, and community engagement. As a member of the Site Reliability Engineering team, you will be part of a global, distributed team that works in an asynchronous communication environment. Your work will have a direct impact on the user experience and the overall mission of the organization.
## What You Will Do
- Perform day-to-day operational and DevOps tasks on Wikimedia's public-facing infrastructure, including deployment, maintenance, configuration, and troubleshooting.
- Implement and utilize configuration management and deployment tools such as Puppet and Kubernetes.
- Lead continuous improvement efforts by automating the installation, configuration, and maintenance of services on the platform.
- Collaborate with product teams to design and implement new services, ensuring they operate at scale.
- Participate in a 24/7 on-call rotation, responding to incidents, diagnosing issues, and following up on system outages or alerts.
- Mentor peers in areas of technical and operational strength.
- Travel 1-2 times a year for in-person events and team meetings.
- Work closely with a global, cross-functional team in an asynchronous communication environment.
- Share knowledge and expertise with the open-source community.
## Requirements
- Experience with shell and scripting languages such as Python, Go, Bash, or Ruby.
- Familiarity with configuration management tools like Puppet or Ansible.
- Strong Linux system-level troubleshooting skills.
- Experience designing and managing infrastructure security for large fleets of diverse services.
- Strong English language skills, both verbal and written, and the ability to work independently as part of a globally distributed team.
- History of automating tasks and processes, identifying process gaps, and finding automation opportunities.
- Awareness of the current open-source infrastructure security landscape.
- Experience working with software security teams.
- Familiarity with credential management systems.
- Experience with immutable logging and auditing.
## Benefits
- Collaborative and dynamic work environment with a global team.
- Professional development opportunities, including training and conference attendance.
- Flexible working hours and remote work arrangements.
- Access to the latest technologies and tools.
- Comprehensive health insurance and retirement plans.
- Paid time off and holidays.
- Opportunity to contribute to the open-source community and share knowledge and expertise.
How to Stand Out
- Review the Wikimedia Foundation's open-source code and documentation before applying so you can demonstrate your interest in and familiarity with the technology stack.
- Consider highlighting your experience with automation tools and configuration management in your resume and cover letter.
- Be prepared to discuss your approach to infrastructure security and how you stay up to date with the latest security threats and technologies.
- Showcase your ability to work independently and collaboratively in a distributed team environment.
- When preparing for the interview, focus on specific examples of your experience with DevOps tools, scripting languages, and Linux system administration.
- Be ready to discuss your experience with continuous improvement and automation, and how you've applied these principles in previous roles.
- Research the Wikimedia Foundation's values and mission, and be prepared to discuss how your own values and experience align with those of the organization.
This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.