Data/Infrastructure Advocate Engineer - EMEA Remote
WFA Digital Insight
The demand for skilled data and infrastructure professionals has skyrocketed, with the global data engineering market expected to grow by 25% annually. As a Data/Infrastructure Advocate Engineer at Hugging Face, you'll play a crucial role in shaping the future of open data workflows. With a strong background in developer relations and a passion for open-source, you'll thrive in this fast-paced environment. Before applying, consider your experience in data engineering, infrastructure, and community building, as well as your ability to communicate complex technical concepts.
Job Description
About the Role
As a Data/Infrastructure Advocate Engineer at Hugging Face, you will be the bridge between cutting-edge data infrastructure and the global community of data engineers, researchers, and developers. Your primary focus will be on championing Xet storage on the Hugging Face Hub, enabling users to efficiently store, version, and collaborate on large-scale datasets. This role requires a unique blend of technical depth and community advocacy, helping to define the future of open data workflows.
The Hugging Face platform has already made significant strides in democratizing access to AI, with over 5 million users and 100k organizations sharing more than 1M models, 300k datasets, and 300k apps. As the first Data/Infrastructure Advocate Engineer, you will have the opportunity to make a lasting impact on the data engineering community.
You will collaborate with various teams, including Datasets, Hub, and Infrastructure, to shape how developers interact with data on the platform. Your goal will be to inspire a community to build better, faster, and more scalable data pipelines.
What You Will Do
- Grow and nurture the open-source data/infra community through initiatives, collaborations, and events
- Engage with communities like Apache Parquet, Open Table Formats, and data engineering forums to promote best practices and Hugging Face tools
- Promote the Hugging Face Hub as the go-to platform for data storage, versioning, and collaboration
- Curate and showcase datasets, benchmarks, and tools like Xet to demonstrate the Hub's value for data workflows
- Create demos, benchmarks, and tools (e.g., Colab notebooks) to illustrate best practices for data storage and versioning
- Produce high-quality tutorials, blog posts, and videos to make complex topics accessible
- Share insights on storage optimization, dataset versioning, and deduplication to empower developers
- Actively participate in online communities to highlight contributions, answer questions, and foster collaboration
- Ensure datasets and tools released on the Hub are well-documented with clear examples, benchmarks, and use cases
What We Are Looking For
- 3+ years of experience in developer relations or developer advocacy, ideally for data engineering, infrastructure, or ML tools and platforms
- Established public presence as a technical voice with a track record of regularly publishing data/infra/ML content
- Portfolio of developer-facing content, including tutorials, blog posts, videos, demos, benchmarks, or conference talks
- Hands-on experience building and engaging open-source or developer communities
- Strong Python skills and hands-on experience with data libraries like pandas, pyarrow, and huggingface/datasets
- Practical experience with storage systems and formats, such as Parquet, Open Table Formats, and S3
- Ability to explain complex technical topics clearly through writing, demos, or talks
- Fluent written and spoken English
Nice to Have
- Experience with the Hugging Face Hub and datasets ecosystem, or with Xet
- Open-source maintainer or contributor experience
- Familiarity with large-scale data pipelines and data engineering workflows
- Experience producing high-quality content, such as tutorials, blog posts, or videos
Benefits and Perks
- Opportunity to work with a fast-growing company and contribute to the development of the Hugging Face platform
- Collaborative and dynamic work environment with a team of experienced professionals
- Flexible remote work arrangements and a stipend for remote work setup
- Access to cutting-edge technologies and tools, including Xet and the Hugging Face Hub
- Opportunities for professional growth and development, including training and conference attendance
- Competitive compensation package, including salary, equity, and benefits
- Generous paid time off and holidays, as well as a comprehensive health insurance plan
How to Stand Out
- Develop a strong portfolio of developer-facing content, including tutorials, blog posts, and videos, to demonstrate your expertise in data engineering and infrastructure.
- Focus on building a strong public presence as a technical voice, including a track record of regularly publishing data/infra/ML content and engaging with online communities.
- Highlight your hands-on experience with data libraries, storage systems, and formats, as well as your ability to explain complex technical topics clearly.
- Emphasize your experience with open-source communities and your ability to build and engage with developer communities.
- Prepare to discuss your experience with large-scale data pipelines and data engineering workflows, as well as your ability to work with cutting-edge technologies like Xet and the Hugging Face Hub.
- Be prepared to provide specific examples of your work, including demos, benchmarks, and tools you have created to illustrate best practices for data storage and versioning.
- Research the company culture and values, and be prepared to discuss how your skills and experience align with the company's mission and goals.
This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.