Software Engineer, Data Infrastructure
WFA Digital Insight
As the demand for AI and machine learning specialists continues to skyrocket, with a 25% growth in the job market over the past year, the need for skilled software engineers who can build and maintain high-performance data infrastructure has never been more pressing. Companies like Cohere are at the forefront of this revolution, offering a unique opportunity for professionals to work on cutting-edge projects. With the global AI market expected to reach
Job Description
About the Role
The Software Engineer, Data Infrastructure role at Cohere is a unique opportunity for a skilled professional to join a team of researchers, engineers, and designers who are passionate about building and deploying frontier models for developers and enterprises. As a key member of the data infrastructure team, you will be responsible for building and maintaining the high-performance data layer that the modeling teams rely on for training and evaluation jobs. You will work directly with petabyte-scale storage infrastructure, tackling the networking and performance challenges that come with it. The role is based in New York and offers a remote-flexible work arrangement, allowing you to collaborate with a global team of experts.
The data infrastructure team is critical to Cohere's mission to scale intelligence and serve humanity. By building and maintaining the data infrastructure, the team enables the modeling teams to focus on developing and deploying AI models that can power a wide range of applications, from content generation to semantic search. As a software engineer on this team, you will have the opportunity to work on complex, large-scale data storage infrastructure and collaborate with top researchers and engineers in the field.
What You Will Do
- Design, build, and maintain petabyte-scale storage infrastructure to support the modeling teams
- Collaborate with researchers and engineers to identify and prioritize data infrastructure needs
- Work on the networking and performance challenges associated with large-scale data storage
- Develop and implement data processing pipelines to support the modeling teams
- Collaborate with the data science team to develop and implement data quality checks
- Work with the DevOps team to ensure seamless integration of data infrastructure with other systems
- Participate in the development of data infrastructure roadmaps and strategy
- Collaborate with cross-functional teams to identify and prioritize data infrastructure needs
- Develop and maintain documentation of data infrastructure systems and processes
- Participate in on-call rotations to ensure 24/7 support of data infrastructure systems
What We Are Looking For
- 4+ years of experience working on data storage infrastructure
- Strong command of Python
- Experience with Kubernetes, especially on the storage side (Persistent Volumes, CSI drivers, etc.)
- Ability to transform unstructured data into performant datasets across diverse storage backends
- Experience with distributed data processing frameworks such as Apache Beam, Spark, or Flink
- Strong understanding of data infrastructure and storage systems
- Experience working with cloud-based storage systems such as S3, GCS, or Azure Blob Storage
- Strong communication and collaboration skills
- Ability to work in a fast-paced environment and adapt to changing priorities
Nice to Have
- Familiarity with modern analytics tooling such as BigQuery, Airflow, or dbt
- Experience working with AI and machine learning models
- Knowledge of containerization using Docker
- Experience working with agile development methodologies
- Certification in data engineering or a related field
Benefits and Perks
- Remote-flexible work arrangement
- Opportunity to work with a global team of experts in AI and machine learning
- Collaborative and dynamic work environment
- Professional development opportunities
- Access to cutting-edge technologies and tools
- Comprehensive health and dental benefits
- Generous parental leave policy
- Flexible vacation policy
- Co-working stipend
- Opportunities for career growth and advancement
How to Stand Out
- Be prepared to discuss your experience working with large-scale data storage infrastructure and your approach to troubleshooting complex data issues.
- Familiarize yourself with Cohere's products and services, and be prepared to discuss how your skills and experience align with the company's mission and goals.
- Highlight your ability to work collaboratively with cross-functional teams, including researchers, engineers, and data scientists.
- Emphasize your strong understanding of data infrastructure and storage systems, as well as your experience working with cloud-based storage systems.
- Be prepared to discuss your experience working with distributed data processing frameworks and your approach to data quality checks.
- Show enthusiasm for working in a fast-paced environment and adapting to changing priorities, and highlight your ability to communicate complex technical concepts to non-technical stakeholders.
This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.