Data Engineer

Cohere · Remote (New York)
Software Development

WFA Digital Insight

Demand for skilled data engineers in the AI sector has grown significantly, with a reported 25% increase in 2025, and Cohere's mission to scale intelligence places it squarely within this trend. As a Data Engineer, you'll work on production-grade data processing systems alongside top researchers and engineers. With the rise of remote work, companies like Cohere are looking for talented individuals who can drive innovation and growth. Before applying, consider your experience with distributed data processing frameworks and modern analytics stacks, as well as your passion for AI research.

Job Description

About the Role

As a Data Engineer at Cohere, you will play a crucial role in building the foundational infrastructure for the company's AI systems. This includes working on storage infrastructure, product launches, and new customer experiences. You will be part of the Analytics & Data Insights team, where you will tackle complex problems and collaborate with top researchers and engineers. The role requires a strong command of Python and SQL, as well as experience with distributed data processing frameworks such as Apache Beam, Spark, or Flink.

The team at Cohere is passionate about their craft, and each member is one of the best in the world at what they do. As a Data Engineer, you will have the opportunity to work on cutting-edge projects, launch new products, and help enterprises understand the value of AI. You will be working directly with researchers and engineers who are at the forefront of AI research, and your recommendations will feed directly into products and strategy.

Cohere's mission is to scale intelligence to serve humanity, and as a Data Engineer, you will be instrumental in achieving this goal. The company is committed to creating a diverse and inclusive work environment, and values individuals who are passionate about AI and have a desire to build something genuinely new.

What You Will Do

  • Work directly on storage infrastructure, product launches, and new customer experiences built on one of the most advanced AI systems in the world
  • Collaborate daily with researchers and engineers who are some of the best in the world at what they do
  • Run implementations end-to-end and see initiatives through to real outcomes — no waiting around to be told what to do
  • Partner across research, marketing, sales, and finance to help define how Cohere grows, with your recommendations feeding directly into products and strategy
  • Design and implement data processing pipelines using Python and SQL
  • Work with distributed data processing frameworks such as Apache Beam, Spark, or Flink
  • Transform unstructured data into performant datasets across diverse storage backends including S3, GCS, and POSIX
  • Collaborate with the research team to integrate AI models into the data processing pipeline
  • Develop and maintain data visualizations and reports to support business decisions
  • Participate in code reviews and contribute to the improvement of the codebase
  • Stay up-to-date with the latest developments in AI research and apply this knowledge to improve the data processing pipeline

What We Are Looking For

  • 5+ years of experience working on production-grade data processing systems
  • Strong command of Python and SQL
  • Experience with distributed data processing frameworks such as Apache Beam, Spark, or Flink
  • Ability to transform unstructured data into performant datasets across diverse storage backends including S3, GCS, and POSIX
  • Experience with modern orchestration platforms, especially Kubernetes
  • Familiarity with modern analytics stack tooling such as BigQuery, Airflow, or dbt
  • Knowledge of Java or Golang
  • Genuine excitement about AI — you follow the research, have opinions, and enjoy being in the weeds
  • Comfort operating at the edge of what's known, with a desire to build something genuinely new rather than optimize what already exists
  • Strong communication and collaboration skills
  • Ability to work in a fast-paced environment and prioritize tasks effectively

Nice to Have

  • Experience with cloud-based data storage solutions such as AWS or GCP
  • Familiarity with containerization using Docker
  • Knowledge of Agile development methodologies
  • Experience with data visualization tools such as Tableau or Power BI

Benefits and Perks

  • An open and inclusive culture and work environment
  • Work closely with a team on the cutting edge of AI research
  • Weekly lunch stipend, in-office lunches & snacks
  • Full health and dental benefits, including a separate budget to take care of your mental health
  • 100% Parental Leave top-up for up to 6 months
  • Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
  • Remote-flexible, with offices in Toronto, New York, San Francisco, London, and Paris, plus a co-working stipend
  • 6 weeks of vacation (30 working days)

How to Stand Out

  • Highlight your experience with distributed data processing frameworks and modern analytics stacks in your resume and cover letter.
  • Be prepared to discuss your experience with AI research and how you stay up-to-date with the latest developments in the field.
  • Showcase your ability to transform unstructured data into performant datasets across diverse storage backends.
  • Emphasize your strong communication and collaboration skills, as you will be working closely with researchers and engineers.
  • Be prepared to discuss your experience with cloud-based data storage solutions and containerization using Docker.
  • Show your passion for AI and your desire to build something genuinely new rather than optimize what already exists.
  • Research the company's mission and values, and be prepared to discuss how your skills and experience align with them.

This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.