Member of Technical Staff, Data Analysis and Evaluation

Cohere · Remote (London)
WFA Digital Insight

Demand for AI and machine learning specialists continues to grow, with over 50% of companies investing in these technologies, and skilled data analysis and evaluation professionals are needed accordingly. Cohere, a pioneer in large language models, is hiring for this remote role and emphasizes innovation and a commitment to diversity. To succeed, candidates will need a blend of statistical expertise, software engineering skills, and experience with machine learning frameworks. Before applying, make sure you understand the company's mission and values as well as the specific requirements of the position, including a strong foundation in statistics and experience with data collection tasks.

Job Description

About the Role

The Member of Technical Staff, Data Analysis and Evaluation role at Cohere is central to ensuring the quality, reliability, and performance of the company's large language models. You will be responsible for designing and conducting data collection tasks, assessing dataset quality, and analyzing the robustness and generalisability of the models. The role requires technical expertise in statistics, experimental design, and machine learning, along with excellent communication skills for collaborating with cross-functional teams.

Cohere's mission to scale intelligence and serve humanity depends heavily on the quality of its large language models, which makes this role essential to the company's overall strategy and goals. You will work closely with a team of researchers, engineers, and data annotators to support data-driven decision-making and improve the overall effectiveness of the AI systems.

What You Will Do

  • Design and oversee data collection tasks, including supporting human annotators and ensuring data quality
  • Develop and apply statistical methods to evaluate the quality and reliability of datasets
  • Analyze and assess the generalisability and robustness of ML systems across diverse use cases
  • Collaborate with teams to improve dataset quality and model performance
  • Train and fine-tune large language models (LLMs) on distributed training infrastructures
  • Conduct experiments to evaluate model performance and identify areas for improvement
  • Develop and maintain tools and scripts to automate data processing and analysis tasks
  • Work closely with researchers and engineers to integrate new models and techniques into the company's AI systems
  • Communicate findings and results to both technical and non-technical stakeholders

What We Are Looking For

  • Extremely strong software engineering skills, with proficiency in programming languages such as Python
  • Strong expertise in designing and conducting data collection tasks, including working with human annotators
  • Strong statistical skills and experience evaluating scientific experiments related to data collection and model performance
  • Experience analyzing datasets with respect to their quality, biases, and suitability for training ML models
  • Hands-on experience training large language models (LLMs) on distributed training infrastructures
  • Familiarity with evaluating and improving the generalisability and robustness of ML systems
  • Excellent communication skills to collaborate effectively with cross-functional teams and present findings
  • One or more papers at top-tier venues (such as NeurIPS, ICML, ICLR, AISTATS, MLSys, JMLR, AAAI, Nature, COLING, ACL, EMNLP)
  • Experience working in a fast-paced, dynamic environment with a high level of autonomy

Nice to Have

  • Experience with cloud-based data storage and processing platforms
  • Familiarity with agile development methodologies and version control systems
  • Knowledge of data visualization tools and techniques

Benefits and Perks

  • Competitive salary and equity package
  • Comprehensive health and wellness benefits
  • Flexible working hours and remote work options
  • Professional development opportunities and conference sponsorships
  • Access to cutting-edge technologies and tools
  • Collaborative and dynamic work environment with a team of experts in their field
  • Opportunities for career growth and advancement within the company

How to Stand Out

  • Develop a strong foundation in statistics and machine learning, and be prepared to discuss your experience with data collection tasks and model evaluation.
  • Build a portfolio of your work, including any research papers or projects that demonstrate your expertise in data analysis and evaluation.
  • Practice your communication skills, as you will be working closely with cross-functional teams and presenting findings to both technical and non-technical stakeholders.
  • Stay up-to-date with industry trends and developments, including new technologies and techniques in the field of AI and machine learning.
  • Be prepared to discuss your experience with distributed training infrastructures and large language models.
  • Show enthusiasm and passion for the company's mission and values, and be prepared to discuss how you can contribute to the team's success.
  • Don't be afraid to ask questions during the interview process, and be prepared to discuss your salary expectations and any other benefits or perks you may be looking for.

This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.