Senior Research Scientist, Model Evaluation

Cohere · Remote (Toronto)

Data & Analytics

WFA Digital Insight

The demand for AI and machine learning experts has surged, with a notable 25% increase in job postings over the past year. As companies like Cohere push the boundaries of AI capabilities, the need for skilled professionals in model evaluation has become critical. With the rise of remote work, candidates now have more opportunities to join cutting-edge teams without geographical constraints. Cohere's commitment to innovation and diversity makes it an attractive destination for top talent. Before applying, candidates should be prepared to demonstrate their expertise in AI research, software engineering, and data analysis, as well as their ability to work collaboratively in a fast-paced environment.

Job Description

## About the Role

As a Senior Research Scientist in Model Evaluation at Cohere, you will play a pivotal role in advancing the state-of-the-art in large language model (LLM) evaluation methods. Your primary focus will be on creating ambitious new evaluation benchmarks that push the limits of what Cohere's models can accomplish. This role is critical to Cohere's mission to scale intelligence and serve humanity through the development and deployment of frontier models for developers and enterprises. You will be part of a dynamic team of researchers, engineers, and designers who are passionate about their craft and committed to excellence.

The role is multifaceted: it involves both developing new evaluation techniques and collaborating with cross-functional teams to translate model feedback into trustworthy, repeatable evaluations. This position requires a deep understanding of AI research, software engineering, and data analysis, as well as the ability to communicate complex ideas effectively to both technical and non-technical stakeholders.

Cohere's team is known for obsessing over what it builds, and each member is expected to contribute significantly to increasing the capabilities of Cohere's models and the value they drive for customers. The company culture values diversity and inclusivity, recognizing that a diverse range of perspectives is essential for building great products.

## What You Will Do

- Create and develop next-generation evaluation benchmarks for LLMs, focusing on pushing the boundaries of what these models can accomplish.
- Collaborate with highly cross-functional teams to translate model feedback into evaluations that are both trustworthy and repeatable.
- Conduct research aimed at advancing the state-of-the-art in LLM evaluation methods, including training LLM judges, refining LLM-based data synthesis pipelines, and improving evaluation efficiency.
- Build scalable and reusable tools for digging into model performance, ensuring that these tools are adaptable to evolving model capabilities.
- Work closely with the engineering team to integrate new evaluation methods and tools into the existing model development pipeline.
- Develop and maintain detailed documentation of evaluation methods, tools, and results, ensuring that knowledge is shared effectively across the team.
- Participate in the review of complex data and LLM outputs to ensure high data quality, providing insights that can inform model development.
- Engage in continuous learning to stay updated with the latest advancements in AI research and model evaluation, applying this knowledge to improve Cohere's evaluation capabilities.
- Collaborate with the product team to ensure that model evaluations are aligned with customer needs and expectations, contributing to the development of customer-facing products.
- Contribute to the development of best practices and standards for model evaluation within the company and the broader AI research community.

## What We Are Looking For

- A Ph.D. in Computer Science, Artificial Intelligence, or a related field, with a focus on AI, machine learning, or natural language processing.
- Proven experience in AI research, particularly in the development and evaluation of large language models.
- Strong software engineering skills, with the ability to design, develop, and deploy software applications.
- Experience with data analysis and the ability to work with large datasets.
- Excellent communication and collaboration skills, with the ability to work effectively in a team environment.
- A deep understanding of machine learning principles and practices, including model development, training, and evaluation.
- Familiarity with Agile development methodologies and version control systems such as Git.
- Experience with cloud computing platforms and containerization technologies.
- Strong problem-solving skills, with the ability to approach complex problems systematically.

## Nice to Have

- Experience with DevOps practices and the deployment of models in production environments.
- Knowledge of ethics in AI and the development of fair, transparent, and accountable AI systems.
- Familiarity with explainability techniques for machine learning models.
- Participation in open-source projects or contributions to AI research communities.
- Experience in mentoring junior researchers or engineers, contributing to their growth and development.

## Benefits and Perks

- Competitive salary and benefits package.
- Opportunity to work on cutting-edge AI research projects with a talented team of professionals.
- Collaborative and dynamic work environment that fosters innovation and creativity.
- Comprehensive health and dental benefits, including mental health support.
- Flexible work hours and remote work options, with a stipend for co-working spaces.
- Professional development opportunities, including conference attendance and training programs.
- Access to the latest technologies and tools in AI research and development.

How to Stand Out

- Stay updated with the latest advancements in AI research, particularly in areas related to large language models and model evaluation, to remain competitive.
- Develop a strong portfolio that showcases your experience in AI research, software engineering, and data analysis, highlighting projects that demonstrate your capabilities in model evaluation.
- Prepare to discuss your research experience in detail, including your approach to problems, your experience with model development and evaluation, and how you stay current with industry advancements.
- Emphasize your ability to work collaboratively in a fast-paced environment, highlighting your experience with cross-functional teams and your ability to communicate complex ideas effectively.
- Be prepared to negotiate your salary based on your experience and qualifications, considering factors such as cost of living, industry standards, and the company's compensation package.
- Watch for red flags such as unrealistic expectations, lack of transparency about the company culture, or unclear paths for professional development and growth.
- Show enthusiasm for the company's mission and your role in contributing to the development of frontier models, demonstrating how your skills and experience align with Cohere's goals.

This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.