MLOps Engineer

Bright Vision Technologies · Remote (United States)
Data & Analytics

WFA Digital Insight

As demand for AI and machine learning specialists grows, roles like MLOps Engineer are becoming increasingly crucial. With a 25% increase in AI adoption in 2025, companies like Bright Vision Technologies are looking for experts to design and operate high-performance inference platforms. To stand out, candidates need strong distributed systems and performance engineering expertise, as well as experience with machine learning serving. Before applying, consider the company's focus on innovation and scalability, and be prepared to showcase your technical abilities.

Job Description

About the Role

The MLOps Engineer at Bright Vision Technologies designs, builds, and operates high-performance inference platforms that serve large machine learning models in production. The role focuses on the systems engineering side of AI deployment and requires strong distributed systems and performance engineering expertise. You will join a team dedicated to building solutions that help businesses automate and optimize their operations.

Day to day, you will work closely with cross-functional teams to design and operate model serving platforms that support diverse workloads, including LLMs, vision models, and recommendation systems. You will optimize inference performance, implement multi-tenant routing, and build autoscaling and capacity management systems that balance latency, throughput, and cost.

Bright Vision Technologies is a forward-thinking software development company that leverages cutting-edge technologies to create scalable, secure, and user-friendly applications. As a member of this team, you will have the opportunity to contribute to the company's mission of transforming business processes through technology and work on projects that have a real impact on the industry.

What You Will Do

  • Design and operate model serving platforms supporting diverse workloads, including LLMs, vision models, and recommendation systems
  • Optimize inference performance using continuous batching, paged attention, speculative decoding, and request multiplexing
  • Implement multi-tenant routing, rate limiting, and quality-of-service policies across model endpoints
  • Build autoscaling and capacity management systems that balance latency, throughput, and cost
  • Tune GPU utilization, memory management, and KV cache strategies for LLM serving workloads
  • Integrate model serving with API gateways, identity systems, and observability platforms
  • Implement caching, prompt deduplication, and response reuse strategies where appropriate
  • Drive end-to-end observability, including latency histograms, queue dynamics, GPU utilization, and error tracking
  • Develop deployment workflows, including canary releases, shadow testing, and automated rollback
  • Operate incident response for high-availability AI services and drive durable reliability improvements
  • Collaborate with ML and product teams to support new model releases and capability rollouts
  • Implement security controls, including request signing, content filtering, and abuse detection at the serving layer

What We Are Looking For

  • Bachelor's or Master's degree in Computer Science or a related field
  • Six or more years of experience in a related field, with a strong focus on distributed systems and performance engineering
  • Experience with machine learning serving and model deployment
  • Strong expertise in designing and operating high-performance inference platforms
  • Experience with cloud-based technologies, such as AWS or Azure
  • Strong understanding of trade-offs between latency, throughput, cost, and quality in ML serving
  • Experience with containerization, orchestration, and automation tools
  • Strong communication and collaboration skills

Nice to Have

  • Experience with natural language processing and computer vision
  • Knowledge of DevOps practices and tools
  • Experience with agile development methodologies
  • Certification in a related field, such as machine learning or cloud computing

Benefits and Perks

  • Competitive salary and benefits package
  • Opportunity to work with a dynamic and innovative company
  • Collaborative and supportive team environment
  • Professional development and growth opportunities
  • Flexible working hours and remote work options
  • Access to cutting-edge technologies and tools
  • Recognition and reward for outstanding performance
  • Comprehensive health insurance and retirement plan

How to Stand Out

  • Make sure you have a strong understanding of distributed systems and performance engineering, as well as experience with machine learning serving and model deployment.
  • Be prepared to showcase your technical abilities through a coding assessment or other evaluation methods.
  • Highlight your experience with cloud-based technologies and containerization, orchestration, and automation tools.
  • Emphasize your ability to collaborate and communicate effectively with cross-functional teams.
  • Consider learning more about the company's focus on innovation and scalability, and be prepared to discuss how your skills and experience align with these goals.
  • Be prepared to discuss your experience implementing security controls such as request signing, content filtering, and abuse detection.
  • Research the company's products and services, and be prepared to discuss how you can contribute to the company's mission and goals.

This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.