Senior Solutions Architect, InfiniBand and Networking Ethernet

NVIDIA · Remote (Germany)
Software Development

WFA Digital Insight

As demand for AI and HPC solutions surges, companies like NVIDIA are driving innovation in the field. With a 25% increase in cloud-based AI adoption in 2025, the need for skilled solutions architects is rising. NVIDIA's commitment to pushing technological boundaries makes this role an attractive opportunity for those passionate about networking and AI. Candidates should be prepared to apply their InfiniBand and Ethernet expertise to drive project success and customer satisfaction. Before applying, take time to understand how the AI and HPC landscape is evolving and where NVIDIA's products fit within it.

Job Description

About the Role

The Senior Solutions Architect role at NVIDIA is a dynamic, customer-focused position that requires excellent interpersonal skills. As a key member of the NVIDIA Infrastructure Specialist Team, you will be responsible for analyzing, defining, and implementing large-scale networking projects. Your expertise in InfiniBand and Ethernet will be crucial in driving project success and ensuring seamless integration with NVIDIA's cutting-edge AI and HPC solutions.

The role entails working closely with customers, partners, and internal teams to design and implement AI/HPC infrastructure for new and existing customers. You will be the face of the company, providing exceptional customer service and support. Your primary responsibilities will include building and optimizing AI/HPC infrastructure, supporting operational and reliability aspects of large-scale AI clusters, and engaging in the entire lifecycle of services.

NVIDIA is committed to innovation and excellence, and this role is an opportunity to be part of a team that is shaping the future of AI and HPC. With a strong focus on customer needs and satisfaction, you will be expected to provide feedback to internal teams and drive continuous improvement.

What You Will Do

  • Build and optimize AI/HPC infrastructure for new and existing customers
  • Support operational and reliability aspects of large-scale AI clusters
  • Engage in the entire lifecycle of services, from inception and design to deployment, operation, and refinement
  • Maintain services once they are live by measuring and monitoring availability, latency, and overall system health
  • Provide feedback to internal teams, including opening bugs, documenting workarounds, and suggesting improvements
  • Collaborate with customers, partners, and internal teams to analyze, define, and implement large-scale networking projects
  • Develop and implement automated network provisioning solutions using tools like Ansible, Salt, and Python
  • Design and develop CI/CD pipelines for network operations
  • Stay up-to-date with the latest advancements in AI, HPC, and networking technologies

What We Are Looking For

  • BS/MS/PhD or equivalent experience in Computer Science, Electrical/Computer Engineering, Physics, Mathematics, or related fields
  • At least 8 years of professional experience with networking fundamentals, the TCP/IP stack, and data center architecture
  • Proficiency in configuring, testing, validating, and resolving issues in LAN and InfiniBand networks
  • Advanced knowledge of the EVPN, BGP, OSPF, and VXLAN protocols
  • Hands-on experience with network switch/router platforms like Cumulus Linux, SONiC, IOS, Junos OS, and EOS
  • Extensive experience delivering automated network provisioning solutions using tools like Ansible, Salt, and Python
  • Strong focus on customer needs and satisfaction
  • Self-motivated with leadership skills to work collaboratively with customers and internal teams
  • Strong written, verbal, and listening skills in English

Nice to Have

  • Familiarity with cloud networks (AWS, GCP, Azure)
  • Linux or Networking Certifications
  • Experience with high-performance computing architectures
  • Understanding of how job schedulers (Slurm, PBS) work
  • Knowledge of cluster management technologies (bonus credit for BCM, Base Command Manager)
  • Experience with GPU (Graphics Processing Unit) focused hardware/software

Benefits and Perks

  • Competitive salary and benefits package
  • Opportunity to work with cutting-edge AI and HPC technologies
  • Collaborative and dynamic work environment
  • Professional development and growth opportunities
  • Flexible working hours and remote work options
  • Access to NVIDIA's state-of-the-art facilities and resources
  • Recognition and rewards for outstanding performance

How to Stand Out

  • Showcase your experience with InfiniBand and Ethernet networking technologies in your resume and cover letter.
  • Be prepared to discuss your understanding of AI and HPC solutions and how they relate to NVIDIA's products and services.
  • Highlight your ability to work collaboratively with customers and internal teams to drive project success.
  • Emphasize your proficiency in automated network provisioning tools like Ansible, Salt, and Python.
  • Demonstrate your knowledge of cloud networks and high-performance computing architectures.
  • Be ready to provide specific examples of your experience with network switch/router platforms and CI/CD pipelines.
  • Research NVIDIA's company culture and values to understand its commitment to innovation and excellence.

This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.