Engineering Manager - Rack Scale AI Systems

NVIDIA Corporation

8/3/2025

US, CA, Santa Clara

Full-time

Salary: $168,000 - $270,250 per year


Job Description

NVIDIA is seeking an Engineering Manager to lead its Cloud Platform Team, which focuses on Rack Scale AI Systems within the IPP organization.

Requirements

  • Bachelor's or Master's Degree in Computer Science or Software Engineering, or equivalent experience
  • 5+ years of management experience in large, cross-matrix, and geo-dispersed technology organizations focused on the server and data center space
  • Strong technical skills and understanding of embedded systems, orchestration & automation systems, data centers and cloud architecture
  • Deep understanding of cloud design in the areas of virtualization and global infrastructure, distributed systems, load balancing, and security
  • Excellent judgment in identifying risks and developing robust mitigations
  • Strong collaborative and interpersonal skills
  • Experience in large-scale QA environments, high-performance or large-scale computing environments, parallel computing, or CUDA
  • Special skills in large-scale computing and cluster computing (MPI), data center design, and converged and hyper-converged hardware and servers
  • Strong background in Windows & Linux administration

Responsibilities

  • Build and lead an engineering organization focused on Rack Scale systems onboarding and bring-up execution, along with external and internal partner engagement
  • Define, prioritize, and implement features, infrastructure, processes, and workflows in collaboration with Engineering, Product Management, and Customer Program Management teams
  • Collaborate with multi-functional teams to deliver a reliable and robust platform from concept to prototype to deployment
  • Identify potential weaknesses in the current process and offer ideas for improvements
  • Drive overall quality of deployments and improve time to market for next-gen products
  • Lead the on-ground team in collecting data on SOL deployments, physical touch information, and patterns of failure
  • Drive overall triage and recovery execution during product bring-up and maintain support through the product sustaining phase

Benefits

  • Multiple relocation packages
  • Two weeklong shutdowns (mid-summer and year-end) in the US (in addition to PTO)
  • 8-week parental leave
  • 9 Employee Resource Groups
  • Annual bonus offering
  • Flexible work arrangements
  • Up to 6% 401(k) matching

© 2024 H1BConnect. All rights reserved.
