Senior DGX Cloud AI Infrastructure Software Engineer

NVIDIA Corporation

7/18/2025

US, CA, Santa Clara

Full-time

Salary: $224,000 - $425,500 per year


Job Description

Join NVIDIA's DGX Cloud AI Efficiency Team to help optimize the efficiency and resiliency of AI workloads and to develop scalable AI and data infrastructure tools and services.

Requirements

  • 12+ years of experience developing software infrastructure for large-scale AI systems.
  • Bachelor's degree or higher in Computer Science or a related technical field (or equivalent experience).
  • Strong debugging skills and experience in analyzing and triaging AI applications from the application level to the hardware level.
  • Proven track record in building and scaling large-scale distributed systems.
  • Experience with AI training, inference, and data infrastructure services.
  • Familiarity with operating large-scale observability platforms for monitoring and logging (e.g., ELK, Prometheus, Loki).
  • Proficiency in programming languages such as Python and C/C++, as well as scripting languages.
  • Excellent communication and collaboration skills.

Responsibilities

  • Develop infrastructure software and tools for large-scale AI, LLM, and GenAI infrastructure.
  • Develop and optimize tools to improve infrastructure efficiency and resiliency.
  • Root-cause, analyze, and triage failures from the application level to the hardware level.
  • Enhance infrastructure and products underpinning NVIDIA's AI platforms.
  • Co-design and implement APIs for integration with NVIDIA's resiliency stacks.
  • Define meaningful and actionable reliability metrics to track and improve system and service reliability.
  • Apply strong problem-solving, root cause analysis, and optimization skills.

Benefits

  • Multiple relocation packages
  • Two weeklong shutdowns (mid-summer and year-end) in the US (in addition to PTO)
  • 8-week parental leave
  • 9 Employee Resource Groups
  • Annual bonus offering
  • Flexible work arrangements
  • Up to 6% 401K matching