Senior AI Infrastructure Software Engineer - DGX Cloud
NVIDIA
Location
USA (Multiple Locations)
Type
Full-time
Posted
5/13/2026
Compensation
$184,000 - $356,500 per year
Undergraduate with 5+ Years of Experience
Approval 99.2% · Filings 1,781 · New hires 873 · 👑 Elite Sponsor · FY 2025
Job description
As a senior AI infrastructure software engineer at NVIDIA, you will join the DGX Cloud Lepton Team, contributing to the development of a leading AI/ML platform that enhances productivity and optimizes AI workloads. The role involves designing, building, and maintaining AI platforms for large-scale training and inferencing. You will work in a dynamic environment that values learning, growth, and innovation. This position offers the opportunity to impact the future of AI while collaborating with a supportive team.
Requirements
- 8+ years of experience developing software infrastructure for large-scale AI systems.
- Bachelor's degree or higher in Computer Science or a related technical field.
- Strong debugging skills and experience in analyzing and triaging AI applications from the application level to the hardware level.
- Proven track record in building and scaling large-scale distributed systems.
- Experience with AI training and inferencing and data infrastructure services.
- Familiarity with Kubernetes and operating large-scale observability platforms for monitoring and logging.
- Proficiency in Python, C/C++, and scripting languages.
- Excellent communication and collaboration skills.
Responsibilities
- Develop platforms and tools for large-scale AI, LLM, and GenAI infrastructure.
- Develop and optimize tools to improve AI/ML workload efficiency and resiliency.
- Root-cause, analyze, and triage failures from the application level to the hardware level.
- Enhance infrastructure and products underpinning NVIDIA's AI platforms.
- Co-design and implement APIs for integration with NVIDIA's resiliency stacks on the platform.
- Define meaningful and actionable reliability metrics to track and improve system and service reliability.
Benefits
- Employees at NVIDIA are often offered comprehensive, day-one benefits—including medical, dental, and vision coverage with HSA support, life and disability insurance, an Employee Assistance Program, and a 401(k) with auto-enrollment. Many roles also have generous time off and holidays, donation matching (up to $10,000), and a wide menu of extras like FSAs, commuter benefits, legal and identity-theft protection, pet insurance, and wellness discounts. Optional programs can include student-loan and home-purchase support, plus family care resources and expert medical services.
