Senior System Software Engineer – Data Center GPU Compute Diagnostics

NVIDIA

Location: Durham, NC
Type: Full-time
Posted: 5/19/2026
Compensation: $224,000 - $356,500 per year

Undergraduate with 5+ Years of Experience
Approval 99.2% · Filings 1,781 · New hires 873 · 👑 Elite Sponsor · FY 2025

Job description

We are looking for a senior system software engineer to develop next-generation data center GPU diagnostics for AI supercomputer systems. The role involves creating applications that stress GPU compute engines and related hardware components. Candidates will collaborate with hardware architecture, driver, manufacturing, and field teams and mentor other engineers, drawing on their expertise in operating systems and computer architecture. The work is fast-paced and centers on innovation in, and validation of, advanced processors.

Requirements

  • BS or MS degree in Electrical Engineering, Computer Engineering, Computer Science, or equivalent experience.
  • 12+ years of system software, GPU software, embedded software, or hardware validation experience.
  • Experience driving technical work across multiple engineers and mentoring others.
  • Strong C/C++ and Python programming skills.
  • Experience with Linux device drivers, CUDA kernels, and GPU compute workloads is strongly preferred.
  • Understanding of memory systems, ECC behavior, cache hierarchy, and hardware failure signatures.
  • Experience with voltage/frequency characterization and thermal testing.

Responsibilities

  • Work closely with hardware architecture, driver, manufacturing, and field teams throughout the product development lifecycle.
  • Craft CUDA/C++ diagnostic workloads and software infrastructure for new chip development and validation.
  • Design and implement GPU compute tests that stress Tensor Cores, SMs, and HBM.
  • Develop and tune GEMM-style diagnostic workloads and integrate higher-level AI workload tests.
  • Assess new hardware features and architect manufacturing and field diagnostic tests.
  • Debug failures involving ECC, HBM behavior, thermal limits, and PCIe/NVLink errors.

Benefits

  • Employees at NVIDIA are often offered comprehensive, day-one benefits, including medical, dental, and vision coverage with HSA support, life and disability insurance, an Employee Assistance Program, and a 401(k) with auto-enrollment. Many roles also include generous time off and holidays, donation matching (up to $10,000), and extras such as FSAs, commuter benefits, legal and identity-theft protection, pet insurance, and wellness discounts. Optional programs can include student-loan and home-purchase support, plus family care resources and expert medical services.
