Systems Quality and Reliability Engineer - LPU

NVIDIA

Location

Santa Clara, CA

Type

Full-time

Posted

5/27/2026

Compensation

$136,000 - $264,500 per year

Undergraduate with 2+ Years of Experience
Approval 99.2% · Filings 1,781 · New hires 873 · Elite Sponsor · FY 2025

Job description

The Systems Quality and Reliability Engineer will join NVIDIA's LPU team, focusing on RMA (return merchandise authorization) and FA (failure analysis) debug and root-cause analysis for AI/ML products. The role involves collaborating with other engineering teams to improve product reliability and quality. The engineer will analyze data trends and manage operational performance to ensure quality standards are met. The position is integral to maintaining NVIDIA's commitment to innovation and excellence in technology.

Requirements

  • BS/MS in Electrical Engineering, Physics, or a related degree or equivalent experience
  • 5+ years of hands-on systems test and/or validation engineering experience
  • Proven hands-on experience in systems quality and reliability engineering
  • Competence using lab equipment such as oscilloscopes, logic analyzers, and power analyzers
  • Experience with reliability tests such as HTOL and quality tests such as burn-in
  • Working knowledge of FA techniques and tools such as FIB, SEM, TDR, VNA, and CSAM
  • Strong knowledge of fault isolation techniques such as OBIRCH, DLS/LADA, LVP, and LVI
  • Proficiency with high-speed interfaces like SerDes, PCIe, and DDR
  • Proficiency in programming languages such as Python, Perl, and C++ on UNIX/Linux
  • Excellent knowledge of PCB card and system-level test and debug

Responsibilities

  • Own, build, and manage the RMA and FA debug and root-cause analysis for NVIDIA AI/ML products
  • Conduct and lead debug and root-cause analysis of field RMAs
  • Collaborate with systems, hardware, software, and operations engineers as required
  • Scale root-cause FA capabilities within the organization
  • Create FA result reports that align with standard 8D or similar processes
  • Analyze RMA, FA, and repair data to identify trends and raise quality alerts when necessary
  • Drive resolution, containment, and mitigation plans for quality alerts
  • Oversee hardware quality performance by monitoring field quality data and associated metrics
  • Manage operational performance of FA at contract manufacturers to achieve key performance indicators
  • Oversee the setup of new products into Failure Analysis operations

Benefits

  • Employees at NVIDIA are often offered comprehensive, day-one benefits, including medical, dental, and vision coverage with HSA support, life and disability insurance, an Employee Assistance Program, and a 401(k) with auto-enrollment. Many roles also have generous time off and holidays, donation matching (up to $10,000), and a wide menu of extras like FSAs, commuter benefits, legal and identity-theft protection, pet insurance, and wellness discounts. Optional programs can include student-loan and home-purchase support, plus family care resources and expert medical services.
