Senior Systems Software Engineer, Kubernetes Node Lifecycle - DGX Cloud

NVIDIA

Location

Santa Clara, CA; Seattle, WA

Type

Full-time

Posted

6/11/2026

Compensation

$184,000 - $356,500 per year

Undergraduate with 5+ Years of Experience
Approval 99.2% · Filings 1,781 · New hires 873 · 👑 Elite Sponsor · FY 2025

Job description

The Senior Systems Software Engineer at NVIDIA will focus on Kubernetes node engineering, OS image packaging, and cloud infrastructure. This role is essential for managing the node layer within NVIDIA Kubernetes Engine (NKE) to ensure it scales effectively for DGX Cloud's objectives. The engineer will work with a team dedicated to addressing global challenges through advanced technology. The position requires deep technical expertise and a commitment to innovation in AI workloads.

Requirements

  • 8 years of experience with a background in systems software, cloud infrastructure, or Kubernetes node engineering.
  • Bachelor’s or Master’s degree in Engineering (Electrical, Computer Engineering, Computer Science) or equivalent experience.
  • Deep expertise in Cluster API (CAPI), including provider development and full machine lifecycle from provisioning to deletion.
  • Extensive experience with OS image build pipelines, node image packaging, and delivery systems for Kubernetes nodes.
  • Practical experience with bring-your-own-node models and integrating diverse hardware into live Kubernetes environments.
  • Strong understanding of kubelet configuration, node bootstrap, and the Kubernetes node registration lifecycle.
  • Experience with node image security, including vulnerability scanning, patch automation, and compliance gating.
  • Proficiency in Golang and/or Python, and hands-on experience with at least one major public cloud provider.

Responsibilities

  • Direct the building and refinement of CAPI providers for NVIDIA Kubernetes Engine.
  • Develop and maintain bring-your-own-node workflows for customer integration of NVIDIA hardware into NKE clusters.
  • Coordinate OS image generation, packaging, deployment, and update processes for NKE nodes.
  • Develop and sustain node image hardening pipelines incorporating security benchmarks and automated remediation.
  • Develop and maintain automated test suites for node images to verify accuracy across Kubernetes versions.
  • Handle nodepool lifecycle at scale, including provisioning, upgrades, and seamless node replacement.
  • Examine and resolve underlying causes of node-layer faults in production NKE clusters.
  • Partner with upstream communities to establish node provisioning and lifecycle standards.

Benefits

  • Employees at NVIDIA are often offered comprehensive, day-one benefits—including medical, dental, and vision coverage with HSA support, life and disability insurance, an Employee Assistance Program, and a 401(k) with auto-enrollment. Many roles also have generous time off and holidays, donation matching (up to $10,000), and a wide menu of extras like FSAs, commuter benefits, legal and identity-theft protection, pet insurance, and wellness discounts. Optional programs can include student-loan and home-purchase support, plus family care resources and expert medical services.
