
HPC Systems Engineer

KLA

Location

Milpitas, CA

Type

Full-time

Posted

6/22/2026

Compensation

$159,500 - $271,200 per year

PhD with 5+ Years of Experience
Master's with 5+ Years of Experience
Undergraduate with 5+ Years of Experience
Approval 97.8% · Filings 803 · New hires 321 · 💎 Strong Sponsor · FY 2025

Job description

In this role, you will lead the architecture, deployment, and operational support of a high-performance computing (HPC) cluster platform used across IC fabrication facilities and mask shops globally. You will collaborate with engineering stakeholders to gather requirements and design scalable solutions. This position requires a strong balance of systems architecture, hands-on engineering, and operational excellence in complex HPC environments. As part of the MACH team at KLA, you will be instrumental in solving complex technical problems in the semiconductor industry.

Requirements

  • Deep expertise in Linux operating systems such as SUSE, Red Hat, Rocky Linux, and Ubuntu.
  • Strong experience architecting and maintaining robust storage systems.
  • Solid understanding of HPC hardware ecosystems, including servers, GPUs, networking, storage, schedulers, BIOS, and BMC.
  • Experience with virtualization technologies such as VMware, Proxmox, or XCP-ng.
  • Strong understanding of TCP/IP fundamentals and network protocols including DNS, DHCP, HTTP, LDAP, and SMTP.
  • Experience with file sharing technologies like NFS and CIFS.
  • Proficiency in scripting and development using Shell and Python.
  • Experience with configuration management tools such as Ansible, Salt, Chef, or Puppet.
  • Experience with HPC schedulers like SGE or SLURM.
  • Minimum qualifications include a Doctorate degree with 3+ years of experience, a Master's degree with 6+ years, or a Bachelor's degree with 8+ years of related work experience.

Responsibilities

  • Design and architect scalable, high-performance HPC cluster solutions for global manufacturing environments.
  • Lead deployment, configuration, and lifecycle management of cluster infrastructure.
  • Collaborate with developers and cross-functional teams to understand requirements and translate them into technical solutions.
  • Drive solutions from design through production, including implementation, validation, and support.
  • Ensure system reliability, performance, and availability across compute, storage, and networking layers.
  • Support ongoing operations, troubleshooting, and continuous improvement of HPC systems.
  • Contribute to automation, standardization, and DevOps best practices across the platform.

Benefits

  • Employees at KLA are often offered competitive pay with bonuses, a 401(k) match, an employee stock purchase program, and financial perks like student-debt assistance, planning support, and group insurance discounts.
  • Health and lifestyle benefits typically include medical/dental/vision, life and other voluntary coverages, paid time off and holidays, family leave, backup care, wellness rewards, gym discounts, and community-volunteering opportunities.
  • Employees also get strong growth support through tuition reimbursement, KLA’s corporate learning center, education awards, and engineering certification programs.
