HPC Systems Engineer
Location
Milpitas, CA
Type
Full-time
Posted
6/22/2026
Compensation
$159,500 - $271,200 per year
PhD with 5+ Years of Experience
Master's with 5+ Years of Experience
Undergraduate with 5+ Years of Experience
Approval 97.8% · Filings 803 · New hires 321 · 💎 Strong Sponsor · FY 2025
Job description
In this role, you will lead the architecture, deployment, and operational support of a high-performance computing (HPC) cluster platform used across IC fabrication facilities and mask shops globally. You will collaborate with engineering stakeholders to gather requirements and design scalable solutions. This position requires a strong balance of systems architecture, hands-on engineering, and operational excellence in complex HPC environments. As part of the MACH team at KLA, you will be instrumental in solving complex technical problems in the semiconductor industry.
Requirements
- Deep expertise in Linux operating systems such as SUSE, Red Hat, Rocky Linux, and Ubuntu.
- Strong experience architecting and maintaining robust storage systems.
- Solid understanding of HPC hardware ecosystems, including servers, GPUs, networking, storage, schedulers, BIOS, and BMC.
- Experience with virtualization technologies such as VMware, Proxmox, or XCP-ng.
- Strong understanding of TCP/IP fundamentals and network protocols including DNS, DHCP, HTTP, LDAP, and SMTP.
- Experience with file sharing technologies like NFS and CIFS.
- Proficiency in scripting and development using Shell and Python.
- Experience with configuration management tools such as Ansible, Salt, Chef, or Puppet.
- Experience with HPC schedulers like SGE or SLURM.
- Minimum qualifications include a Doctorate degree with 3+ years of experience, a Master's degree with 6+ years, or a Bachelor's degree with 8+ years of related work experience.
Responsibilities
- Design and architect scalable, high-performance HPC cluster solutions for global manufacturing environments.
- Lead deployment, configuration, and lifecycle management of cluster infrastructure.
- Collaborate with developers and cross-functional teams to understand requirements and translate them into technical solutions.
- Drive solutions from design through production, including implementation, validation, and support.
- Ensure system reliability, performance, and availability across compute, storage, and networking layers.
- Support ongoing operations, troubleshooting, and continuous improvement of HPC systems.
- Contribute to automation, standardization, and DevOps best practices across the platform.
Benefits
- Employees at KLA are often offered competitive pay with bonuses, a 401(k) match, an employee stock purchase program, and financial perks like student-debt assistance, planning support, and group insurance discounts. Health and lifestyle benefits typically include medical/dental/vision, life and other voluntary coverages, paid time off and holidays, family leave, backup care, wellness rewards, gym discounts, and community-volunteering opportunities. Employees also get strong growth support through tuition reimbursement, KLA’s corporate learning center, education awards, and engineering certification programs.
