Principal / Senior GPU Software Performance Engineer — Post‑Training

AMD
San Jose, CA | Full-time | Posted 3/17/2026 | $226.4k - $339.6k per year
Master's Entry-Level
Approval 98.6% | Total filings 728 | New hires 184
Established Sponsor
FY 2025

Job Description

The Principal/Senior GPU Software Performance Engineer at AMD will focus on enhancing the performance of post-training workloads on AMD Instinct GPUs. The role involves optimizing various components of training pipelines, collaborating with multiple teams, and ensuring reproducibility and efficiency in deep learning processes.

Requirements

  • Proven experience in GPU performance engineering for deep learning (ROCm/HIP, Triton, or similar)
  • Hands-on experience with supervised fine-tuning (SFT), LoRA, and RL-based training at scale
  • Strong PyTorch experience (torch.distributed, FSDP/ZeRO or equivalent)
  • Proficient in Python and C++; comfortable reading/writing kernels when needed
  • Experience with distributed systems and collective communication libraries
  • Track record of turning profiles into fixes, upstreaming changes, and documenting results

Responsibilities

  • Lead performance work for fine-tuning and RL training solutions on AMD GPUs
  • Improve throughput, memory efficiency, and stability across data, model, and optimizer steps
  • Optimize multi-GPU/multi-node training and communication patterns
  • Contribute efficient kernels/ops and targeted graph-level optimizations
  • Profile, diagnose, and resolve bottlenecks using standard tooling; prevent regressions in CI
  • Ship reproducible pipelines and documentation adopted by internal teams and external developers
  • Collaborate with framework, compiler, and model teams to land durable improvements

Benefits

  • AMD provides a competitive 'Total Rewards' package that focuses on financial growth, health, and work-life balance.
