Principal / Senior GPU Software Performance Engineer — Post‑Training
AMD
Location
San Jose, CA
Type
Full-time
Posted
5/5/2026
Compensation
USD $210,000.00/Yr. – USD $300,000.00/Yr.
Undergraduate with 5+ Years of Experience
Approval 98.6% · Filings 728 · New hires 184 · ✓ Established Sponsor · FY 2025
Job description
In this role at AMD, you will drive the performance of post-training workloads on AMD Instinct™ GPUs, with a focus on optimizing training pipelines for AI applications. You will work across teams, including framework, compiler, and model teams, to improve training performance and stability. The ideal candidate is passionate about software engineering and has the communication skills to resolve complex cross-stack issues. The position emphasizes innovation and collaboration aimed at measurable improvements in training efficiency.
Requirements
- Proven GPU performance engineering experience for deep learning using ROCm/HIP, Triton, or similar technologies.
- Hands-on experience with SFT, LoRA, and RL-based training at scale.
- Strong experience with PyTorch, including torch.distributed and FSDP/ZeRO or equivalent.
- Proficiency in Python and C++, with the ability to read and write kernels as needed.
- Experience with distributed systems and collective communication libraries.
- A track record of turning profiles into fixes, upstreaming changes, and documenting results.
Responsibilities
- Lead performance optimization for fine-tuning and reinforcement learning training solutions on AMD GPUs.
- Improve throughput, memory efficiency, and stability across data, model, and optimizer steps.
- Optimize multi-GPU and multi-node training and communication patterns.
- Contribute efficient kernels and targeted graph-level optimizations.
- Profile, diagnose, and resolve bottlenecks using standard tooling while preventing regressions in CI.
- Ship reproducible pipelines and documentation adopted by internal teams and external developers.
- Collaborate with framework, compiler, and model teams to implement durable improvements.
Benefits
- AMD provides a competitive 'Total Rewards' package that focuses on financial growth, health, and work-life balance.
