Senior Software Development Engineer – SGLang and Inference Stack
AMD
Location
Santa Clara, CA
Type
Full-time
Posted
5/5/2026
Compensation
USD $178,500.00/Yr. – USD $255,000.00/Yr.
Undergraduate with 5+ Years of Experience
Approval 98.6% · Filings 728 · New hires 184 · Established Sponsor · FY 2025
Job description
The role involves optimizing and developing deep learning frameworks specifically for AMD GPUs. As a core team member, you will enhance GPU kernel performance and accelerate deep learning models, while also enabling reinforcement learning training and state-of-the-art large language model inference at scale. Collaboration with internal GPU software teams and open-source communities is essential to integrate and optimize cutting-edge compiler technologies. The position requires a skilled engineer with strong technical expertise in GPGPU C++ and related technologies.
Requirements
- Strong technical and analytical expertise in GPGPU C++, Triton, TileLang or DSL development within Linux environments.
- Proficient in C++ and/or Python with demonstrated ability to code, debug, profile, and optimize performance-critical code.
- Hands-on experience with SGLang or similar LLM inference frameworks is highly preferred.
- Background in compiler design or familiarity with technologies like LLVM, MLIR, or ROCm is a plus.
- Experience running and scaling workloads on large-scale, heterogeneous clusters using distributed training or inference strategies.
Responsibilities
- Enhance performance of frameworks like TensorFlow, PyTorch, and SGLang on AMD GPUs via upstream contributions in open-source repositories.
- Profile, analyze, modify, and tune large-scale training and inference models for optimal performance on AMD hardware.
- Design, implement, and optimize high-performance GPU kernels using HIP, Triton, TileLang or other DSLs for AI operator efficiency.
- Work closely with internal compiler and GPU math library teams to integrate kernel-level optimizations and align them with full-stack performance goals.
- Support optimization, feature development, and scaling of the SGLang framework across AMD GPU platforms for LLM and multimodal serving and for RL training.
Benefits
- AMD provides a competitive 'Total Rewards' package that focuses on financial growth, health, and work-life balance.
