Senior Software Development Engineer – SGLang and Inference Stack

AMD

Location

Santa Clara, CA

Type

Full-time

Posted

5/5/2026

Compensation

USD $178,500.00/Yr. – USD $255,000.00/Yr.

Undergraduate with 5+ Years of Experience
Approval 98.6% · Filings 728 · New hires 184 · Established Sponsor · FY 2025

Job description

The role involves optimizing and developing deep learning frameworks specifically for AMD GPUs. As a core team member, you will enhance GPU kernel performance and accelerate deep learning models, while also enabling reinforcement learning training and state-of-the-art large language model inference at scale. Collaboration with internal GPU software teams and open-source communities is essential to integrate and optimize cutting-edge compiler technologies. The position requires a skilled engineer with strong technical expertise in GPGPU C++ and related technologies.

Requirements

  • Strong technical and analytical expertise in GPGPU C++, Triton, TileLang or DSL development within Linux environments.
  • Proficient in C++ and/or Python with demonstrated ability to code, debug, profile, and optimize performance-critical code.
  • Hands-on experience with SGLang or similar LLM inference frameworks is highly preferred.
  • Background in compiler design or familiarity with technologies like LLVM, MLIR, or ROCm is a plus.
  • Experience running and scaling workloads on large-scale, heterogeneous clusters using distributed training or inference strategies.

Responsibilities

  • Enhance performance of frameworks like TensorFlow, PyTorch, and SGLang on AMD GPUs via upstream contributions in open-source repositories.
  • Profile, analyze, modify, and tune large-scale training and inference models for optimal performance on AMD hardware.
  • Design, implement, and optimize high-performance GPU kernels using HIP, Triton, TileLang or other DSLs for AI operator efficiency.
  • Work closely with internal compiler and GPU math library teams to integrate and tune kernel-level optimizations and align them with full-stack performance goals.
  • Support optimization, feature development, and scaling of the SGLang framework across AMD GPU platforms for LLM serving, multimodal serving, and RL training.

Benefits

  • AMD provides a competitive 'Total Rewards' package that focuses on financial growth, health, and work-life balance.
