Principal AI Inference Systems Engineer
Job description
AMD is seeking a Senior Staff AI Infra Engineer to enhance the performance of AI/ML workloads and GPU-accelerated computing. The role is part of the Llama team, which focuses on generative AI and making advanced knowledge accessible to developers. The ideal candidate will lead technical initiatives and work at the intersection of hardware and software to optimize next-generation AI applications. This position requires strong leadership skills and a passion for software engineering.
Requirements
- 5+ years of experience in AI/ML infrastructure, distributed systems, or performance-critical software development.
- Expert-level proficiency in C/C++ and Python.
- Solid understanding of transformer-based architectures and distributed training frameworks such as Megatron-LM, DeepSpeed, and PyTorch Distributed.
- Proven experience optimizing LLM training and inference pipelines, including TP/PP/DP/ZeRO parallelism, quantization, and mixed-precision techniques.
- Hands-on experience designing, building, and scaling training or inference platforms using Kubernetes, Ray, or Kubeflow.
- Familiarity with GPU architecture and distributed communication libraries (e.g., NCCL, RCCL, MPI).
- Experience with profiling and performance-analysis tools for GPU optimization and system-level debugging.
Responsibilities
- Lead technical initiatives and provide architectural guidance for AI/ML infrastructure and performance optimization.
- Optimize and accelerate LLM training and inference on AMD GPUs, improving kernel, communication, and end-to-end system efficiency.
- Develop and enhance infrastructure supporting LLMs, Agentic AI, and RAG systems.
- Design, build, and optimize AI workloads on GPU clusters, including large-scale training and inference orchestration.
- Debug and resolve complex system-level performance issues across GPU, network, and runtime layers.
- Drive technical excellence, foster cross-team collaboration, and champion innovation within the organization.
Benefits
- AMD provides a competitive 'Total Rewards' package that focuses on financial growth, health, and work-life balance.
