ML Kernel Performance Engineer, Edge AI and Science

Amazon

Location

Sunnyvale, CA

Type

Full-time

Posted

6/24/2026

Compensation

$165,200 - $223,600 per year

Undergraduate with 2+ Years of Experience
Elite Sponsor (FY 2025): Approval 98.6% · Filings 19,451 · New hires 10,113

Job description

The ML Kernel Performance Engineer will work at the intersection of hardware and software to optimize CUDA and Triton kernels for Amazon's neural network compression platform. This role focuses on enhancing the performance of compression algorithms during training, fine-tuning, and inference. The engineer will collaborate with scientists and platform engineers to ensure efficient execution of novel quantization schemes and sparse computation patterns. The position is part of a small, agile team dedicated to advancing edge AI capabilities for Amazon's consumer products.

Requirements

  • 3+ years of non-internship professional software development experience
  • 2+ years of non-internship design or architecture experience
  • Knowledge of Python and/or C++ programming
  • Experience with CUDA kernels or ML/low-level kernels
  • Bachelor's degree in computer science or equivalent
  • 3+ years of full software development life cycle experience
  • Experience with GPU kernel optimization and GPGPU computing
  • Proficiency in low-level performance optimization for GPUs
  • Understanding of GPU memory hierarchies and optimization strategies
  • Experience developing high-performance libraries for ML or HPC applications
  • Knowledge of ML frameworks and their GPU backends
  • Experience implementing custom PyTorch operators
  • Experience with parallel programming and optimization techniques
  • Background in neural network compression
  • Knowledge of mixed-precision training and inference
  • Experience with inference optimization
  • Familiarity with Transformer architectures and their compute/memory profiles
  • Experience with AWS Trainium/Inferentia or the Neuron Kernel Interface
  • Experience with edge deployment, model compilation, or hardware-aware optimization

Responsibilities

  • Design and implement high-performance CUDA and Triton kernels for quantization-aware training and low-bit inference.
  • Analyze and optimize kernel-level performance for compression training workloads.
  • Implement kernel-level optimizations such as operator fusion and memory access pattern optimization.
  • Build a kernel development harness that enables team members to profile kernel performance.
  • Maintain and extend the team's training kernels library with clean interfaces and examples.
  • Collaborate closely with Applied Scientists and hardware architects to co-design ML-centric solutions.
  • Develop inference kernels for cloud deployment that optimize memory usage.
  • Build and maintain performance regression tests and benchmarking infrastructure.

Benefits

  • Employees at Amazon are often offered comprehensive health benefits, including multiple medical plan options (no pre-existing condition exclusions, 100% covered in-network preventive care), dental and vision plans, a 24/7 medical advice line from day one, expert second-opinion services, and broad mental-health support with several free counseling sessions (including pediatric). Financial wellness typically includes a 401(k) with company match (up to 2%), Restricted Stock Units (equity), FSAs, an emergency savings program, product and partner discounts, and college-savings and home-purchase programs. Overall, the package is designed to support the health, finances, and day-to-day life of employees and their families.
