ML Kernel Performance Engineer, Edge AI and Science

Amazon

Location

Sunnyvale, CA

Type

Full-time

Posted

6/24/2026

Compensation

$165,200 - $223,600 per year

Undergraduate with 2+ Years of Experience

Approval 98.6% · Filings 19,451 · New hires 10,113 · Elite Sponsor · FY 2025

Job description

The ML Kernel Performance Engineer will work at the intersection of hardware and software to optimize CUDA and Triton kernels for Amazon's neural network compression platform. This role focuses on enhancing the performance of compression algorithms during training, fine-tuning, and inference. The engineer will collaborate with scientists and platform engineers to ensure efficient execution of novel quantization schemes and sparse computation patterns. The position is part of a small, agile team dedicated to advancing edge AI capabilities for Amazon's consumer products.

Requirements

3+ years of non-internship professional software development experience
2+ years of non-internship design or architecture experience
Knowledge of Python and/or C++ programming
Experience with CUDA kernels or ML/low-level kernels
Bachelor's degree in computer science or equivalent
3+ years of full software development life cycle experience
Experience with GPU kernel optimization and GPGPU computing
Proficiency in low-level performance optimization for GPUs
Understanding of GPU memory hierarchies and optimization strategies
Experience developing high-performance libraries for ML or HPC applications
Knowledge of ML frameworks and their GPU backends
Experience implementing custom PyTorch operators
Experience with parallel programming and optimization techniques
Background in neural network compression
Knowledge of mixed-precision training and inference
Experience with inference optimization
Familiarity with Transformer architectures and their compute/memory profiles
Experience with AWS Trainium/Inferentia or the Neuron Kernel Interface
Experience with edge deployment, model compilation, or hardware-aware optimization

Responsibilities

Design and implement high-performance CUDA and Triton kernels for quantization-aware training and low-bit inference.
Analyze and optimize kernel-level performance for compression training workloads.
Implement kernel-level optimizations such as operator fusion and memory access pattern optimization.
Build a kernel development harness that enables team members to profile kernel performance.
Maintain and extend the team's training kernels library with clean interfaces and examples.
Collaborate closely with Applied Scientists and hardware architects to co-design ML-centric solutions.
Develop inference kernels for cloud deployment that optimize memory usage.
Build and maintain performance regression tests and benchmarking infrastructure.

Benefits

Employees at Amazon are often offered comprehensive health benefits, including multiple medical plan options (no pre-existing condition exclusions, 100% covered in-network preventive care), dental and vision plans, a 24/7 medical advice line from day one, expert second-opinion services, and broad mental-health support with several free counseling sessions (including pediatric). Financial wellness typically includes a 401(k) with company match (up to 2%), Restricted Stock Units (equity), FSAs, an emergency savings program, product and partner discounts, and college-savings and home-purchase programs. Overall, the package is designed to support the health, finances, and day-to-day life of employees and their families.
