Senior On-Device Model Inference Optimization Engineer

NVIDIA

10/15/2025

Santa Clara, CA

Full-time

Salary: $184,000 - $356,500 a year

Job Description

Join NVIDIA as a Senior On-Device Model Inference Optimization Engineer to lead efforts to improve AI model performance for autonomous vehicle technology.

Requirements

MSc or PhD in Computer Science, Engineering, or related field
Over 10 years of experience in model inference and optimization
Expertise in modern machine learning frameworks and tools such as PyTorch, ONNX, and TensorRT
Strong programming skills in CUDA, Python, and C++
In-depth knowledge of optimization techniques and neural architecture search
Experience in building and deploying scalable cloud-based inference systems

Responsibilities

Develop and implement strategies to optimize AI model inference for on-device deployment
Collaborate with teams to align optimization efforts with hardware capabilities
Benchmark inference performance and implement solutions to the bottlenecks identified
Adapt models for diverse hardware platforms and operating systems
Create tools to validate accuracy and latency of deployed models
Recommend and implement model architecture changes

Benefits

Employees at NVIDIA are often offered comprehensive, day-one benefits, including medical, dental, and vision coverage with HSA support, life and disability insurance, an Employee Assistance Program, and a 401(k) with auto-enrollment. Many roles also offer generous time off and holidays, donation matching (up to $10,000), and a wide menu of extras such as FSAs, commuter benefits, legal and identity-theft protection, pet insurance, and wellness discounts. Optional programs can include student-loan and home-purchase support, plus family care resources and expert medical services.