Senior Inference Engineer, AIConfigurator for Dynamo

NVIDIA

Location

Remote, Santa Clara, CA

Type

Full-time

Posted

6/13/2026

Compensation

$184,000 - $356,500 per year

Undergraduate with 5+ Years of Experience

Approval 99.2% · Filings 1,781 · New hires 873 · 👑 Elite Sponsor · FY 2025

Job description

NVIDIA is seeking a Senior Inference Engineer to enhance AIConfigurator, a system designed for optimizing deployment configurations for large-scale LLM inference. This role involves integrating GPU systems and model serving while focusing on performance modeling and production software engineering. The engineer will work closely with various teams to translate performance data into actionable deployment strategies. This position is ideal for someone who enjoys managing complex technical systems and delivering practical solutions for developers and customers.

Requirements

BS, MS, or PhD in Computer Science, Computer Engineering, Electrical Engineering, Applied Math, or a related field, or equivalent experience.
10+ years of relevant software engineering experience.
Strong Python/Rust engineering skills, including production APIs, CLI tools, packaging, testing, debugging, and maintainable software development.
Experience with GPU computing, distributed systems, ML infrastructure, or high-performance model serving.
Understanding of LLM inference concepts such as batching, latency, efficiency, memory constraints, parallelism strategies, and serving SLAs.
Experience working with data-driven performance analysis, benchmarking, simulation, optimization, or managing resource needs.
Ability to collaborate across research, runtime, platform, and customer-facing engineering teams.
Strong written and verbal communication skills, with the ability to explain sophisticated technical tradeoffs clearly.

Responsibilities

Build and evolve AIConfigurator's core optimization engine for LLM serving.
Develop production-quality Python/Rust APIs, CLIs, SDK surfaces, and web workflows.
Create configuration generation systems that produce backend-specific artifacts for various deployments.
Collaborate with inference runtime, performance, benchmarking, and product groups to ensure accurate deployment performance.
Integrate performance databases, profiling data, support matrices, and validation tools to improve model and hardware support.
Drive software quality through maintainable architecture, schema development, tests, documentation, and automation.
Convert intricate inference ideas into reliable software abstractions.

Benefits

Employees at NVIDIA are often offered comprehensive, day-one benefits, including medical, dental, and vision coverage with HSA support, life and disability insurance, an Employee Assistance Program, and a 401(k) with auto-enrollment. Many roles also have generous time off and holidays, donation matching (up to $10,000), and a wide menu of extras like FSAs, commuter benefits, legal and identity-theft protection, pet insurance, and wellness discounts. Optional programs can include student-loan and home-purchase support, plus family care resources and expert medical services.
