Senior Inference Engineer, AIConfigurator for Dynamo
NVIDIA
Location
Remote, Santa Clara, CA
Type
Full-time
Posted
6/13/2026
Compensation
$184,000 - $356,500 per year
Undergraduate with 5+ Years of Experience
Approval 99.2% · Filings 1,781 · New hires 873 · 👑 Elite Sponsor · FY 2025
Job description
NVIDIA is seeking a Senior Inference Engineer to enhance AIConfigurator, a system designed for optimizing deployment configurations for large-scale LLM inference. This role involves integrating GPU systems and model serving while focusing on performance modeling and production software engineering. The engineer will work closely with various teams to translate performance data into actionable deployment strategies. This position is ideal for someone who enjoys managing complex technical systems and delivering practical solutions for developers and customers.
Requirements
- BS, MS, or PhD in Computer Science, Computer Engineering, Electrical Engineering, Applied Math, or a related field, or equivalent experience.
- 10+ years of relevant software engineering experience.
- Strong Python/Rust engineering skills, including production APIs, CLI tools, packaging, testing, debugging, and maintainable software development.
- Experience with GPU computing, distributed systems, ML infrastructure, or high-performance model serving.
- Understanding of LLM inference concepts such as batching, latency, efficiency, memory constraints, parallelism strategies, and serving SLAs.
- Experience working with data-driven performance analysis, benchmarking, simulation, optimization, or managing resource needs.
- Ability to collaborate across research, runtime, platform, and customer-facing engineering teams.
- Strong written and verbal communication skills, with the ability to explain sophisticated technical tradeoffs clearly.
Responsibilities
- Build and evolve AIConfigurator's core optimization engine for LLM serving.
- Develop production-quality Python/Rust APIs, CLIs, SDK surfaces, and web workflows.
- Create configuration generation systems that produce backend-specific artifacts for various deployments.
- Collaborate with inference runtime, performance, benchmarking, and product groups to ensure accurate deployment performance.
- Integrate performance databases, profiling data, support matrices, and validation tools to improve model and hardware support.
- Drive software quality through maintainable architecture, schema development, tests, documentation, and automation.
- Convert intricate inference ideas into reliable software abstractions.
Benefits
- Employees at NVIDIA are often offered comprehensive, day-one benefits—including medical, dental, and vision coverage with HSA support, life and disability insurance, an Employee Assistance Program, and a 401(k) with auto-enrollment. Many roles also have generous time off and holidays, donation matching (up to $10,000), and a wide menu of extras like FSAs, commuter benefits, legal and identity-theft protection, pet insurance, and wellness discounts. Optional programs can include student-loan and home-purchase support, plus family care resources and expert medical services.
