Engineering Manager, Machine Learning, Model Evaluations and Data Curation (AI Foundations)

Netflix
USA | Full-time | 12/10/2025
Master's with 2+ Years of Experience

Job Description

Netflix is seeking an engineering leader to build and scale a team focused on model evaluations and data curation for large language models (LLMs) and generative AI. This role involves leading AI/ML researchers and engineers to enhance personalization and discovery through rigorous evaluation methodologies and data systems.

Requirements

  • Experience building and leading high-performing teams of ML researchers and engineers.
  • Proven track record of leading machine learning initiatives from research to production, ideally involving evaluation frameworks, ML infrastructure, or data-intensive systems.
  • Strong technical expertise in LLMs, their evaluation, and practical methods for ensuring robustness, reproducibility, and quality.
  • Broad knowledge of machine learning fundamentals and evaluation methodologies, including benchmark design, model-based evaluators, and offline/online metrics.
  • Experience driving cross-functional projects, including close collaboration with AI application teams to translate product needs into evaluation frameworks.
  • Excellent written and verbal communication skills, with the ability to communicate effectively with both technical and non-technical audiences.
  • Advanced degree in Computer Science, Statistics, or a related quantitative field.

Responsibilities

  • Partner with downstream AI application teams to define shared evaluations that codify application expectations of LLMs and other foundation models.
  • Design rigorous benchmarks and evaluation methodologies across ranking & recommendations, content understanding, and language/text generation.
  • Lead the development of evaluators and strong baselines to ensure in-house LLMs and other foundation models demonstrate clear advantages over off-the-shelf alternatives.
  • Build scalable, reproducible data and evaluation systems that make dataset creation and evaluation design as nimble and experiment-friendly as model development itself.
  • Hire, grow, and nurture a world-class team, fostering an inclusive, high-performing culture that balances research innovation with engineering excellence.
  • Work closely with the teams developing Netflix’s foundation models to ensure evaluation and data insights are folded back into the cadence of model development.

Benefits

  • Employees at Netflix are often offered flexible, people-first benefits: unlimited time away, generous parental leave, global family-forming support, mental-health programs (mindfulness, free counseling/coaching), and health coverage tailored by country.
  • Financially, Netflix pays at personal top-of-market and lets employees choose their mix of cash vs. fully vested 10-year stock options, alongside donation and volunteer matching.
  • Convenience perks can include trust-based travel/expense policies, relocation support, and “Work, Not Drive” rideshare flexibility.