Senior Reliability Engineer - AV Labs
Job description
The Senior Reliability Engineer role at Uber focuses on improving the reliability of sensor and hardware systems across a large, distributed fleet. The position emphasizes maximizing sensor uptime and data yield while addressing issues caused by factors such as failing sensors and environmental conditions. The engineer will build observability systems and automate responses to failures to keep operations efficient. This senior role requires strong software engineering skills and the ability to lead technical strategy across teams.
Requirements
- 5+ years of relevant industry experience in software engineering, site reliability, or systems engineering.
- Experience with modern observability platforms in edge, IoT, or hardware-integrated environments.
- Proficiency in coding with Go, Python, or C++ and experience building production systems.
- Strong understanding of Linux internals and shell scripting for debugging edge devices.
- Proven track record in owning reliability, infrastructure, or platform systems for large-scale production workloads.
- Experience designing and operating observability systems including metrics, logging, and alerting.
- Ability to define and implement SLIs and SLOs for system availability or data yield.
- Deep understanding of networking protocols and data handling in bandwidth-constrained environments.
- Experience driving complex technical projects and architectural reviews across multiple teams.
Responsibilities
- Design and scale an observability platform for real-time health telemetry from vehicle nodes.
- Develop systems that perform well despite hardware diversity and intermittent connectivity.
- Establish alerting strategies to differentiate between transient anomalies and systemic issues.
- Design detection logic for silent failures such as sensor degradation or recording pipeline stalls.
- Create automated detection, triage, and mitigation mechanisms to reduce manual intervention.
- Collaborate with Operations and Engineering to build automated responses to hardware and software failures.
- Build technical interfaces to help Operations surface issues and Engineering diagnose problems quickly.
- Drive reliability-focused design reviews and translate operational pain points into technical requirements.
- Apply advanced data analytics to identify patterns in fleet telemetry for proactive detection of issues.
Benefits
- Employees at Uber are often offered comprehensive health, life, disability, and mental wellness benefits, along with wellbeing stipends, travel medical coverage, and monthly Uber credits for Rides and Eats. Employees also get generous paid parental leave, flexible time off, and family-planning support so they can care for themselves and their families at every stage.
