Site Reliability Engineer (Edge Services), Infrastructure Services

Apple

Location: USA (Multiple Locations)

Type: Full-time

Posted: 5/21/2026

Compensation: Not listed

Undergraduate with 2+ Years of Experience

Approval 98.9% · Filings 5,543 · New hires 2,691 · 👑 Elite Sponsor · FY 2025

Job description

We are looking for a proactive Site Reliability Engineer to enhance our production ecosystems by developing a sophisticated reliability framework. The role connects complex distributed systems to seamless user experiences, with a focus on automation and observability. As a member of the SRE team, you will prioritize high-cardinality data and meaningful signals to improve system resilience and scalability. Your mission will be to shift operations from reactive to proactive, ensuring performance bottlenecks are addressed before they impact customers.

Requirements

Strong understanding of Linux internals and deep networking expertise, including HTTP/2, HTTP/3, and HTTPS/TLS.
Proven ability to automate repetitive tasks and complex workflows using Python or Go.
Experience configuring and managing modern monitoring suites such as Prometheus, Grafana, or ClickHouse.
Knowledge of Data Structures and Algorithms to write efficient code and troubleshoot system bottlenecks.
Practical knowledge of SLIs, SLOs, Error Budgets, Release Management, and Incident Management.
Experience managing cloud environments like AWS, GCP, or Azure using Terraform, Ansible, or Pulumi.
Hands-on experience with Kubernetes for scaling and securing containerized workloads.
Ability to lead blameless post-mortems and use insights to improve system reliability.
Consultative skills to work with product teams on service design for long-term maintainability.
Fluency in applying Generative AI tools within SRE and software engineering workflows.

Responsibilities

Champion the evolution of production ecosystems by developing a data-driven reliability framework.
Design and implement a next-generation observability and alerting strategy.
Build self-healing systems and reduce toil through automation.
Partner with development teams to integrate reliability into the CI/CD pipeline.
Identify and mitigate performance bottlenecks proactively.
Consult with product teams to enhance service design for maintainability.
Lead blameless post-mortems to strengthen systems against future failures.
Apply Generative AI tools to improve observability and debugging workflows.

Benefits

Employees at Apple are often offered comprehensive benefits that support physical and mental well-being: flexible medical plans, confidential counseling, onsite wellness centers at major campuses, and resources for fitness and daily life. Families typically receive fertility support, paid parental leave with gradual return, caregiving leave, and dependent-care guidance, while financial perks commonly include stock grants (with purchase discounts), 401(k) matching, and income-protection coverage. Employees also receive robust time off, Apple University learning and tuition reimbursement, donation matching and paid volunteer hours, and deep product and partner discounts.
