Software Engineer II
Job description
The Supercomputing Software Engineer role on the AI Customer Experience team at Microsoft Azure focuses on designing and developing capabilities to monitor and operate supercomputers at scale. The team is responsible for managing flagship supercomputers that support top-tier AI customers. This position emphasizes a metrics-driven culture and aims to enhance customer satisfaction through proactive incident management and operational improvements. The engineer will create data pipelines to process telemetry and logs, ensuring efficient incident response and system performance.
Requirements
- Bachelor's Degree in Computer Science or related technical field AND 2+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python.
- Ability to meet Microsoft, customer and/or government security screening requirements.
Responsibilities
- Contribute to improving key metrics such as Job Mean Time to Interrupt, Nodes in Service, and Mean Time to Resolve on flagship supercomputers.
- Manage operations of supercomputers by responding quickly to mitigate issues.
- Implement systemic solutions and mitigations to complex issues impacting performance or functionality of supercomputers.
- Review and write incident postmortems and present insights that drive changes to reduce or eliminate incidents.
- Independently improve troubleshooting guides, wikis, tests, and telemetry, adding comprehensive observability and monitoring capabilities.
- Proactively seek new knowledge and adapt to new trends, technical solutions, and patterns that will improve the availability, reliability, efficiency, observability, and performance of supercomputers.
Benefits
- Employees at Microsoft are often offered comprehensive, “world-class” benefits, including health and mental-wellness programs, competitive pay with bonuses and stock awards, and retirement and savings options. Time off and flexibility are common, with generous vacation and holidays, parental and caregiver leave, and flexible work schedules, alongside learning support, employee resource groups, product discounts, and matching-gift and volunteering programs. Specific benefits can vary by region.
