Software Engineer – Distributed Systems
Job description
The Software Engineer – Distributed Systems role at IBM focuses on designing and developing large-scale, stateful distributed systems for the watsonx.data platform. The team emphasizes collaboration and innovation, working in an Agile environment to meet stakeholder requirements. Engineers will be responsible for ensuring the reliability, consistency, and performance of the infrastructure at petabyte scale. This position offers opportunities for continuous learning and career advancement within IBM's technology landscape.
Requirements
- 6+ years of professional software engineering experience, including at least 2 years designing and operating large-scale distributed systems.
- Strong skills in Java, Go, C++, or a comparable systems language.
- Hands-on knowledge of consistency models, replication, quorum systems, leader election, and consensus protocols.
- Experience designing fault-tolerant systems with automatic failover and durable recovery.
- Clear written communication skills for producing design documents and post-mortems.
Responsibilities
- Architect and implement metadata services, distributed schedulers, and state coordination layers for high-throughput data workloads.
- Implement replication, automatic failover, and distributed consensus to preserve correctness under failure.
- Contribute to the automated CI/CD pipeline and instrument components with structured logging and metrics.
- Design, develop, and unit test fixes for customer-reported and production issues.
- Collaborate with query engine, storage, GPU acceleration, and AI/ML teams to surface constraints early.
Benefits
- IBM offers competitive compensation, healthcare coverage, retirement programs, paid parental leave, tuition assistance, wellness programs, flexible work options, and extensive learning and certification resources.
