Job Title: Disaster Resilience Lead
This is an exciting opportunity to lead a team of software and systems engineers in disaster resilience.
Responsibilities:
1. Lead a team delivering projects for users, and be directly responsible for service uptime.
2. Own end-to-end availability and performance of key services and build automation to prevent problem recurrence.
3. Automate the response to all non-exceptional service conditions.
Requirements:
* Deep expertise in the disaster resilience domain.
* Experience owning outcomes and making decisions, solving ambiguous problems, and influencing stakeholders.
* Bachelor's degree in Computer Science or equivalent practical experience.
* 8 years of experience with software development in one or more programming languages.
* 3 years of experience managing people or teams.
* 3 years of experience leading projects.
* 3 years of experience designing, analyzing, and troubleshooting distributed systems.
Benefits:
* Collaborative environment with a wide variety of backgrounds, experiences, and perspectives.
* A culture of intellectual curiosity, problem solving, and openness.
* Freedom to direct your own work on meaningful projects.
About the Role:
This role combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. You will ensure that Google's services have reliability and uptime appropriate to users' needs, along with a fast rate of improvement. You will also keep a close watch on the capacity and performance of our systems.