Job Title:
Site Reliability Engineer Leader
About the Role
We seek a seasoned Site Reliability Engineer to join our team and help us drive business success. As a critical member of our organization, you will be responsible for ensuring the reliability and stability of our applications and platforms.
Key Responsibilities
• Conduct resiliency design reviews to identify potential issues and implement solutions.
• Collaborate with cross-functional teams to manage incident response and minimize business impact.
• Automate security controls, governance processes, and compliance validation on AWS.
• Design and maintain tools to automate operational processes on cloud platforms.
Required Skills and Qualifications
• Strong knowledge of software engineering concepts and proficiency in at least one programming or scripting language (e.g., Python, Java with Spring Boot, Go, shell scripting).
• Deep understanding of reliability, scalability, performance, security, enterprise system architecture, toil reduction, and other site reliability best practices.
• Experience with observability tools (e.g., Grafana, Dynatrace, Prometheus, Datadog, Splunk) and ability to implement them within an application or platform.
• Proficiency in continuous integration, continuous delivery, and infrastructure-as-code tools (e.g., Jenkins, GitLab, Terraform).
• Experience with containers and container orchestration (e.g., Docker, Kubernetes, ECS).
• Ability to communicate and collaborate across organizational levels and stakeholder groups.
Benefits
• Opportunity to work with cutting-edge technologies and contribute to driving business success.
• Collaborative and dynamic work environment that encourages innovation and growth.
• Professional development opportunities to enhance your skills and expertise.
What We Offer
Our organization is committed to fostering a culture of diversity, equity, and inclusion. We strive to create an inclusive work environment where all employees feel valued, respected, and empowered to succeed. Our benefits package includes competitive compensation, comprehensive health insurance, and access to professional development opportunities.