Site Reliability Engineer
We are seeking a Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will work in an agile, collaborative environment to build, deploy, configure, and maintain systems for our clients.
Key Responsibilities:
* 24x7 Observability: Monitoring the health of production systems and services around the clock to ensure reliability and optimal customer experience.
* Cross-Functional Troubleshooting: Collaborating with engineering teams to assess and resolve production issues effectively.
* Deployment and Configuration: Using CI/CD tools to deploy services and configuration changes at enterprise scale.
* Security and Compliance: Implementing security measures that meet or exceed industry standards and regulations such as GDPR, SOC 2, ISO 27001, PCI, HIPAA, and FBA.
* Maintenance and Support: Applying security patches and upgrades, supporting databases such as Cassandra and MongoDB, and collaborating with product support teams to resolve issues.
Requirements:
* Designing, developing, and owning tooling and automation to monitor and improve availability, scalability, latency, and efficiency of secure, confidential cloud services.
* Managing infrastructure and services within IBM's Cloud ecosystem.
* Handling real-time alerts and customer-reported problems as part of a global team using a follow-the-sun model.
* Participating in scrums, sprint planning, and retrospectives; providing feedback and improvement ideas.
* Collaborating with extended IBM teams, learning new technologies, and applying new skills.
* Responding urgently to incidents, performing root cause analysis, and building a knowledge base for sharing insights.
Preferred Qualifications:
* Bachelor's Degree in Computer Science or related field.
* Experience with Linux, GitHub, Bash, Python, Node.js, Docker, Kubernetes, and Ansible.
* Experience developing tests and automation for routine tasks.
* Experience with REST APIs and automation.
* Proficiency in cloud logging and monitoring services.
* Strong debugging and problem-solving skills.
* Effective communication with global teams and customers.
* Team-oriented and innovative, with the ability to learn quickly.
About Avature
Avature is a leading provider of recruitment marketing and talent management solutions. Our platform helps companies find, engage, and hire top talent. We are committed to delivering exceptional results and providing excellent customer service.