Senior Cloud Reliability Engineer
The Senior Cloud Reliability Engineer is responsible for the reliability, scalability, and performance of cloud-based systems. We are looking for an experienced engineer who can take ownership of automation, monitoring, and incident response.
The ideal candidate will have strong expertise with cloud computing platforms, experience with automation tools, and knowledge of infrastructure as code (IaC) concepts.
Main Responsibilities:
* Design, implement, and maintain highly available and secure cloud systems.
* Develop automation scripts and IaC configurations to improve system efficiency and reduce downtime.
* Configure and maintain monitoring, alerting, and observability tools to ensure system performance and availability.
* Support and improve continuous integration/continuous deployment (CI/CD) pipelines for reliable and efficient deployments.
* Lead incident and problem management, including root cause analysis and post-incident reviews.
* Ensure alignment with security and compliance best practices.
Required Skills and Qualifications:
* Proven experience in cloud-based site reliability engineering (SRE) or a similar role.
* Strong expertise with cloud computing platforms, such as Azure.
* Hands-on experience with automation tools, PowerShell scripting, and IaC concepts.
* Working knowledge of Docker, Kubernetes, and cloud-native architectures.
* Experience with monitoring and observability solutions.
* Excellent analytical, communication, and collaboration skills.
Preferred Qualifications:
* Azure certifications (Administrator Associate or Solutions Architect Expert).
* Experience with Azure SQL, Cosmos DB, or PostgreSQL.
* Familiarity with agile delivery environments.
Additional Information:
For more information, please get in touch with our recruitment team.