Unlock New Opportunities in Cloud Engineering
We are seeking an experienced Site Reliability Engineer to help scale and support our Azure-hosted SaaS platform.
Key Responsibilities:
* Maintain high availability and reliability of cloud-based services
* Develop and enhance monitoring, alerting, and observability tools
* Automate provisioning, deployments, scaling, and incident response
* Lead incident management and drive post-incident improvements
* Build infrastructure with Infrastructure as Code (IaC) tools such as ARM templates, Bicep, or Terraform
* Optimize performance and ensure compliance with ISO 27001, SOC 2, and GDPR requirements
Requirements:
* Proven experience in a software product environment
* Strong background in Microsoft Azure infrastructure and services
* Proficient in scripting and automation (PowerShell preferred)
* Experience with monitoring tools (Azure Monitor, Grafana, Prometheus, Datadog)
* Knowledge of containers (Docker/Kubernetes) and CI/CD pipelines
* Skilled in incident response and root cause analysis