System Reliability Engineer Role
We are seeking a skilled System Reliability Engineer to join our engineering team. The successful candidate will be responsible for ensuring the availability, scalability, and performance of our production, staging, and development environments.
-----------------------------------
Key Responsibilities:
* Build and maintain highly available cloud infrastructure (Linux/Windows).
* Provide 24/7 production support and troubleshoot technical issues.
* Collaborate with application, DBA, and cloud teams to deliver scalable solutions.
* Implement automation and Infrastructure-as-Code (Terraform, Ansible, scripting).
* Monitor and improve system performance using tools like Prometheus, Grafana, or ELK.
* Ensure security best practices and disaster recovery processes are followed.
Requirements & Qualifications:
* 3+ years of experience in SRE, DevOps, or Systems Administration roles.
* Hands-on experience with AWS services (EC2, S3, Lambda, VPC, IAM).
* Strong scripting/automation skills (PowerShell, Python, or similar).
* Familiarity with containerization (Docker, Kubernetes, Helm).
* Experience with multi-tier SaaS or microservices architectures.
* Good understanding of networking, load balancing, and patch management.
What We Offer:
* Opportunity to work on a leading SaaS platform.
* Chance to develop skills in automation and cloud operations.
* A dynamic and collaborative work environment.
About the Role:
* The role is ideal for those who enjoy problem-solving and working in a fast-paced environment.
* The successful candidate will have excellent communication and teamwork skills.