Site Reliability Engineer *** Great company benefits *** Remote working

Our client is keen to expand its growing engineering team across Ireland by adding a Site Reliability Engineer.

This is an ideal opportunity for someone interested in joining a fast-paced start-up, pioneering novel metrics, data products, and intelligence solutions, whilst offering insights into the economics, markets, usage, health, and other aspects of the FinTech space.

Key Responsibilities
● System Architecture and Design: Collaborate with software engineering teams to design scalable, highly available, and resilient systems. Drive architectural improvements to enhance system reliability and performance.
● Implement Infrastructure as Code to manage services and deployments in a multi-cloud, multi-project configuration.
● Automation and Tooling: Develop automation tools and scripts to streamline deployment, monitoring, and incident response processes. Implement and maintain infrastructure as code frameworks.
● Monitoring and Alerting: Configure and maintain monitoring systems to detect and mitigate potential issues proactively. Define alerting thresholds and response procedures to ensure timely incident resolution.
● Incident Management: Respond to and resolve critical incidents, perform root cause analysis, and implement preventive measures to minimize the likelihood of recurrence. Participate in an on-call rotation to provide 24/7 support as needed.
● Capacity Planning and Performance Optimization: Analyze system performance metrics, identify bottlenecks, and propose optimizations to improve resource utilization and efficiency.
● Security and Compliance: Work closely with security teams to implement best practices for data protection, access control, and compliance with regulatory requirements. Conduct periodic security audits and vulnerability assessments.
● Documentation and Knowledge Sharing: Document system configurations, procedures, and troubleshooting steps.
Share knowledge and best practices with team members to foster a culture of continuous learning and improvement.

Must Have:
● Proven experience in an independent contributor role working with cloud platforms (GCP, AWS, Azure), Infrastructure-as-Code tooling (Terraform, Helm), and CI/CD orchestration platforms (GitLab CI, Argo CD, GitHub Actions) or similar GitOps workflows.
● Excellent problem-solving skills and the ability to independently troubleshoot complex issues.
● Strong communication and collaboration skills, with the ability to work effectively in cross-functional teams.
● Strong architectural and security mindset.

Should Have:
● Strong understanding of Linux/Unix systems administration and networking concepts.
● Hands-on experience configuring and running monitoring tools such as Prometheus and Grafana.
● 5+ years' experience maintaining infrastructure-as-code on Google Cloud Platform, Amazon Web Services, or Azure.
● Experience working in SOC 2 Type 1 and Type 2 certified companies.

Desirable:
● Proficiency in scripting and programming languages such as Bash, Golang, Python, and TypeScript.
● 2+ years' hands-on experience operating highly available Kubernetes clusters.
● Experience in incident management and resolution.
● Experience with AI development tools and related security considerations.
● Passion for the blockchain industry and decentralised systems.
● Experience with blockchain infrastructure, in either a personal or professional capacity.

If this role is of interest to you, please apply now or contact Ciarán Bergin at Parker Stewart with any additional questions you may have.