Site Reliability Engineer at Parker Stewart
Overview
This role is for a Site Reliability Engineer joining Parker Stewart.
The successful candidate will contribute to a fast-paced, data- and metrics-driven FinTech engineering environment, focusing on reliability, monitoring, and secure, scalable infrastructure.
Base pay range and additional compensation details will be discussed with the recruiter.
Direct messaging the job poster from Parker Stewart is welcome.
Responsibilities
System Architecture and Design: Collaborate with software engineering teams to design scalable, highly available, and resilient systems.
Drive architectural improvements to enhance system reliability and performance.
Implement Infrastructure as Code to manage services and deployments in a multi-cloud, multi-project configuration.
Automation and Tooling: Develop automation tools and scripts to streamline deployment, monitoring, and incident response processes.
Implement and maintain infrastructure as code frameworks.
Monitoring and Alerting: Configure and maintain monitoring systems to detect and mitigate potential issues proactively.
Define alerting thresholds and response procedures to ensure timely incident resolution.
Incident Management: Respond to and resolve critical incidents, perform root cause analysis, and implement preventive measures to minimize the likelihood of recurrence.
Participate in an on-call rotation to provide 24/7 support as needed.
Capacity Planning and Performance Optimization: Analyze system performance metrics, identify bottlenecks, and propose optimizations to improve resource utilization and efficiency.
Security and Compliance: Work closely with security teams to implement best practices for data protection, access control, and compliance with regulatory requirements.
Conduct periodic security audits and vulnerability assessments.
Documentation and Knowledge Sharing: Document system configurations, procedures, and troubleshooting steps.
Share knowledge and best practices with team members to foster a culture of continuous learning and improvement.
Must Have
Proven experience in an individual contributor role working with cloud platforms (GCP, AWS, Azure), Infrastructure-as-Code tooling (Terraform, Helm), and CI/CD orchestration platforms (GitLab CI, Argo CD, GitHub Actions) or similar GitOps workflows.
Excellent problem-solving skills and the ability to independently troubleshoot complex issues.
Strong communication and collaboration skills, with the ability to work effectively in cross-functional teams.
Strong architectural and security mindset.
Should Have
Strong understanding of Linux/Unix systems administration and networking concepts.
Hands-on experience configuring and running monitoring tools such as Prometheus and Grafana.
5+ years of experience maintaining infrastructure as code on Google Cloud Platform, Amazon Web Services, or Azure.
Experience working in SOC 2 Type 1 and Type 2 certified companies.
Proficiency in scripting and programming languages such as Bash, Go, Python, and TypeScript.
2+ years of hands-on experience operating highly available Kubernetes clusters.
Experience with incident management and resolution.
Experience with AI development tools and related security considerations.
Passion for the blockchain industry and decentralised systems.
Experience with blockchain infrastructure, in either a personal or professional capacity.
If this role is of interest to you, please apply now or contact Ciarán Bergin at Parker Stewart with any additional questions you may have.
Seniority level
Mid-Senior level
Employment type
Full-time
Job function
Engineering and Information Technology
Industries
Staffing and Recruiting