Role: SRE
Work Location: Dublin, Ireland (hybrid)
Type: Permanent
Description:
As an SRE Engineer, you will act as the production readiness steward for versatile Gateway products and their integration with other platforms. You will partner with development teams to design, implement, and support services with a focus on operational resilience, automation, and compliance.
Key Responsibilities:
* Lifecycle Ownership:
Engage in and improve the entire service lifecycle—from design and deployment to operations and continuous improvement.
* Operational Readiness:
Ensure system availability, capacity, performance, monitoring, and self-healing capabilities are embedded throughout delivery.
* Incident Management:
Practice sustainable incident response, lead blameless postmortems, and optimize Mean Time to Recovery (MTTR).
* Automation & CI/CD:
* Develop and maintain automation pipelines for certificate renewal, traffic routing, alerting, and compliance reporting using tools like Ansible, Venafi, and XLR templates.
* Support CI/CD pipelines for software promotion and operational gating.
* Reliability Engineering:
Scale systems sustainably through automation and advocate for changes that improve reliability and velocity.
* Compliance & Risk Management:
Drive initiatives for Safety & Soundness, PCI compliance, threat/toil reduction, and ITSM defect resolution.
* Monitoring & Observability:
Implement robust logging, monitoring, and alerting standards to ensure system health and proactive issue detection. This requires hands-on experience configuring monitoring and alerting in Dynatrace and Splunk.
* Collaboration:
Work with global teams across multiple time zones and mentor junior engineers.
* Continuous Improvement:
Provide feedback loops to development teams on resiliency gaps and operational enhancements.
* Rotational On-Call & Flexibility:
* Participate in rotational on-call support for critical production systems.
* Demonstrate flexibility to take on additional responsibilities and ad-hoc duties as needed to support team and organizational goals.
All About You (Skills & Qualifications)
* Experience:
5+ years in BizOps, Site Reliability Engineering, or DevOps roles.
Technical Expertise:
* Strong understanding of NGINX configuration and gRPC event-driven architectures.
* Proficiency in DevOps tools: Chef, Jenkins, Groovy, shell scripting, Bitbucket, Git, Ansible, XLR.
* Experience with AWS infrastructure, secure access practices, and cloud-native deployments.
Security & Compliance:
* Awareness of certificate lifecycle management, mutual TLS, SSL/TLS handshakes, SSH keys, and encryption standards.
* Familiarity with ITSM processes, compliance frameworks, and incident management.
Networking & Systems:
* Knowledge of client-server relationships, network layers (L1–L7), load balancers (F5 BIG-IP), and application firewalls.
* Ability to analyze stack traces, TCP packet captures, and heap/thread dumps, and to perform OS-level troubleshooting.
Authentication & Authorization:
* Intermediate understanding of Active Directory, SAML, LTPA, SSO, OAuth.
Soft Skills:
* Strong documentation and communication skills.
* Ability to collaborate with cross-functional teams and mentor junior engineers.