Job Title: Senior Infrastructure Engineer
Reports to: Engineering Manager – SRE & QA
Location: Ireland – Home/Office, flexible
Purpose:
Compliance & Risks is looking for a driven and experienced Senior Infrastructure Engineer to shape the next generation of our global platform spanning both EU and US regions. You'll play a hands-on leadership role in making our systems faster, safer, more reliable, and more developer-friendly across our AWS-hosted, serverless-first environment.
This role requires strong expertise in cloud infrastructure (AWS), modern serverless architectures, observability tooling, and security best practices. You'll be responsible for infrastructure ownership, automation, developer enablement, and cross-region reliability while working closely with third-party platforms including Knock, NeonDB, Clerk, Inngest, and Vercel.
You'll work with a globally distributed engineering team. We prioritize collaboration, autonomy, and continuous learning, and we strongly support flexible working arrangements.
Key Responsibilities:
Infrastructure Support & Operations:
* Maintain and support infrastructure across AWS and our serverless stack (Lambda, API Gateway, S3, IAM)
* Manage integrations with third-party platforms (Knock, NeonDB, Clerk, Inngest, Vercel)
* Execute infrastructure deployments, rollbacks, and updates using Terraform/CloudFormation
* Monitor infrastructure health and respond to alerts proactively
* Support Kafka Connect infrastructure and troubleshoot connector issues
* Collaborate with developers to maintain reliable, performant systems
Incident Response & Support:
* Provide monitoring and triage as part of our support rotation (EU and out-of-hours coverage)
* Execute Vercel rollbacks and Lambda version reverts during incidents
* Coordinate incident response using our Swarm Support model
* Follow incident response workflows for Critical/High/Medium/Low severity issues
* Participate in blameless postmortems and implement preventive measures
* Maintain and update runbooks for common incident scenarios
Observability & Monitoring:
* Manage and maintain logging infrastructure (ELK Stack, LGTM platform migration)
* Build and maintain dashboards for application and infrastructure monitoring
* Configure alerts for proactive issue detection (LGTM, CloudWatch, Kafka monitoring)
* Monitor AWS CloudWatch logs for Lambda functions
* Implement metrics, logs, and tracing to improve visibility across the stack
* Create alerting strategies that balance reliability with on-call sustainability
Security & SecOps:
* Implement security best practices throughout the infrastructure stack
* Manage secrets rotation and IAM policies
* Conduct security reviews of infrastructure changes
* Support SSO and identity management systems (Clerk integrations)
* Ensure infrastructure complies with C&R's security policies and procedures, including technical control design and collaboration with third-party security services
* Stay current with security threats and implement preventive controls
Serverless Architecture & Optimization:
* Support and optimize Lambda functions, API Gateway, and serverless components
* Monitor and improve performance of event-driven data ingestion jobs (apps/data-sync)
* Troubleshoot Kafka connectors and data pipeline issues
* Ensure the platform performs reliably under varying loads
* Balance performance requirements with cost-effectiveness
FinOps & Cost Management:
* Monitor cloud infrastructure costs and identify optimization opportunities
* Implement cost allocation and tagging strategies
* Work with teams to balance performance, reliability, and cost efficiency
* Report on infrastructure spending and efficiency metrics
Cross-Region Reliability:
* Support infrastructure spanning EU and US regions
* Automate routine operations to reduce manual intervention
* Implement and maintain failover capabilities across regions
* Ensure consistent performance for global users
Collaboration & Documentation:
* Work closely with Engineering, QA, and Product teams
* Maintain clear documentation, runbooks, and knowledge base articles
* Participate in infrastructure and code reviews
* Share knowledge through internal documentation and training
* Communicate effectively in a remote-first, distributed team environment
Experience & Qualifications:
Required:
* 5+ years in Infrastructure, DevOps, or SRE roles supporting production systems
* Strong hands-on AWS experience (Lambda, API Gateway, ECS, S3, IAM, CloudWatch)
* Experience with infrastructure-as-code (Terraform or CloudFormation)
* Familiarity with observability tooling (ELK Stack, Datadog, Prometheus, or similar)
* Experience with CI/CD pipelines (GitHub Actions or similar)
* Understanding of serverless architectures and event-driven systems
* Security-first mindset with experience in IAM, secrets management, and compliance
* Strong troubleshooting skills and experience with incident response
* Excellent communication skills for distributed team collaboration
* Comfortable with on-call rotation and out-of-hours support responsibilities
* B.S. in Computer Science or Software Engineering, or equivalent experience
Nice-to-Have:
* Experience with Clerk, Inngest, Vercel, or similar SaaS platforms
* Familiarity with Kafka/Kafka Connect and CDC architectures
* Experience with monorepo build systems
* Knowledge of FinOps principles and AWS cost optimization
* Understanding of compliance frameworks (ISO 27001, SOC 2, GDPR)
* Experience with Zendesk or similar support ticketing systems
* Familiarity with LGTM or Grafana observability platforms
* Experience scaling serverless architectures
How You Work:
* You communicate clearly and effectively—on calls, in reviews, and in documentation
* You're hands-on and comfortable jumping into production issues
* You take ownership of problems and see them through to resolution
* You work collaboratively and help break down barriers between teams
* You're comfortable with ambiguity and can prioritize effectively during incidents
* You share knowledge openly and help others learn
* You're pragmatic—balancing quick fixes with long-term improvements
* You're comfortable with a support rotation that includes out-of-hours coverage
About Us:
Compliance & Risks helps the world's leading brands stay in compliance with global regulations and standards. Our platform, C2P, enables smarter regulatory intelligence and product compliance management. We're passionate about solving complex problems, empowering our people, and creating a sustainable impact.
We are a global, diverse team committed to collaboration, innovation, and respect. At C&R, we embrace flexible working, value personal development, and offer a supportive culture with opportunities for growth.