Requirements
This is a mid-level role for someone who is hands‑on with cloud infrastructure and eager to grow their skills across a modern, high‑traffic production environment
Pragmatic and Reliable: You take ownership of the systems you build and prioritize stability without sacrificing velocity
Curious and Hands‑On: You enjoy digging into production issues, understanding root causes, and learning how complex distributed systems behave under pressure
A Clear Communicator: You can articulate technical tradeoffs and incident timelines to both engineering peers and non‑technical stakeholders
Collaborative: You thrive working alongside software engineers and understand that great infrastructure enables great products
Growth‑Oriented: You're actively developing your skills and excited to take on increasing ownership over time
2+ Years of Experience in a DevOps, SRE, or platform/infrastructure engineering role
Solid Cloud Experience: Hands‑on with AWS (EKS, RDS, Kinesis, S3, IAM, and related services)
Terraform Proficiency: Comfortable writing and maintaining production‑grade infrastructure as code
Kubernetes Familiarity: Experience deploying, debugging, and operating containerised workloads on Kubernetes
CI/CD Experience: Familiarity with pipeline tooling (e.g. GitHub Actions, ArgoCD, Jenkins, or similar)
Observability Skills: Experience with monitoring and alerting tools (e.g. Datadog, Prometheus/Grafana, or equivalent)
Scripting Ability: Comfortable with Python, Bash, or similar for automation and tooling tasks
A Reliability Mindset: Understanding of SRE principles — SLOs, error budgets, incident management, and blameless post‑mortems
What the job involves
As a DevOps/SRE Engineer on our Foundation team, you will help build, operate, and continuously improve the platform infrastructure that underpins all of Global‑e's microservices
You’ll work at the intersection of infrastructure engineering and reliability — automating deployments, hardening systems, and ensuring our platform scales reliably to handle millions of transactions worldwide
You’ll collaborate closely with software engineers, platform architects, and product teams to keep our systems fast, resilient, and operationally excellent
This role reports to the Engineering Manager for the Foundation team and is based out of our Dublin, Ireland office
Own Infrastructure as Code: Write, maintain, and improve Terraform modules to provision and manage cloud resources on AWS (EKS, RDS, Kinesis, and more)
Support and Improve CI/CD Pipelines: Build and maintain reliable deployment pipelines that enable engineering teams to ship with speed and confidence
Ensure Platform Reliability: Define and track SLOs/SLAs, respond to incidents, conduct post‑mortems, and drive systemic improvements to reduce toil and prevent recurrence
Monitor and Observe: Implement and maintain observability tooling — metrics, logging, alerting, and dashboards — to provide clear visibility into system health
Scale Kubernetes Workloads: Help manage and evolve our EKS clusters, ensuring workloads are performant, cost‑efficient, and fault‑tolerant
Embrace AI‑Augmented Operations: Use and help expand our AI tooling, from AIOps and anomaly detection to AI‑assisted incident response and infrastructure optimisation, as we invest heavily in bringing AI into our day‑to‑day engineering workflows
Collaborate Across Teams: Partner with software engineers to bridge the gap between development and operations — advising on best practices, reviewing infrastructure changes, and supporting teams during rollouts
Improve Security Posture: Contribute to hardening cloud environments, managing secrets, and enforcing least‑privilege access controls
Automate Everything: Identify manual processes and replace them with robust, repeatable automation