Lead site reliability engineer

Cork

Tricentis

Site reliability engineer

€100,000 - €125,000 a year

Posted: 26 September

Offer description

Overview

As a Lead Site Reliability Engineer, you’ll be at the forefront of building scalable, resilient, and observable systems that power Tricentis SaaS products globally. This is a hands-on engineering leadership role that balances technical delivery, process ownership, and team mentorship.

You will drive initiatives across multiple products, shape SRE standards, and serve as a trusted partner to both engineering and product leaders. You will be responsible for elevating engineering quality and reliability while enabling scale and speed.

Responsibilities

* Lead and deliver cross-cutting initiatives to improve platform scalability, resilience, and cost efficiency.
* Architect and implement cloud-native infrastructure that supports multi-region, multi-tenant deployments.
* Improve the observability strategy across systems and teams, including SLOs, error budgets, and alerting standards.
* Coach and mentor engineers, guiding technical design reviews and promoting engineering excellence.
* Own post-incident analysis and ensure learning loops are completed with preventive action.
* Influence product reliability from early-stage design to production readiness reviews.
* Establish and evolve standards for deployments, operational readiness, and incident response.
* Serve as a technical advisor for engineering and product managers across the org.
* Drive architectural discussions and make decisions that influence the SRE org and wider engineering teams.
* Define and evolve technical roadmaps and execution plans aligned with company goals.
* Partner with peers in security, infrastructure, and product to drive platform-wide improvements.
* Lead incident response for high-impact outages and continuously reduce incident recurrence.
* Contribute to SRE hiring through interviews, onboarding, and process refinement.
* Guide the adoption of modern tooling and practices across teams (e.g., GitOps, self-service platforms, chaos engineering).
* Represent SRE in leadership forums, bringing insights, trade-offs, and forward-looking strategies.

Our Tech Stack

Azure, AWS, Terraform, GitHub Actions, Kubernetes, Datadog, Prometheus, Grafana, Betterstack, incident.io, Jira, and more

Our Culture

We don't just preach our values; we embody them in everything we do. We are committed to creating an environment that empowers, supports, and includes individuals, where trust, transparency, creativity, curiosity, and continuous improvement thrive on a daily basis.

About You

* 6+ years of experience in SRE, Infrastructure, or DevOps roles, including technical leadership.
* Expertise in building and operating production systems in public cloud (AWS or Azure).
* Deep understanding of observability principles (SLOs, SLIs, metrics, traces, logs).
* Strong experience with infrastructure-as-code, container orchestration, and CI/CD (Terraform, K8s, GitHub Actions).
* Proven track record in leading technical projects, influencing architecture, and mentoring engineers.
* Excellent communication and cross-functional collaboration skills.
* Proactive, ownership-driven mindset with a passion for reliability and continuous improvement.

Seniority level

* Mid-Senior level

Employment type

* Full-time

Job function

* Engineering and Information Technology

Industries

* Software Development
