
Lead, site reliability engineer (infrastructure operations)

Dublin
Mastercard
Site reliability engineer
Posted: 7 May
Offer description

Title and Summary
Lead, Site Reliability Engineer (Infrastructure Operations)
Lead SRE Engineer, Site Reliability Engineering
About the Role
Mastercard’s Program-aligned Site Reliability Engineering (SRE) teams are dedicated to delivering a seamless experience for our customers. We achieve this by maintaining every aspect of our Programs' infrastructure and technology ecosystem to the highest standards, ensuring compliance with rigorous security requirements.
Within Mastercard, SRE focuses on the reliability and performance of core infrastructure, networks, and foundational services that power our applications. Our mission is to ensure these components operate with excellence, enabling applications to deliver an outstanding customer experience.
In this role, you will join our Payments Network SRE team and take ownership of continuously assessing and elevating the end‑to‑end service quality of our platform. You will leverage data to drive root cause analysis and deliver strategic insights to key stakeholders on resource utilization, capacity forecasting, and performance trends—ensuring the availability, scalability, and resilience of our network.
Key Responsibilities
Lead continuous assessments of the application infrastructure supporting critical Mastercard applications, focusing on health, performance, monitoring and alerting, and capacity analysis. Collaborate with Product and Development teams to forecast growth requirements and ensure scalability and resiliency.
Champion observability as a core principle for infrastructure services by assessing environments and technologies to uncover gaps in monitoring and alerting. Design and implement strategies to close these gaps, ensuring all infrastructure telemetry is integrated into a unified, single‑pane‑of‑glass view. Build custom dashboards to investigate and perform root cause analysis on complex issues.
Lead regular incident reviews with internal support teams to ensure root causes are identified. When patterns of failure or compatibility issues between software and infrastructure emerge, develop and implement strategies to remediate or mitigate risks.
Leverage automation and AI technologies to enhance proactive issue detection and enable self‑healing capabilities, reducing Mean Time to Detect (MTTD) and Mean Time to Mitigate (MTTM).
Develop testing and validation plans for new environment builds, disaster recovery exercises and post‑maintenance activities to certify environment readiness before customer traffic is routed to it.
Champion continuous learning, development, and knowledge sharing across networking and other infrastructure disciplines to strengthen multi‑disciplinary SRE team capabilities. Lead training initiatives for team members and for Product and Development teams on networking aspects of the platforms.
Evaluate vendor hardware, firmware, and software upgrade roadmaps, and conduct proof‑of‑concept (POC) testing to identify potential risks and opportunities for improvement in upcoming releases.
The Payments Network SRE team is responsible for the runtime availability of some of Mastercard’s most critical core payment systems, which support national infrastructure and operate 24/7 year‑round. As a result, this role will include periodic on‑call responsibilities when required.
All About You

5–10 years of experience in an SRE or SRE-related operations role, including 3+ years supporting e‑commerce, financial services, or large‑scale SaaS platforms.
Excellent infrastructure troubleshooting and analytical problem‑solving skills.
Strong hands-on experience with observability and monitoring tools such as Splunk, Dynatrace, or equivalent, with a proven ability to triage and investigate complex issues.
Familiarity with network telemetry tools such as SolarWinds and NetScout.
Proficiency in packet-level debugging, including capturing traffic with tools like tcpdump and analyzing packets using Wireshark.
Broad understanding of end‑to‑end infrastructure supporting payment platforms—spanning platform services, networking, databases, and storage.
Experience with automation and Infrastructure as Code tools such as Chef, Ansible, and Terraform, as well as structured data formats (JSON/YAML).
Excellent communication skills with the ability to coordinate cross‑functional troubleshooting efforts and lead RCA processes to closure.
Demonstrated ability to troubleshoot complex production issues, perform root cause analysis, and drive long‑term corrective actions.
Experience partnering with development teams to shape architecture, define SLIs/SLOs, and embed reliability into services from design through operation.
Strong understanding of monitoring and observability ecosystems, including Prometheus, Grafana, ELK/EFK, Splunk, and OpenTelemetry.
Effective incident management skills with a structured, analytical approach to problem solving.

Corporate Security Responsibility
All activities involving access to Mastercard assets, information, and networks come with an inherent risk to the organization. Every person working for, or on behalf of, Mastercard is therefore responsible for information security and must:

Abide by Mastercard’s security policies and practices;
Ensure the confidentiality and integrity of the information being accessed;
Report any suspected information security violation or breach;
Complete all periodic mandatory security trainings in accordance with Mastercard’s guidelines.

