Senior Site Reliability Engineer - Azure Red Hat OpenShift (Ireland, Italy, Portugal, Czech Republic)
Red Hat Waterford, County Waterford, Ireland

Overview

The Red Hat OpenShift Dedicated Site Reliability Engineering (SRE) team is looking for a Senior Software Engineer to join our global team. In this role, you will work on Red Hat OpenShift, enterprise Kubernetes, as part of a team that develops and operates Red Hat OpenShift Dedicated, a public cloud service based on Red Hat OpenShift for large enterprise customers. You will contribute to the design and development of automation software to provision, upgrade, monitor, and heal a large global fleet of OpenShift clusters deployed across multiple public clouds. You will participate in a global on-call rotation and help lead incident management, root cause analysis, and continuous improvement activities, managing engineering efforts against an SLA and error budget.

Responsibilities

Design and write automation software to provision, upgrade, monitor, and heal a large global fleet of OpenShift clusters deployed across multiple public clouds
Identify single points of failure and other high-risk architecture issues; propose and implement more resilient resolutions
Participate in release cycles of our offerings, deploying code to integration, staging, and production environments, integrating with CI/CD tooling, monitoring, and change management
Perform software updates, peer code reviews, testing, and CVE analysis; respond to security threats
Interact with automated monitoring and healing infrastructure to ensure healthy environments
Provide engineering support to Red Hat's global technical support team to resolve customer issues
Create and maintain standard operating procedures (SOPs) for maintenance tasks, applying configuration changes, and remediating problems
Participate in a global on-call rotation, including periodic weekend and holiday on-call duties

What you will bring

3+ years of software engineering experience using object-oriented languages; Golang and Python preferred
Experience managing Linux-based systems in a public cloud (AWS, GCP, or Microsoft Azure)
Commercial experience with enterprise system monitoring; knowledge of Prometheus is a plus
Experience with container technology, Kubernetes, OpenShift, and configuration management tools (Red Hat Ansible Automation, Puppet, or Chef) is a big plus
Demonstrated ability to troubleshoot systems issues quickly and accurately
Solid written and verbal communication skills in English

About Red Hat

Red Hat is the world’s leading provider of enterprise open source software solutions, delivering Linux, cloud, container, and Kubernetes technologies with a community-powered approach. We support flexible work environments and encourage employees to contribute ideas regardless of title or tenure. We are committed to open collaboration and inclusion.

Inclusion at Red Hat

Our culture is based on transparency, collaboration, and inclusion, empowering people from diverse backgrounds to share ideas and drive innovation. We strive for equal opportunity and welcome applicants from all backgrounds.

Equal Opportunity Policy (EEO)

Red Hat is an equal opportunity workplace and an affirmative action employer. We review applications without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, citizenship, age, veteran status, genetic information, disability, medical condition, marital status, or other legally protected characteristics.

Red Hat does not seek or accept unsolicited resumes or CVs from recruitment agencies. We are not responsible for, and will not pay, any fees related to unsolicited resumes or CVs except as required in a contract.

Red Hat supports individuals with disabilities and provides reasonable accommodations to job applicants. If you need assistance completing our online job application, email application-assistance@redhat.com.