Crone Corkill have partnered with a technology consultancy who are searching for a Site Reliability Engineer to join a client in their Dublin office on a permanent basis. Production experience with Apache Kafka is essential, including strong knowledge of Kafka architecture, security, clusters, stream processing and distributed systems.
Working as part of a diverse team, you’ll be heavily involved in a DevOps transformation, which involves production readiness, supporting developers during the application build phase, triage, root cause analysis and more.
What will you do as a Site Reliability Engineer?
Operate and administer Apache Kafka clusters, including monitoring, scaling, security, and troubleshooting
Work as a key contact responsible for ensuring application scalability, performance, and resilience
Design, build, and maintain event-driven architectures to support scalable and resilient applications
Collaborate with development teams to integrate SRE best practices (SLIs, SLOs, SLAs, error budgets, etc.)
Automate operational tasks, CI/CD pipelines, and system monitoring to reduce manual interventions
Manage and optimise PCF (Pivotal Cloud Foundry) deployments, ensuring application performance and availability
Support the application CI/CD pipeline for promoting software into higher environments through validation and operational gating
Automate data-driven alerts to proactively escalate issues. Work with development teams to establish SLOs and improve reliability
Partner with the development and product team of a new application to establish the right monitoring and alerting strategy
Develop and manage observability and monitoring solutions using Splunk, ensuring proactive issue detection and resolution
Contribute to infrastructure as code (IaC) and cloud-native deployments
What skills do you need as a Site Reliability Engineer?
Apache Kafka within a production environment (including architecture, brokers, topics, partitions and replicas)
Kafka security (SSL, SASL & ACLs)
Exposure to Splunk, including logging, dashboards, alerting and operational insights
Configuring, deploying and managing Kafka clusters in cloud & on-prem environments
Kafka stream processing, using Kafka Streams, KSQL or Apache Flink
Proficiency in Java, Scala or Python for Kafka-related development tasks
Familiarity with DevOps practices, including CI/CD pipelines, monitoring and logging
Experience with tools like ZooKeeper, Schema Registry, and Kafka Connect
Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Information Technology
Industries: Staffing and Recruiting, Financial Services