Overall 5+ years of work experience with Site Reliability Engineering.
2+ years of work experience with kafka-monitor.
2+ years of work experience with Apache Kafka.
Familiarity with clustering, replication, persistence, and high-availability patterns.
Experience working in regulated environments with strong change management practices.
Exposure to automation, reliability engineering, or SRE best practices.

What you will do:
Administer Confluent Kafka clusters, including installation, configuration, upgrades, and maintenance in Linux environments.
Implement and support Kafka security using SSL/TLS for encryption, SASL for authentication, and ACLs for topic-level authorization.
Configure secure Kafka clients (producers, consumers, and connectors) with keystore and truststore management.
Monitor Kafka cluster health and performance to ensure high availability and minimal downtime.
Troubleshoot Kafka-related issues such as broker failures, consumer lag, authentication errors, and connector failures.
Support and manage Kafka Connect connectors, ensuring reliable data ingestion and delivery across systems.
Assist in broker scaling activities such as broker addition/removal and basic partition reassignment to balance cluster load.
Collaborate with application and infrastructure teams to integrate Kafka with enterprise systems and optimize streaming performance.
Perform topic lifecycle management, including creation, deletion, partition increases, replication factor planning, and retention tuning.