About the job
This position is needed to harden, optimize, and scale the real‑time event‑aggregation services that power our Observability Insights/Analytics platform.
We are seeking a Staff Software Engineer with deep Java expertise to own high‑throughput stream‑processing microservices (Kafka Streams / Flink) deployed on AWS EKS, tune ClickHouse for millisecond‑latency writes, and embed observability that keeps incident minutes near zero. You will design resilient, high‑performance systems capable of processing >250K events/sec with p99 latencies under 200 ms, while championing DevSecOps practices and mentoring junior engineers.
Responsibilities
Design, build, and maintain high‑performance Java microservices using Spring Boot, capable of ingesting >250K events/sec with p99 latency under 200 ms
Implement stateful stream‑processing pipelines (Kafka Streams / Apache Flink) with idempotent replays, exactly‑once semantics, and schema‑evolution tooling
Optimize ClickHouse schemas, partitioning, and materialized views to support multi‑region, sub‑second queries for Early Warning System (EWS) detectors
Embed OpenTelemetry instrumentation and ship comprehensive metrics/traces/logs to Datadog and Grafana with SLI/SLO dashboards
Champion DevSecOps best practices including Terraform automation, CI/CD pipelines, Kubernetes orchestration, AWS infrastructure (EKS, MSK, S3), and compliance guardrails (HIPAA, SOX, GDPR)
Leverage best‑in‑class development productivity practices, including AI‑powered tooling, to accelerate delivery and improve code quality
Mentor junior engineers and participate in rigorous code/design reviews to elevate team standards and foster knowledge sharing
Qualifications
Required:
8+ years of professional Java development experience with mastery of high‑performance and low‑latency design patterns
Production experience with Kafka Streams, Flink, or comparable stream‑processing frameworks for building real‑time data pipelines
Hands‑on ClickHouse (or columnar database) performance tuning and SQL optimization expertise
Proven success operating AWS‑hosted microservices at scale with solid Linux, Docker, and Kubernetes knowledge
Strong observability mindset including metrics, tracing, alerting, and post‑incident analysis capabilities
Excellent communication skills and a bias toward collaborative problem‑solving in cross‑functional team environments
Desired:
Experience migrating single‑region services to multi‑region active‑active topologies for high availability
Familiarity with data‑privacy controls including PII tokenization and field‑level encryption
Previous work in telecom, real‑time analytics, or compliance‑sensitive domains
Contributions to open‑source Java or streaming projects demonstrating community engagement
Location
This role will be based in our Dublin, Ireland office.
Travel
For this role, you may be required to travel occasionally to participate in project or team in‑person meetings.
What We Offer
Working at Twilio offers many benefits, including competitive pay, generous time off, ample parental and wellness leave, healthcare, a retirement savings program, and much more. Offerings vary by location.
Equal Opportunity Employment
Twilio is proud to be an equal opportunity employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, reproductive health decisions, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, political views or activity, or other applicable legally protected characteristics. We also consider qualified applicants with criminal histories, consistent with applicable federal, state and local law. Qualified applicants with arrest or conviction records will be considered for employment in accordance with the Los Angeles County Fair Chance Ordinance for Employers and the California Fair Chance Act. Additionally, Twilio participates in the E‑Verify program in certain locations, as required by law.