Job Description: This role is hybrid, 3 days per week in our offices at Parkwest Business Park, Dublin.

The Opportunity
We are looking for an experienced Principal Engineer to lead and shape our Platform Technologies team, Foundation.
This role is critical to building and maintaining the core technical infrastructure that powers Workhuman's entire engineering organization.
You'll be responsible for the foundational platform technologies that enable 300+ engineers to build reliable, observable, and highly performant distributed systems.
Our Foundation squad owns the technical bedrock of Workhuman's platform architecture.
We solve complex, low-level platform challenges that most engineers never need to think about—messaging infrastructure, distributed tracing, observability at scale, service-to-service communication patterns, and the deep technical systems that enable teams to focus on delivering business value.
We believe in building robust, standards-based solutions that provide excellent developer experience while maintaining operational excellence.

Current Major Initiatives
Kafka Migration Program - Continue to support the enterprise-wide migration from ActiveMQ to Kafka, which involved removing distributed XA transactions and replacing them with the outbox pattern, establishing Schema Registry governance, and creating Kafka observability frameworks.
Unified Observability Platform - Architect OpenTelemetry-based instrumentation for all services, centralised configuration with ADOT collectors, integration with Managed Prometheus and X-Ray, unified Grafana dashboards, and a cost-efficient architecture that operates at a lower cost than commercial SaaS alternatives.
Service Communication Standards - Establish an Edge Authentication Lambda for token translation, header propagation libraries for distributed correlation, W3C trace context integration, and standards that make secure, traceable communication the default.

Tech Stack
Apache Kafka, Kafka Streams, Schema Registry, OpenTelemetry, ADOT, Amazon Managed Prometheus, AWS X-Ray, CloudWatch, Grafana, Java 8/17, Spring Boot, Microservices, RESTful APIs, ECS Fargate, API Gateway, Lambda, Terraform, Docker, GitLab CI/CD, Oracle DB, PostgreSQL, NoSQL.

What We Can Offer You
Architect and build Workhuman's entire observability strategy using OpenTelemetry, ADOT, AWS X-Ray, and Managed Prometheus - creating a unified platform for metrics, traces, and logs across all AWS accounts
Continue the enterprise-wide migration from ActiveMQ to Apache Kafka, establishing event-driven architecture standards across 300+ engineers
Design service-to-service communication patterns, authentication systems (Edge Auth Lambda), and distributed tracing standards using W3C trace context
Solve the hardest, lowest-level platform problems that enable teams to focus on business value without thinking about infrastructure complexity
Work with architects and senior engineers to shape platform architecture and establish standards that the entire engineering organisation depends on
Make complex platform capabilities simple and accessible through excellent developer experience and robust frameworks

The Skills You Will Bring
10+ years in enterprise Java development with Spring, microservices, distributed systems and AWS
Deep expertise in event-driven architectures, message-oriented middleware (Apache Kafka), and patterns like outbox, saga, and eventual consistency
Experience with OpenTelemetry instrumentation, distributed tracing, and observability platforms (Prometheus, Grafana, CloudWatch)
Proven ability to balance cost, performance, and visibility in large-scale platform systems
Skilled communicator who can translate complex platform concepts into simple, understandable patterns for engineers

Achievements
Created libraries, frameworks, or standards that simplify complex platform capabilities while maintaining security and governance
Successfully influenced technical direction and elevated the capabilities of teams around you
Successfully migrated enterprise applications from legacy systems to modern architectures at scale