We're hiring a senior, hands-on Data Platform Engineer to design and build a modern data platform from the ground up.
This is a build-first role focused on engineering quality, scalability, and real delivery.
You'll take ownership of creating a new open-source-led data lake / lakehouse, integrating multiple data sources and enabling analytics, operational use cases, and future AI/ML workloads.
Role

- Design and build a greenfield data lake / lakehouse platform
- Engineer high-throughput batch and streaming pipelines
- Implement scalable processing using open-source technologies, including:
  - Apache Spark (batch and structured processing)
  - Apache Flink (real-time and streaming pipelines)
  - Trino or equivalent distributed SQL engines
- Implement and operate modern table formats such as Apache Iceberg, Delta Lake, or Hudi
- Build ingestion, transformation, and consolidation frameworks across multiple data sources
- Own delivery end-to-end, from design through production, optimisation, and support
- Ensure data is reliable, scalable, and usable for analytics, reporting, and AI/ML use cases

Requirements

- 10+ years of commercial experience
- Proven experience building and operating data platforms in production
- End-to-end data lake design and build experience
- Hands-on experience with:
  - Apache Spark
  - Apache Flink or equivalent streaming engines
  - Apache Iceberg, Delta Lake, or Apache Hudi
- Experience working with object-storage-backed data lakes (S3, ADLS, GCS, MinIO)
- Strong coding skills in Python, Scala, or Java
- Experience integrating multiple data sources into a unified platform
- Comfortable owning technical decisions and working in delivery-focused environments
- Pragmatic mindset with a bias towards building, optimising, and shipping