Senior Data Engineer (Apache Spark)
We are seeking an experienced Senior Data Engineer to join a high-performing team. This role is central to designing, building, and evolving large-scale data platforms that power analytics, insights, and operational decision-making. You will bring strong expertise in Apache Spark, contributing to the development and optimisation of distributed data pipelines within a modern Lakehouse architecture. Working in a cloud-native environment built on technologies such as Delta Lake and Databricks, you'll play a key role in scaling and enhancing a data ecosystem that supports both batch and streaming workloads.
Location: Limerick (Hybrid – minimum 3 days onsite per week during probation)
Responsibilities
Data Engineering & Architecture
* Design, develop, and optimise scalable distributed data pipelines using Apache Spark
* Build reliable batch and streaming solutions following modern Lakehouse design patterns
* Develop and maintain workflows in a Databricks environment (or quickly ramp up if new)
* Apply advanced Spark techniques for performance tuning, partitioning strategies, and job optimisation
* Ingest and process structured, semi-structured, and unstructured data sources
Technical Leadership
* Act as a subject-matter expert for Spark-based data engineering best practices
* Lead and contribute to code reviews, ensuring high standards in quality, testing, and documentation
* Help define data ingestion standards and guidelines to ensure consistency and reliability
Collaboration & Delivery
* Partner closely with architects and product stakeholders to translate business requirements into scalable data solutions
* Collaborate with application and platform engineers to ensure seamless integration of data workflows
* Communicate technical concepts clearly to both technical and non-technical audiences
* Ensure adherence to development processes, deployment standards, and change control
Required Experience & Skills
* 8+ years' experience in software or data engineering, with a strong focus on distributed data systems
* Deep hands-on expertise with Apache Spark (DataFrames, Spark SQL, execution plans, shuffle optimisation)
* Experience with Databricks, or the ability to transition quickly with strong Spark fundamentals
* Strong proficiency in Python and SQL
* Solid understanding of big data architectures, ETL/ELT pipelines, and cloud-native data patterns
* Experience with relational databases and version control systems (Git/TFS/SVN)
* Familiarity with CI/CD pipelines (e.g. Azure DevOps, GitHub Actions, Octopus Deploy)
* Strong communication skills and experience working in Agile environments
* Fluent written and spoken English
Preferred Skills
* Hands-on production experience with Databricks (notebooks, workflows, Delta Lake, cluster configuration)
* Experience with Azure data services such as ADLS Gen2 and Azure Data Factory
* Knowledge of Delta Live Tables, Delta Sharing, and Spark Structured Streaming
* Experience with NoSQL data stores and event-based ingestion (EventHub / EventGrid)
* Familiarity with Infrastructure-as-Code tools (Terraform, ARM templates)
* Understanding of data governance, cataloguing, and data management best practices
* Exposure to machine learning or data science concepts