Senior Data Engineer - Distributed Systems
We are seeking an experienced Senior Data Engineer to design, build and maintain robust, scalable data pipelines using Databricks and Apache Spark. If you have a strong background in distributed data processing and cloud platforms, this role could be a good fit.
About the Role:
* Design, build and maintain complex data pipelines using Databricks and Apache Spark.
* Collaborate closely with data analysts and scientists to deliver high-quality, fit-for-purpose data solutions.
* Optimize and enhance existing ETL processes for improved performance and reliability.
* Implement and enforce data validation and testing protocols to ensure data integrity and quality.
* Work with big data tools and frameworks to manage, process and transform large-scale datasets.
* Support data modeling efforts and contribute to the design of data warehouses and analytical environments.
* Monitor, troubleshoot and resolve performance issues in data pipelines and related infrastructure.
* Contribute to code reviews and uphold data engineering best practices, standards and governance.
* Create and maintain clear documentation of data workflows, architecture and operational procedures.
* Stay informed on emerging technologies and industry trends relevant to data engineering.
* Promote innovation and continuous improvement within the team.
Requirements:
* 3+ years of hands-on experience in data engineering or a similar role.
* Proven expertise with Databricks and Apache Spark for distributed data processing.
* Strong programming and scripting skills in SQL, Python and/or Scala.
* Solid understanding of ETL processes, data modeling and data warehousing concepts.
* Familiarity with cloud platforms, preferably Azure.
* Knowledge of data governance, data quality frameworks and best practices.
* Strong analytical thinking and attention to detail.
* Excellent communication skills with the ability to collaborate effectively across teams.