Job Description
Data & Databricks Test Automation Engineer
Company Overview
Citco is a global leader in financial services, delivering innovative solutions to some of the world's largest institutional clients. We are seeking a Test Automation Engineer specializing in Databricks and data platforms to ensure the quality and reliability of our data solutions.
Role Description
As a Data & Databricks Test Automation Engineer, you will be responsible for developing and implementing automated testing frameworks for Databricks-based solutions, data pipelines, and data quality validation. You will work closely with data engineering teams to ensure data accuracy and reliability across our Lakehouse architecture.
Key Responsibilities
* Databricks Testing
  * Design and implement automated testing for Databricks notebooks and workflows
  * Create test frameworks for Delta Lake tables and ACID transactions
  * Develop automated validation for Structured Streaming pipelines
  * Validate Delta Live Tables implementations
* Data Pipeline Testing
  * Automate testing for ETL/ELT processes in Databricks
  * Implement Spark job testing and optimization validation
  * Create test cases for data ingestion and processing workflows
  * Develop automated checks for data transformations
  * Test Unity Catalog features and access controls
* Quality Assurance
  * Design and execute data quality test strategies
  * Implement automated data reconciliation processes
  * Develop performance testing for large-scale Spark jobs
* Monitoring & Reporting
  * Implement pipeline monitoring test frameworks
  * Create automated test dashboards
  * Generate quality metrics and testing reports
  * Maintain comprehensive test documentation
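To give candidates a flavor of the data-quality and reconciliation work above, here is a minimal, framework-agnostic sketch in plain Python. The function names are illustrative assumptions, and rows are modeled as dicts so the example is self-contained; in a real Databricks job the same assertions would run against Spark DataFrames or Delta tables.

```python
# Illustrative data-quality checks of the kind this role automates.
# Rows are plain dicts here for simplicity; in practice these checks
# would be expressed over PySpark DataFrames inside a Databricks job.

def null_rate(rows, column):
    """Fraction of rows where `column` is missing or None."""
    if not rows:
        return 0.0
    nulls = sum(1 for row in rows if row.get(column) is None)
    return nulls / len(rows)

def reconcile_counts(source_rows, target_rows, tolerance=0):
    """Row-count reconciliation between a source and its loaded target."""
    return abs(len(source_rows) - len(target_rows)) <= tolerance

source = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": None}]
target = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": None}]

assert reconcile_counts(source, target)    # counts match after the load
assert null_rate(target, "id") == 0.0      # key column fully populated
assert null_rate(target, "amount") == 0.5  # 1 of 2 rows missing amount
```

The same checks map naturally onto Delta Live Tables expectations or pytest suites over PySpark, which is how they would typically be deployed at scale.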
Requirements & Qualifications
* Educational Background
  * Bachelor's degree in Computer Science, Data Science, or a related field
  * Relevant certifications in Databricks or data testing are a plus
* Technical Experience
  * 2+ years of hands-on experience with Databricks (Apache Spark)
  * Strong programming skills in Python (PySpark) and SQL
  * Experience with data testing frameworks and tools
  * Knowledge of AWS services (S3, Glue, Lambda)
  * Understanding of Delta Lake and the Lakehouse architecture
  * Experience with version control systems (Git)
* Additional Skills
  * Strong analytical and problem-solving abilities
  * Experience with large-scale data processing
  * Knowledge of data quality best practices
  * Understanding of data governance and compliance requirements
  * Experience with Agile methodologies
* Platform Knowledge
  * Databricks workspace and notebook development
  * Delta Lake and Delta Live Tables
  * Unity Catalog for governance testing
  * Spark optimization and performance testing