Job Opportunity
We are seeking a highly skilled Test Automation Engineer with expertise in Databricks and data platforms to ensure the quality and reliability of our data solutions.
1. Main Responsibilities
* Design and implement automated tests for Databricks notebooks and workflows so that defects are caught before they reach production.
* Create test frameworks for Delta Lake tables that verify ACID transaction behavior, data integrity, and consistency.
* Develop automated validation for Structured Streaming pipelines to confirm uninterrupted data flow and accurate results.
* Test MLflow integrations and model tracking to verify that machine learning models and their performance are tracked reliably.
* Validate Delta Live Tables implementations to confirm that pipelines process data correctly and to reduce errors.
* Automate testing for ETL/ELT processes in Databricks to increase efficiency and accuracy.
* Implement Spark job tests and validate optimizations to confirm that jobs perform well and run reliably.
* Create test cases for data ingestion and processing workflows to ensure reliable data management.
* Develop automated checks for data transformations to maintain data quality and consistency.
* Test Unity Catalog features and access controls to confirm that data is stored, governed, and accessed securely.
2. AWS Integration Testing
* Implement automated testing for Databricks-AWS integrations to ensure seamless connectivity and data exchange.
* Create test cases for S3, the Glue Data Catalog, and Lambda functions to verify that data is processed and stored correctly.
* Validate data lake storage and access patterns to confirm that data is stored and retrieved efficiently.
3. Quality Assurance
* Design and execute data quality test strategies to identify and address potential issues before they impact production.
* Implement automated data reconciliation processes to ensure data accuracy and consistency across systems.
* Develop performance tests for large-scale Spark jobs to measure execution speed and latency and to identify bottlenecks.
* Create cluster configuration tests to verify that resources are allocated and used efficiently.
4. Monitoring & Reporting
* Implement pipeline monitoring test frameworks to track and analyze key performance indicators and metrics.
* Create automated test dashboards to provide real-time insights and visibility into data processing and quality.
* Generate quality metrics and testing reports to inform business decisions and drive continuous improvement.