Principal Data Engineer
Lead a team of data engineers building scalable, secure, and high-performance data solutions on Databricks and AWS.
Key Responsibilities:
* Design and implement cutting-edge Databricks-based Lakehouse platforms (Delta Lake, Spark, MLflow)
* Develop seamless integrations with AWS services including S3, Glue, Lambda, and Step Functions
* Create efficient ETL/ELT pipelines using Spark (Python/Scala) to optimize data processing
* Automate infrastructure management to enhance scalability and reliability
* Tune Spark jobs and cluster configurations for optimal performance
* Implement robust security and governance policies using IAM, VPC, and Unity Catalog
* Drive Agile delivery cycles with a high-performing engineering team
Essential Skills and Qualifications:
* Data engineering expertise and strong leadership abilities
* Extensive experience running Databricks in production environments
* Advanced knowledge of AWS services: S3, Glue, Lambda, VPC, IAM, EMR
* Strong programming skills in Python (PySpark), Scala, and SQL
* Expertise in CI/CD pipelines, Git workflows, and automated testing
* Familiarity with data modeling and warehousing (e.g., Redshift, Postgres)
* Proficiency in workflow and orchestration tools (e.g., Airflow, Step Functions)
Why This Role Matters:
This is an exciting opportunity to leverage your technical expertise and leadership skills to drive business growth and innovation. As a Principal Data Engineer, you will play a critical role in shaping the future of our organization's data strategy and empowering our teams to make data-driven decisions.