Job spec: Senior Data Engineer
Job Type: FTE
Description
The Senior Data Engineer will work closely with the Data Architecture team,
Business Analysts, and Data Stewards to integrate and align the requirements,
specifications, and constraints of each element of the solution. They will
also help identify gaps in resources, technology, or capabilities, and work
with the data engineering team to identify and implement solutions where
appropriate.
Primary Responsibilities:
Integrate data from multiple on-premises and cloud sources and systems.
Handle data ingestion, transformation, and consolidation to create a
unified and reliable data foundation for analysis and reporting.
Develop data transformation routines to clean, normalize, and
aggregate data. Apply data processing techniques to manage
complex data structures, resolve missing or inconsistent data,
and prepare the data for analysis, reporting, or machine
learning tasks.
Implement data de-identification/data masking in line with
company standards.
Monitor data pipelines and data systems to detect and resolve
issues promptly.
Develop monitoring tools and automated error-handling
mechanisms to ensure data integrity and system reliability.
Utilize data quality tools like Great Expectations or Soda to
ensure the accuracy, reliability, and integrity of data
throughout its lifecycle.
Create and maintain data pipelines using Airflow and Snowflake as
the primary tools.
Create SQL stored procedures to perform complex transformations.
Understand data requirements and design optimal pipelines to fulfill
the use cases.
Create logical and physical data models to ensure data integrity is
maintained.
Create and automate CI/CD pipelines using Git and GitHub Actions.
Tune and optimize data processes.
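To illustrate the transformation and de-identification duties above, here is a minimal Python sketch of a cleaning routine (field names such as customer_id and email are hypothetical; in practice this logic would run inside an Airflow task or a Snowflake transformation):

```python
import hashlib

def mask_pii(value: str, salt: str = "demo-salt") -> str:
    """De-identify a sensitive field with a salted one-way hash (illustrative only;
    production masking should follow company standards)."""
    return hashlib.sha256((salt + value).encode()).hexdigest()[:12]

def clean_records(records: list[dict]) -> list[dict]:
    """Normalize raw records: trim strings, coerce types, mask PII,
    and drop rows missing the required key."""
    cleaned = []
    for row in records:
        if not row.get("customer_id"):  # skip rows missing the required key
            continue
        cleaned.append({
            "customer_id": row["customer_id"].strip(),
            "email_masked": mask_pii(row.get("email", "").strip().lower()),
            "amount": float(row.get("amount") or 0.0),  # default missing amounts to 0
        })
    return cleaned
```

For example, a batch containing one well-formed row and one row without a customer_id yields a single cleaned record with its email replaced by a masked token.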
Qualifications
Required Qualifications:
Bachelor's degree in Computer Science or a related field.
Proven hands-on experience as a Data Engineer.
Proficiency in SQL (any dialect), including window
functions and other advanced features.
Excellent communication skills.
Strong knowledge of Python.
In-depth knowledge of Snowflake architecture, features, and best
practices.
Experience with CI/CD pipelines using Git and GitHub Actions.
Knowledge of various data modeling techniques, including Star
Schema, Dimensional models, and Data Vault.
Hands-on experience with:
Developing data pipelines (Snowflake), writing complex SQL
queries.
Building ETL/ELT/data pipelines.
Related/complementary open-source software platforms
and languages (e.g., Scala, Python, Java, Linux).
Experience with both relational (RDBMS) and non-relational
databases.
Analytical and problem-solving skills applied to big data datasets.
Experience working on projects with agile/scrum methodologies and
high-performing teams.
Good understanding of access control, data masking, and row access
policies.
Exposure to DevOps methodology.
Knowledge of data warehousing principles, architecture, and
implementation.
Preferred Qualifications:
Bachelor's degree or higher in Database Management, Information
Technology, Computer Science, or a related field.
3-5 years of experience in Data Engineering.
Motivated self-starter who excels at managing tasks independently and
takes ownership.
Experience orchestrating data tasks in Airflow to run on Kubernetes for
data ingestion, processing, and cleaning.
Expertise in designing and implementing data pipelines to process high
volumes of data.
Ability to create Docker images for applications that run on Kubernetes.
Familiarity with Azure services such as Blob Storage, Functions, Azure Data
Factory, Service Principals, Containers, Key Vault, etc.