Job Description

Job Summary:

We are seeking a skilled Data Quality Engineer to ensure the accuracy, reliability, and integrity of our data pipelines and workflows. The ideal candidate will have hands-on data engineering experience, with a strong focus on quality testing, validation, and pipeline orchestration.


Key Responsibilities:

Design, develop, and execute data quality test cases to validate data pipelines and ETL/ELT processes

Monitor and trigger data pipelines, ensuring smooth execution and timely data delivery

Run and maintain data quality scripts to identify anomalies, inconsistencies, and data integrity issues (see the sketch after this list)

Perform data profiling and validation across multiple data sources and targets

Collaborate with data engineers to implement data quality checks at various stages of the pipeline

Perform root cause analysis (RCA) for data anomalies and pipeline failures

Troubleshoot pipeline failures and data quality issues, resolving them efficiently

Document data quality standards, testing procedures, and validation results

Generate data quality reports and communicate findings to engineering teams

Develop automated testing frameworks to improve data quality validation efficiency

Focus primarily on validating and assuring the quality of existing pipelines rather than building pipelines end to end
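
As a rough illustration of the standalone data quality scripting this role involves, here is a minimal sketch in Python. The file name, column names, and thresholds are hypothetical placeholders, not a description of our actual stack:

```python
# Minimal sketch of a standalone data quality script (hypothetical
# file and column names; the real checks depend on the dataset).
import csv
from collections import Counter

def check_orders_extract(path: str) -> list[str]:
    """Return human-readable data quality findings for a CSV extract."""
    findings = []
    order_ids = Counter()
    rows = 0
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            rows += 1
            order_ids[row["order_id"]] += 1
            if not row["customer_id"]:        # completeness check
                findings.append(f"row {rows}: missing customer_id")
            if float(row["amount"]) < 0:      # validity check
                findings.append(f"row {rows}: negative amount")
    duplicates = [k for k, n in order_ids.items() if n > 1]  # uniqueness check
    if duplicates:
        findings.append(f"duplicate order_id values: {duplicates[:5]}")
    if rows == 0:
        findings.append("extract is empty")
    return findings

if __name__ == "__main__":
    for finding in check_orders_extract("orders_extract.csv"):
        print(finding)
```

In practice, checks like these would typically run as tasks inside the orchestration tool and feed the quality reports described above.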


Required Technical Skills:

Strong understanding of data engineering concepts including ETL/ELT processes, data warehousing, and data modeling

Proficiency in SQL for complex data validation and querying (see the example after this list)

Experience with Python or shell scripting for automation

Hands-on experience with data pipeline orchestration tools (e.g., Apache Airflow, Azure Data Factory, AWS Glue)

Knowledge of data quality frameworks and tools (e.g., Great Expectations, Deequ, custom validation scripts)

Familiarity with cloud platforms (AWS, Azure, or GCP) and their data services

Understanding of data formats (JSON, Parquet, Avro, CSV) and data storage systems

Exposure to logging/monitoring tools (CloudWatch, Datadog, ELK, etc.) is a plus
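
To illustrate the kind of SQL validation referenced above, here is a small reconciliation-style check, sketched in Python against an in-memory SQLite database; the table and column names are hypothetical:

```python
# Illustrative source-vs-target reconciliation in SQL (hypothetical
# tables), run against an in-memory SQLite database for portability.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_orders (order_id INTEGER, amount REAL);
    CREATE TABLE target_orders (order_id INTEGER, amount REAL);
    INSERT INTO source_orders VALUES (1, 10.0), (2, 25.5), (3, 7.0);
    INSERT INTO target_orders VALUES (1, 10.0), (2, 25.5);  -- row 3 was dropped
""")

# Row-level check: rows present in the source but missing from the target.
missing = conn.execute("""
    SELECT s.order_id
    FROM source_orders AS s
    LEFT JOIN target_orders AS t USING (order_id)
    WHERE t.order_id IS NULL
""").fetchall()
print("order_ids missing from target:", [r[0] for r in missing])

# Aggregate-level check: do the totals reconcile between source and target?
src_total = conn.execute("SELECT SUM(amount) FROM source_orders").fetchone()[0]
tgt_total = conn.execute("SELECT SUM(amount) FROM target_orders").fetchone()[0]
print(f"totals match: {src_total == tgt_total} ({src_total} vs {tgt_total})")
```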

Preferred Skills:

Experience with big data technologies (Spark, Hadoop, Kafka)

Knowledge of CI/CD practices for data pipelines

Familiarity with version control systems (Git)

Understanding of data governance and compliance requirements

Experience with data visualization tools for quality reporting

Soft Skills:

Strong analytical and problem-solving abilities

Excellent attention to detail

Good communication skills to collaborate with cross-functional teams

Ability to work independently and manage multiple priorities
