Job Description
Job Summary:
We are seeking a skilled Data Quality Engineer to ensure the accuracy, reliability, and integrity of our data pipelines and workflows. The ideal candidate will have hands-on data engineering experience, with a strong focus on quality testing, validation, and pipeline orchestration.
Key Responsibilities:
Design, develop, and execute data quality test cases to validate data pipelines and ETL/ELT processes
Monitor and trigger data pipelines, ensuring smooth execution and timely data delivery
Run and maintain data quality scripts to identify anomalies, inconsistencies, and data integrity issues (a minimal sketch of such a script follows this list)
Perform data profiling and validation across multiple data sources and targets
Collaborate with data engineers to implement data quality checks at various stages of the pipeline
Troubleshoot pipeline failures and data quality issues, performing root cause analysis (RCA) and driving timely resolution
Document data quality standards, testing procedures, and validation results
Generate data quality reports and communicate findings to engineering teams
Develop automated testing frameworks to improve data quality validation efficiency
Focus primarily on validating and assuring the quality of existing pipelines, rather than building pipelines end to end
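To illustrate the kind of standalone data quality script this role involves, here is a minimal sketch using pandas. The file name, column names, and checks are hypothetical examples, not an actual pipeline specification:

```python
import sys

import pandas as pd

# Hypothetical input: a daily orders extract. The file and column
# names here are illustrative only.
df = pd.read_csv("orders_daily.csv")

issues = []

# Completeness: key columns must not contain nulls.
for col in ["order_id", "customer_id", "order_date"]:
    null_count = int(df[col].isnull().sum())
    if null_count > 0:
        issues.append(f"{col}: {null_count} null value(s)")

# Uniqueness: the primary key must not repeat.
dup_count = int(df["order_id"].duplicated().sum())
if dup_count > 0:
    issues.append(f"order_id: {dup_count} duplicate value(s)")

# Validity: amounts must be non-negative.
bad_amounts = int((df["order_amount"] < 0).sum())
if bad_amounts > 0:
    issues.append(f"order_amount: {bad_amounts} negative value(s)")

# Report and exit non-zero so an orchestrator (e.g. Airflow) can
# mark the task as failed and alert on it.
if issues:
    print("Data quality checks FAILED:")
    for issue in issues:
        print(f"  - {issue}")
    sys.exit(1)

print("All data quality checks passed.")
```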
Required Technical Skills:
Strong understanding of data engineering concepts including ETL/ELT processes, data warehousing, and data modeling
Proficiency in SQL for complex data validation and querying (see the example after this list)
Experience with scripting languages such as Python or shell for automation
Hands-on experience with data pipeline orchestration tools (e.g., Apache Airflow, Azure Data Factory, AWS Glue)
Knowledge of data quality frameworks and tools (e.g., Great Expectations, Deequ, custom validation scripts)
Familiarity with cloud platforms (AWS, Azure, or GCP) and their data services
Understanding of data formats (JSON, Parquet, Avro, CSV) and data storage systems
Exposure to logging/monitoring tools (CloudWatch, Datadog, ELK, etc.) is a plus
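As a sketch of the SQL-driven validation work listed above, the example below runs two common checks: a source-to-target row-count reconciliation and a referential integrity check. The standard library sqlite3 module stands in for a real warehouse connection, and all table and column names are hypothetical:

```python
import sqlite3

# sqlite3 stands in for a warehouse driver here; in practice this
# would be a connector for Snowflake, Redshift, BigQuery, etc.
conn = sqlite3.connect("warehouse.db")

# Reconciliation: row counts in the staging (source) and reporting
# (target) tables should match after a load.
src_count = conn.execute("SELECT COUNT(*) FROM stg_orders").fetchone()[0]
tgt_count = conn.execute("SELECT COUNT(*) FROM fct_orders").fetchone()[0]

# Referential integrity: every fact row should join to a customer
# dimension row; orphans indicate a broken or partial load.
orphans = conn.execute(
    """
    SELECT COUNT(*)
    FROM fct_orders f
    LEFT JOIN dim_customers c ON f.customer_id = c.customer_id
    WHERE c.customer_id IS NULL
    """
).fetchone()[0]

conn.close()

assert src_count == tgt_count, f"Row count mismatch: {src_count} vs {tgt_count}"
assert orphans == 0, f"{orphans} fact row(s) missing a customer match"
print("Source/target reconciliation passed.")
```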
Preferred Skills:
Experience with big data technologies (Spark, Hadoop, Kafka)
Knowledge of CI/CD practices for data pipelines
Familiarity with version control systems (Git)
Understanding of data governance and compliance requirements
Experience with data visualization tools for quality reporting
Soft Skills:
Strong analytical and problem-solving abilities
Excellent attention to detail
Good communication skills to collaborate with cross-functional teams
Ability to work independently and manage multiple priorities