Job Description

Position: Data Quality Engineer (8–12 Years)

Location: Bangalore

Notice Period: Immediate

Role Summary

We are seeking a hands-on Data Quality Engineer to design, implement, and operate automated data quality controls across an AWS + Databricks Lakehouse platform. This role ensures trusted data from ingestion (streaming/batch) through curated (Gold) layers by implementing quality rules, validation frameworks, monitoring, and remediation workflows aligned to business and governance standards.

Key Responsibilities

Data Quality Strategy & Rule Implementation

  • Define and implement data quality dimensions (accuracy, completeness, timeliness, consistency, uniqueness, validity) across Bronze/Silver/Gold datasets.
  • Partner with business/data owners to translate requirements into DQ rules, thresholds, and acceptance criteria.
  • Maintain a DQ rule repository and ensure versioning, traceability, and approvals (a minimal illustration follows this list).
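
For illustration only, one possible shape for a versioned, traceable rule entry is sketched below; every field name, dataset, and threshold is a hypothetical example, not a prescribed schema.

    # Illustrative sketch of a versioned DQ rule entry (Python).
    # All field names, datasets, and thresholds are hypothetical.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class DQRule:
        rule_id: str       # stable identifier for traceability
        dataset: str       # fully qualified table name
        dimension: str     # completeness, accuracy, timeliness, ...
        expression: str    # SQL predicate each row must satisfy
        threshold: float   # minimum pass rate before the rule fails
        severity: str      # "fail" (block pipeline) or "warn" (log only)
        version: int       # bumped on every approved change
        approved_by: str   # business/data owner sign-off

    orders_not_null = DQRule(
        rule_id="DQ-ORD-001",
        dataset="silver.orders",
        dimension="completeness",
        expression="order_id IS NOT NULL",
        threshold=0.999,
        severity="fail",
        version=1,
        approved_by="sales-data-owner",
    )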

Automation & Frameworks

  • Build and operationalize automated data checks using:
      • Databricks (PySpark, Spark SQL) and/or AWS Glue (PySpark) jobs
      • DQ frameworks such as Great Expectations, Deequ, or custom rule engines
  • Embed quality gates into pipelines (pre/post checks, quarantine patterns, fail-fast vs warn policies); see the sketch after this list.
  • Create reusable DQ components (rule templates, test suites, profiling modules).
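
As a sketch of what such a quality gate might look like in PySpark (the table names and the 0.999 pass-rate threshold are assumptions for illustration, not part of the actual stack):

    # Quality gate with a quarantine pattern (PySpark sketch).
    # Table names and the 0.999 pass-rate threshold are illustrative.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()
    df = spark.read.table("bronze.orders")

    # Split rows that violate the rule into a quarantine set.
    rule = F.col("order_id").isNotNull()
    good, bad = df.filter(rule), df.filter(~rule)

    total, failed = df.count(), bad.count()
    pass_rate = (total - failed) / total if total else 1.0

    # Quarantine violations for later triage instead of silently dropping them.
    bad.write.mode("append").saveAsTable("quarantine.orders_bad_order_id")

    # Fail fast when the breach is severe; a "warn" policy would log instead.
    if pass_rate < 0.999:
        raise RuntimeError(f"DQ gate failed: pass_rate={pass_rate:.4f}")

    good.write.mode("append").saveAsTable("silver.orders")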

Monitoring, Alerting & Incident Management

  • Set up DQ monitoring dashboards (Databricks dashboards / CloudWatch / third-party observability).
  • Configure alerting for threshold breaches and anomalies (schema drift, outliers, volume spikes, null surges); a sketch follows this list.
  • Perform root cause analysis and lead remediation with pipeline owners and source system teams.
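
One minimal way to express such an anomaly check is sketched below; the dq.daily_metrics table, the 7-day baseline window, and the 3x tolerance band are all hypothetical choices.

    # Volume / null-surge anomaly check against a rolling baseline (sketch).
    # The dq.daily_metrics table, 7-day window, and 3x band are assumptions.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    today = spark.read.table("silver.orders").filter(
        F.col("load_date") == F.current_date()
    )
    row_count = today.count()
    null_rate = today.filter(F.col("customer_id").isNull()).count() / max(row_count, 1)

    # Baseline: average daily volume over the trailing 7 days.
    baseline = (
        spark.read.table("dq.daily_metrics")  # hypothetical metrics table
        .filter(F.col("metric") == "row_count")
        .orderBy(F.col("as_of").desc())
        .limit(7)
        .agg(F.avg("value").alias("avg"))
        .first()["avg"]
    )

    alerts = []
    if baseline and not (baseline / 3 <= row_count <= baseline * 3):
        alerts.append(f"volume anomaly: {row_count} vs baseline {baseline:.0f}")
    if null_rate > 0.01:
        alerts.append(f"null surge on customer_id: {null_rate:.2%}")

    # In production this would notify via CloudWatch or similar; here we raise.
    if alerts:
        raise RuntimeError("; ".join(alerts))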

Metadata, Lineage & Governance Alignment

  • Publish DQ metadata to the data catalog/governance tool (e.g., Atlan), including rule coverage, scorecards, and dataset certification status.
  • Support lineage-driven quality impact analysis and audit readiness.

Quality Reporting & Scorecards

  • Publish and maintain data quality scorecards by domain/dataset (daily/weekly) with trends and SLA adherence; a sketch follows this list.
  • Track issue backlog, triage priority, and resolution SLAs.
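
A scorecard of this kind might be derived from a rule-results table along the following lines; the dq.rule_results table and its columns are assumptions for illustration.

    # Daily scorecard sketch over a hypothetical dq.rule_results table
    # (assumed columns: dataset, rule_id, passed, run_date).
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    scorecard = spark.sql("""
        SELECT dataset,
               run_date,
               COUNT(*)                                    AS rules_evaluated,
               AVG(CASE WHEN passed THEN 1.0 ELSE 0.0 END) AS pass_rate
        FROM dq.rule_results
        WHERE run_date >= date_sub(current_date(), 7)
        GROUP BY dataset, run_date
        ORDER BY dataset, run_date
    """)
    scorecard.show()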

Must-Have Skills & Experience

  • 8–12 years in Data Engineering / Data Quality Engineering, with strong focus on DQ automation.
  • Strong hands-on experience with Databricks (PySpark, Spark SQL, Delta/Iceberg concepts).
  • Strong grounding in the AWS data ecosystem: S3, Glue (Catalog/Jobs), IAM, CloudWatch, KMS.
  • Experience validating Iceberg/Parquet datasets and managing schema evolution/schema drift.
  • Proficiency in Python, SQL, and writing testable data transformation logic.
  • Experience in batch and/or streaming validation patterns (micro-batch/near real-time).
  • Exposure to CI/CD for data pipelines (Azure DevOps/GitHub/Jenkins) and automated test execution.
  • Solid understanding of data modeling and canonical layer patterns.

Nice-to-Have

  • Experience with Unity Catalog, Databricks Workflows, Lakehouse Monitoring, and DBSQL.
  • Experience with Power BI semantic models and ensuring quality for reporting datasets.
  • Familiarity with Data Observability tools (Monte Carlo/Databand/Bigeye, etc.) or building internal equivalents.
  • Knowledge of MDM, data governance, and policy-driven controls.
  • Domain experience with Manufacturing/Supply Chain data and ERP/ServiceNow/Workday-type sources.

Qualifications

  • Bachelor’s/Master’s in Computer Science, Engineering, or equivalent.
  • Strong communication skills to drive alignment with data owners and stakeholders.
