Job Description

Job Title: Associate Data Architect – Master Data Management (MDM)  

 

Location:  

Pune - Hybrid  

Experience:  

10+ years of experience in Data Architecture and Data Engineering/Integration, with strong exposure to Data Modelling and Database (RDBMS) Management.

 

About the Role  

We are seeking an Associate Data/Database Architect to join our core product architecture team building an enterprise-grade, multi-domain Master Data Management (MDM) product platform.

You will play a key role in optimizing and extending the MDM data model, implementing efficient data ingestion and entity resolution mechanisms, and ensuring the system supports multiple domains such as Party (Individual/Organization), Product, Location, Policy, and Relationship in a cloud-native and scalable manner.

 

Key Responsibilities  

Data Modeling & Architecture  

  • Enhance and extend the existing Party-based data model into a multi-domain MDM schema (Party, Product, Location, Relationship, Policy, etc.). 
  • Design and maintain canonical data models and staging-to-core mappings for multiple source systems. 
  • Implement auditability, lineage, and soft-delete frameworks within the MDM data model. 
  • Contribute to the creation of golden records, trust scores, match/merge logic, and data survivorship rules (a brief survivorship sketch follows this list).
  • Ensure the model supports real-time and batch data mastering across multiple domains. 
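
As a minimal illustration of a survivorship rule, the PostgreSQL sketch below keeps, per party, the candidate record from the most trusted source, breaking ties by recency. The staging table and column names (stg_party_candidates, trust_score, updated_at) are hypothetical, not the platform's actual schema:

    -- Hypothetical staging table: one candidate row per (party, source system).
    -- Survivorship rule: highest trust score wins; most recent update breaks ties.
    SELECT DISTINCT ON (party_id)
           party_id,
           full_name,
           source_system,
           trust_score,
           updated_at
    FROM   stg_party_candidates
    ORDER  BY party_id, trust_score DESC, updated_at DESC;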

 

Data Engineering & Integration  

  • Support the optimization of data ingestion and ETL/ELT pipelines using Python, PySpark, SQL, and/or Informatica (or equivalent tools).
  • Design and implement data validation, profiling, and quality checks to ensure consistent master data. 
  • Work on data harmonization, schema mapping, and standardization across multiple source systems.
  • Help build efficient ETL mappings from canonical staging layers to MDM core data models in PostgreSQL.
  • Develop REST APIs or streaming pipelines (Kafka/Spark) for real-time data processing and entity resolution. 

 

Cloud & Platform Engineering  

  • Implement and optimize data pipelines on AWS or Azure using native services (e.g., AWS Glue, Lambda, S3, Redshift, Azure Data Factory, Synapse, Data Lake). 
  • Deploy and manage data pipelines and databases following cloud-native, cost-effective, and scalable design principles. 
  • Collaborate with DevOps teams on CI/CD, infrastructure-as-code, and automation of data pipeline and database deployment/migration.

 

Governance, Security & Compliance  

  • Implement data lineage, versioning, and stewardship processes. 
  • Ensure compliance with data privacy and security standards (GDPR, HIPAA, etc.). 
  • Partner with Data Governance teams to define data ownership, data standards, and stewardship workflows.



Requirements

Technical Skills Required  

Core Skills  

  • Data Modelling: Expert-level in Relational (3NF) and Dimensional (Star/Snowflake) modelling; hands-on with the Party data model, multi-domain MDM, and canonical models.
  • Database: PostgreSQL (preferred), or any enterprise RDBMS. 
  • ER Modelling Tools: Erwin or ER/Studio; Database Markup Language (DBML).
  • ETL / Data Integration: Informatica, Python, PySpark, SQL, or similar tools. 
  • Cloud Platforms: AWS or Azure. 
  • Programming: Advanced SQL, Python, PySpark, and/or UNIX/Linux scripting.
  • Data Quality & Governance: Familiarity with data quality rules, profiling, match/merge, and entity resolution.
  • DevOps - Version Control & CI/CD: Git, Azure DevOps, Jenkins, Terraform, Redgate Flyway (preferred).

 

Database Design & Optimization (PostgreSQL)  

  • Design and maintain normalized and denormalized models using advanced features (schemas, partitions, views, CTEs, JSONB, arrays). 
  • Build and optimize complex SQL queries , materialized views , and data marts for performance and scalability. 
  • Tune RDBMS (PostgreSQL) performance – indexes, query plans, vacuum/analyze, statistics, parallelism, and connection management. 
  • Leverage RDBMS (PostgreSQL) extensions such as: 
    • pg_trgm for fuzzy matching and probabilistic candidate search (a brief sketch follows this list).
    • fuzzystrmatch for phonetic name matching; pgvector for semantic similarity search.
    • hstore, jsonb for flexible attribute storage. 
  • Implement RBAC, row-level security, partitioning, and logical replication for scalable MDM deployment.
  • Work with stored procedures, functions, and triggers for data quality checks and lineage automation. 
  • Implement HA/DR, backup/restore, database-level encryption (at rest, in transit), and column-level encryption for PII/PHI data.
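
As a minimal sketch of the pg_trgm-based fuzzy matching mentioned above (the party table and its columns are hypothetical, used only for illustration):

    -- Enable trigram support (requires appropriate privileges).
    CREATE EXTENSION IF NOT EXISTS pg_trgm;

    -- A GIN trigram index keeps similarity lookups fast at scale.
    CREATE INDEX IF NOT EXISTS idx_party_full_name_trgm
        ON party USING gin (full_name gin_trgm_ops);

    -- Rank potential duplicates of an inbound name above the similarity threshold.
    SELECT party_id,
           full_name,
           similarity(full_name, 'Jon Smiht') AS score
    FROM   party
    WHERE  full_name % 'Jon Smiht'
    ORDER  BY score DESC
    LIMIT  10;

The % operator filters by the session's pg_trgm.similarity_threshold (0.3 by default), so candidate generation and ranking both stay inside the database.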

 

Good to Have  

  • Knowledge of Master Data Management (MDM) domains such as Customer, Product, etc.
  • Experience with graph databases like Neo4j for relationship and lineage tracking. 
  • Knowledge of probabilistic and deterministic matching, ML-based entity resolution, or AI-driven data mastering.
  • Experience in data cataloging, data lineage tools, or metadata management platforms.
  • Familiarity with data security frameworks and Well-Architected Framework principles. 

 

Soft Skills  

  • Strong analytical, conceptual and problem-solving skills. 
  • Ability to collaborate in a cross-functional, agile environment. 
  • Excellent communication and documentation skills. 
  • Self-driven, proactive, and capable of working with minimal supervision. 
  • Strong desire to innovate and build scalable, reusable data frameworks.

 

Education  

  • Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related discipline.
  • Certifications in AWS/Azure, Informatica, or Data Architecture are a plus.



Benefits

Why Join Us  

  • Be part of a cutting-edge MDM product initiative blending data architecture, engineering, AI/ML, and cloud-native design.
  • Opportunity to shape the next-generation data mastering framework for multiple industry domains. 
  • Gain deep exposure to data mastering, lineage, probabilistic search, and graph-based relationship management.
  • Competitive compensation, flexible working and a technology-driven culture. 





Apply for this Position

Ready to join? Click the button below to submit your application.

Submit Application