Job Description
Role Overview
We are looking for a Manager / Lead – Data Engineering & ETL Operations to drive enterprise-scale data integration and analytics solutions. The role requires strong hands-on expertise in ETL, cloud platforms, and data warehousing, along with the ability to lead teams, manage delivery, and solve complex data problems.
This is a hands-on leadership role, ideal for professionals transitioning from Senior Engineer to Manager.
Key Responsibilities
Lead ETL development, operations, and production support
Design and manage cloud-based data platforms (AWS / Azure / GCP)
Build and optimize data warehouse solutions using Snowflake / Redshift
Implement dimensional models including SCD Type 1/2/3 (see the Type 2 sketch after this list)
Develop and orchestrate pipelines using SSIS, Matillion, SnapLogic, and PySpark
Ensure data quality, performance, scalability, and reliability
Support downstream analytics and reporting (Tableau / Power BI)
Lead small-to-medium teams, perform code reviews, and mentor engineers
Work closely with business and analytics stakeholders
Drive root-cause analysis and problem resolution for data issues
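As context for the SCD responsibility above: a Type 2 dimension preserves history by expiring the current row and inserting a new version whenever a tracked attribute changes. Below is a minimal PySpark/Delta sketch of that close-and-insert pattern; the table and column names (customer_dim, customer_stage, customer_id, address, is_current, start_date, end_date) are hypothetical, and on Snowflake or Redshift the same logic would typically be expressed as a SQL MERGE.

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("scd2_sketch").getOrCreate()

# Hypothetical tables: customer_dim is the Type 2 dimension table,
# customer_stage holds the latest source extract (customer_id assumed string).
dim = DeltaTable.forName(spark, "customer_dim")
stage = spark.table("customer_stage")

# Rows whose tracked attribute changed. They are fed into the MERGE twice:
# once with a NULL merge_key (no match -> insert the new current version)
# and once keyed normally (match -> close out the old version).
changed = (stage.alias("s")
           .join(dim.toDF().alias("d"),
                 (F.col("s.customer_id") == F.col("d.customer_id")) &
                 (F.col("d.is_current") == F.lit(True)))
           .where(F.col("s.address") != F.col("d.address"))
           .select("s.*"))

staged = (changed.withColumn("merge_key", F.lit(None).cast("string"))
          .unionByName(stage.withColumn("merge_key", F.col("customer_id"))))

(dim.alias("d")
 .merge(staged.alias("s"),
        "d.customer_id = s.merge_key AND d.is_current = true")
 .whenMatchedUpdate(                      # expire the superseded row
     condition="s.address <> d.address",
     set={"is_current": "false", "end_date": "current_date()"})
 .whenNotMatchedInsert(values={           # insert the new current row
     "customer_id": "s.customer_id",
     "address": "s.address",
     "start_date": "current_date()",
     "end_date": "cast(null as date)",
     "is_current": "true"})
 .execute())
```

The NULL-merge-key trick lets a single MERGE both expire the old row and insert its replacement in one pass, rather than running a separate UPDATE and INSERT.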
Requirements
Must-Have
Strong hands-on experience in ETL & Data Engineering
Expertise in:
Snowflake / Redshift
SSIS, Matillion, SnapLogic
PySpark
Solid understanding of Data Warehousing & slowly changing dimensions (SCD)
Experience with AWS / Azure / GCP
Strong problem-solving and analytical skills
Team leadership and delivery ownership experience
Good-to-Have
Pharma / Life Sciences domain exposure
Tableau and/or Power BI experience
Experience with production support & data ops
Key Responsibilities
Pipeline Development: Design, build, and maintain efficient and scalable ETL/ELT pipelines on the Databricks platform using PySpark, SQL, and Delta Live Tables (DLT)
Lakehouse Management: Implement and manage data solutions within the Databricks Lakehouse Platform, ensuring best practices for data storage, governance, and management using Delta Lake and Unity Catalog
Code Optimization: Write high-quality, maintainable, and optimized PySpark code for large-scale data processing and transformation tasks
AI & ML Integration: Collaborate with data scientists to productionize machine learning models, utilizing Databricks AI features such as the Feature Store, MLflow for model lifecycle management, and AutoML for accelerating model development
Data Quality & Governance: Implement robust data quality checks and validation frameworks to ensure data accuracy, completeness, and reliability within Delta tables (a minimal DLT sketch follows this section)
Performance Tuning: Monitor, troubleshoot, and optimize the performance of Databricks jobs, clusters, and SQL warehouses to ensure efficiency and cost-effectiveness
Collaboration: Work closely with data analysts, data scientists, and business stakeholders to understand their data requirements and deliver effective solutions
Documentation: Create and maintain comprehensive technical documentation for data pipelines, architectures, and processes
Required Qualifications & Skills
Experience: 3-5 years of hands-on experience in a data engineering role
Databricks Expertise: Proven, in-depth experience with the Databricks platform, including Databricks Workflows, Notebooks, Clusters, and Delta Live Tables
Programming Skills: Strong proficiency in Python and extensive hands-on experience with PySpark for data manipulation and processing
Data Architecture: Solid understanding of modern data architectures, including the Lakehouse paradigm, Data Lakes, and Data Warehousing
Delta Lake: Hands-on experience with Delta Lake, including schema evolution, ACID transactions, and time travel features
SQL Proficiency: Excellent SQL skills and the ability to write complex queries for data analysis and transformation
Databricks AI: Practical experience with Databricks AI/ML capabilities, particularly MLflow and the Feature Store
Cloud Experience: Experience working with at least one major cloud provider (AWS, Azure, or GCP)
Problem-Solving: Strong analytical and problem-solving skills with the ability to debug complex data issues
Communication: Excellent verbal and written communication skills
Preferred Qualifications
Databricks Certified Data Engineer Associate/Professional certification
Experience with CI/CD tools (e.g., Jenkins, Azure DevOps, GitHub Actions) for data pipelines
Familiarity with streaming technologies such as Structured Streaming
Knowledge of data governance tools and practices within Unity Catalog
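Since the Databricks responsibilities above center on DLT pipelines with built-in quality checks, here is a minimal sketch of that pattern: a bronze table ingested with Auto Loader and a silver table guarded by expectations. The storage path, table names, and columns (order_id, amount) are hypothetical, and the snippet runs only inside a Databricks Delta Live Tables pipeline, where the dlt module and the spark session are provided by the runtime.

```python
import dlt
from pyspark.sql import functions as F

# Bronze: incrementally ingest raw JSON files with Auto Loader
# (the storage path is hypothetical).
@dlt.table(comment="Raw orders landed from cloud storage.")
def orders_bronze():
    return (spark.readStream
            .format("cloudFiles")
            .option("cloudFiles.format", "json")
            .load("/mnt/raw/orders"))

# Silver: expectations act as declarative quality gates; rows failing
# either check are dropped, and pass/fail counts surface in the DLT UI.
@dlt.table(comment="Validated orders with basic quality checks.")
@dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")
@dlt.expect_or_drop("positive_amount", "amount > 0")
def orders_silver():
    return (dlt.read_stream("orders_bronze")
            .withColumn("ingested_at", F.current_timestamp()))
```

Swapping expect_or_drop for expect (warn only) or expect_or_fail (abort the update) is how the strictness of each quality gate is tuned.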
Apply for this Position
Ready to join? Submit your application below.