Job Description

Job title: Data Engineer

Location: Bangalore - Hybrid

Experience: 4+ Years


Key Responsibilities:  

  • Administer and manage Databricks clusters, workspaces, and resources. 
  • Monitor platform health, availability, and performance. 
  • Develop and maintain data pipelines for ingesting, transforming, and loading data into Databricks. 
  • Optimize ETL processes for efficiency and scalability. 
  • Collaborate with data scientists and analysts to build and optimize data processing and analytics workflows using Databricks notebooks. 
  • Develop and implement data transformation and data analysis solutions. 
  • Implement and manage data lakes within Databricks for structured and unstructured data storage. 
  • Ensure data lake security and access controls. 
  • Identify and address performance bottlenecks in Databricks workloads. 
  • Optimize queries, data pipelines, and resource allocation for efficient data processing. 
  • Implement security measures to protect data within Databricks, including access controls, encryption, and auditing. 
  • Ensure compliance with data privacy regulations and industry standards. 
  • Integrate Databricks with various data sources, databases, and external systems to enable seamless data flow. 
  • Ensure data source connectivity and data consistency. 
  • Manage cluster scaling and resource allocation to meet performance and cost requirements. 
  • Optimize cluster configurations for specific workloads. 
  • Identify and resolve technical issues, errors, and anomalies in Databricks workflows. 
  • Provide support to users encountering problems. 
  • Collaborate with data engineers, data scientists, and analysts to understand data requirements and provide technical guidance. 
  • Conduct training sessions and knowledge sharing to empower users with Databricks capabilities. 
  • Create and maintain documentation for Databricks workflows, configurations, and best practices. 
  • Promote and enforce coding and data engineering standards. 
  • Monitor and optimize Databricks costs, including cluster utilization and resource allocation. 
  • Ensure cost-efficient data processing. 
  • Stay current with Databricks updates and enhancements. 
  • Continuously improve Databricks workflows, processes, and infrastructure. 
  • Establish data governance practices within Databricks, including metadata management and data cataloging. 
  • Maintain data lineage and documentation for data assets. 
  • Implement data quality checks and validation processes to ensure data accuracy and reliability. 
  • Develop and enforce data quality standards. 
  • Plan for scalability and resilience in Databricks infrastructure to accommodate growing data volumes and ensure high availability. 
  • Additional duties as assigned. 

Qualifications:  

  • Bachelor's degree in computer science, information technology, data engineering, or a related field; equivalent work experience or relevant certifications may also be considered. 
  • A minimum of 4 years of hands-on experience with Databricks as a data engineer, data scientist, or in a similar role. 
  • Strong experience in data engineering, including designing and developing data pipelines, ETL processes, and data transformations. 
  • Familiarity with big data technologies and frameworks such as Apache Spark, Hadoop, or similar platforms commonly used in conjunction with Databricks. 
  • Proficiency in programming languages commonly used in Databricks, such as Python, Scala, and SQL; knowledge of Scala is preferred. 
  • Proven ability to identify and address performance bottlenecks in Databricks workloads, optimizing query performance and resource allocation. 
  • A commitment to staying updated with the latest Databricks updates, enhancements, and emerging technologies in the data engineering and analytics field. 

Apply for this Position

Ready to join? Click the button below to submit your application.

Submit Application