Job Description

Role: Databricks PySpark Developer

Experience: 5+ Years

Location: Bengaluru / Pune / Remote

Company: Vsquare Systems Pvt. Ltd

About Vsquare Systems

Vsquare Systems is a technology-driven organization delivering scalable data, cloud, and digital solutions to global clients. We specialize in building modern data platforms, analytics solutions, and enterprise-grade systems that enable data-driven decision-making.


Role Overview

We are looking for a highly skilled Databricks PySpark Developer to join our data platform implementation team. In this role, you will be responsible for designing, developing, and optimizing scalable ETL pipelines and data workflows using Databricks and Apache Spark. You will work closely with data engineers, data scientists, and BI teams to support advanced analytics and reporting requirements.


Key Responsibilities

1. ETL Development & Data Engineering

  • Design, develop, and maintain scalable ETL processes using Databricks PySpark.
  • Extract, transform, and load data from heterogeneous sources into Data Lake and Data Warehouse environments.
  • Optimize ETL workflows for performance, scalability, and cost efficiency using Spark SQL and PySpark.
  • Implement robust error handling, logging, and monitoring mechanisms for ETL jobs.
  • Design and implement data solutions following Medallion Architecture (Bronze, Silver, Gold layers).
  • Ensure data is cleansed, enriched, validated, and optimized at each layer for analytics consumption.

2. Data Pipeline Management

  • Build and manage advanced data pipelines using Databricks Workflows.
  • Develop and maintain reliable, reusable, and scalable pipelines that ensure data quality and integrity.
  • Collaborate with cross-functional teams to translate business and analytics requirements into efficient data pipelines.

3. Data Analysis & Query Optimization

  • Write, review, and optimize complex SQL queries for data transformation, aggregation, and analysis.
  • Perform query tuning and performance optimization on large-scale datasets within Databricks.

4. Project Coordination & Continuous Improvement

  • Participate in project planning, estimation, and delivery activities.
  • Stay current with the latest features in Databricks, Spark, and cloud data platforms, and recommend best practices.
  • Document ETL processes, data lineage, metadata, and workflows to support data governance and compliance.
  • Mentor junior developers and contribute to team knowledge sharing where required.


Required Qualifications

  • Bachelor’s degree in Computer Science, Engineering, or a related field.
  • 5+ years of experience in ETL/data engineering roles with a strong focus on Databricks and PySpark.
  • Strong proficiency in Python, with hands-on experience developing and debugging PySpark applications.
  • In-depth understanding of Apache Spark architecture, including RDDs, DataFrames, and Spark SQL.
  • Expertise in SQL development and optimization for large-scale data processing.
  • Proven experience with data warehousing concepts and ETL frameworks.
  • Strong problem-solving and troubleshooting skills.
  • Excellent communication and collaboration skills.


Preferred Qualifications

  • Experience working on cloud platforms, preferably AWS.
  • Hands-on experience with tools such as Databricks, Snowflake, Tableau, or similar data platforms.
  • Strong understanding of data governance, data quality, and best practices in data engineering.
  • Relevant certifications in Databricks, PySpark, Spark SQL, or cloud technologies.
