Job Description

Roles & Responsibilities:

  • Develop and maintain data pipelines and workflows using PySpark on Databricks
  • Write Python scripts for data processing, transformation, and integration with Spark
  • Utilize PySpark and related libraries (e.g., pyspark-ai) to incorporate AI and machine learning functionalities into Spark workflows
  • Perform data analysis, transformations, and aggregations on large datasets
  • Deploy and optimize machine learning models within the Databricks environment
  • Collaborate with data engineers, data scientists, and business teams to understand requirements and deliver solutions
  • Use the Databricks platform to manage, monitor, and optimize Spark jobs and workflows
  • Work with the English SDK for Apache Spark to facilitate natural language interactions with Spark DataFrames
  • Ensure code quality, performance, and scalability of data pipelines and Spark jobs

Skills Required
Python, PySpark, Databricks, Apache Spark, ETL
