Job Description
Roles & Responsibilities:
- Develop and maintain data pipelines and workflows using PySpark on Databricks
- Write Python scripts for data processing, transformation, and integration with Spark
- Utilize PySpark and related libraries (e.g., pyspark-ai) to incorporate AI and machine learning functionalities into Spark workflows
- Perform data analysis, transformations, and aggregations on large datasets
- Deploy and optimize machine learning models within the Databricks environment
- Collaborate with data engineers, data scientists, and business teams to understand requirements and deliver solutions
- Use the Databricks platform to manage, monitor, and optimize Spark jobs and workflows
- Work with the English SDK for Apache Spark to facilitate natural language interactions with Spark DataFrames
- Ensure code quality, performance, and scalability of data pipelines and Spark jobs
Skills Required
Python, PySpark, Databricks, Apache Spark, ETL