Job Description
Responsibilities:
- Create and maintain optimal data pipeline architecture; assemble large, complex data sets that meet functional and non-functional requirements.
- Design the right schemas to support the functional requirements and consumption patterns.
- Design and build production data pipelines from ingestion to consumption.
- Create the necessary preprocessing and postprocessing for various forms of data to support training/retraining and inference ingestion as required.
- Create data visualization and business intelligence tools that give stakeholders and data scientists the necessary business and solution insights.
- Identify, design, and implement internal process improvements: automating manual data processes, optimizing data delivery, etc.
- Ensure our data is separated and secure across national boundaries through multiple data centers.
Requirements and Skills
- You should have a bachelor's or master's degree in Computer Science, Information Technology, or another quantitative field.
- You should have at least 8 years of experience as a data engineer supporting large data transformation initiatives related to machine learning, including building and optimizing pipelines and data sets.
- Strong analytical skills related to working with unstructured datasets.
- Experience with Azure cloud services: ADF, ADLS, HDInsight, Databricks, Application Insights, etc.
- Experience building ETL pipelines using Spark.
- Experience with object-oriented/functional scripting languages: Python (including PySpark), etc.
- Experience with big data tools: Hadoop, Spark, Kafka, etc.
- Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
- You should be a good team player, committed to the success of the team and the overall project.