Job Description

Responsibilities:

  • Create and maintain optimal data pipeline architecture; assemble large, complex data sets that meet functional and non-functional requirements.
  • Design the right schemas to support the functional requirements and consumption patterns.
  • Design and build production data pipelines from ingestion to consumption.
  • Build the preprocessing and postprocessing needed to prepare data in its various forms for training, retraining, and inference ingestion as required.
  • Create data visualization and business intelligence tools that give stakeholders and data scientists the business and solution insights they need.
  • Identify, design, and implement internal process improvements: automating manual data processes, optimizing data delivery, etc.
  • Ensure our data is separated and secure across national boundaries through multiple data centers.


Requirements and Skills

  • You should have a bachelor’s or master’s degree in Computer Science, Information Technology, or another quantitative field.
  • You should have at least 8 years of experience as a data engineer supporting large data transformation initiatives related to machine learning, including building and optimizing pipelines and data sets.
  • Strong analytical skills for working with unstructured datasets.
  • Experience with Azure cloud services: Azure Data Factory (ADF), Azure Data Lake Storage (ADLS), HDInsight, Databricks, Application Insights, etc.
  • Experience building ETL pipelines using Spark.
  • Experience with object-oriented and functional scripting languages: Python, PySpark, etc.
  • Experience with big data tools: Hadoop, Spark, Kafka, etc.
  • Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
  • You should be a good team player, committed to the success of the team and the overall project.
