Job Description

Qualifications


  • 8 years of hands-on experience with Big Data technologies – PySpark (DataFrame and Spark SQL), Hadoop, and Hive
  • Good hands-on experience with Python and Bash scripting
  • Good understanding of SQL and data warehouse concepts
  • Strong analytical, problem-solving, data analysis and research skills
  • Demonstrated ability to think outside the box rather than depending on readily available tools
  • Excellent communication, presentation and interpersonal skills are a must


Good to have:


  • Hands-on experience with cloud-provided Big Data services (e.g., IAM, Glue, EMR, Redshift, S3, Kinesis)
  • Experience with orchestration using Airflow or any other job scheduler
  • Experience migrating workloads from on-premises to the cloud, as well as cloud-to-cloud migrations


Roles & Responsibilities


  • Develop efficient ETL pipelines as per business requirements, following the development standards and best practices.
  • Perform integration testing of the pipelines created in the AWS environment.
  • Provide estimates for development, testing, and deployment across different environments.
  • Participate in code peer reviews to ensure our applications comply with best practices.
  • Create cost-effective AWS pipelines using the required AWS services, e.g., S3, IAM, Glue, EMR, and Redshift.
