Job Description
Qualifications
- 8 years of hands-on experience with Big Data technologies – pySpark (DataFrame and SparkSQL), Hadoop, and Hive (see the pySpark sketch after this list)
- Good hands-on experience with Python and Bash scripting
- Good understanding of SQL and data warehouse concepts
- Strong analytical, problem-solving, data analysis and research skills
- Demonstrated ability to think outside the box rather than rely solely on readily available tools
- Excellent communication, presentation and interpersonal skills are a must
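To illustrate the pySpark requirement above, here is a minimal sketch showing the same aggregation written once with the DataFrame API and once with SparkSQL. The input path and the column names (region, amount) are hypothetical placeholders, not part of this role's actual data.

```python
# Minimal pySpark sketch: load a Parquet file into a DataFrame, then run
# the same aggregation via the DataFrame API and via SparkSQL.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-demo").getOrCreate()

# Hypothetical input path and schema (columns: region, amount)
orders = spark.read.parquet("s3://example-bucket/orders/")

# DataFrame API: total amount per region
df_totals = orders.groupBy("region").agg(F.sum("amount").alias("total_amount"))

# Equivalent SparkSQL: register a temp view and query it
orders.createOrReplaceTempView("orders")
sql_totals = spark.sql(
    "SELECT region, SUM(amount) AS total_amount FROM orders GROUP BY region"
)

df_totals.show()
sql_totals.show()
```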
Good to have:
- Hands-on experience with cloud-provided Big Data services (e.g., IAM, Glue, EMR, Redshift, S3, Kinesis)
- Experience orchestrating workflows with Airflow or any other job scheduler (a minimal Airflow sketch follows this list)
- Experience migrating workloads from on-premises to cloud, as well as cloud-to-cloud migrations
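As a reference point for the orchestration item above, here is a minimal Airflow sketch (assuming Airflow 2.4+): a daily DAG with an extract task followed by a load task. The DAG id, schedule, and shell commands are hypothetical.

```python
# Minimal Airflow sketch: a daily DAG where the load step runs only
# after the extract step succeeds.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="example_etl",            # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract = BashOperator(
        task_id="extract",
        bash_command="python extract.py",  # hypothetical script
    )
    load = BashOperator(
        task_id="load",
        bash_command="python load.py",     # hypothetical script
    )

    # Declare the dependency: extract, then load
    extract >> load
```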
Roles & Responsibilities
- Develop efficient ETL pipelines per business requirements, following development standards and best practices.
- Perform integration testing of the developed pipelines in the AWS environment.
- Provide estimates for development, testing, and deployment across different environments.
- Participate in code peer reviews to ensure our applications comply with best practices.
- Create cost-effective AWS pipelines using the required AWS services, e.g. S3, IAM, Glue, EMR, and Redshift (a minimal Glue job sketch follows this list).
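As a sketch of the kind of pipeline described above, the following is a minimal AWS Glue job script (pySpark) that reads raw CSV from S3, applies a simple filter, and writes Parquet back to S3. Bucket names, prefixes, and column names (status) are hypothetical, not actual project resources.

```python
# Minimal AWS Glue job sketch: S3 CSV in, filtered Parquet out.
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from pyspark.sql import functions as F

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session

job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read raw CSV data from S3 (hypothetical bucket/prefix)
raw = spark.read.option("header", "true").csv("s3://example-raw-bucket/input/")

# Keep only completed records and stamp the load time
cleaned = raw.filter(F.col("status") == "completed").withColumn(
    "load_ts", F.current_timestamp()
)

# Write the curated output as Parquet (hypothetical bucket)
cleaned.write.mode("overwrite").parquet("s3://example-curated-bucket/output/")

job.commit()
```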
Apply for this Position
Ready to join? Submit your application below.