Job Description

Technical Skills Required:

  • At least intermediate-level proficiency in AWS ETL (Glue, Lambda, Batch, EMR), API development (FastAPI + core Python), and Redshift
  • AWS Redshift:
      • Experience creating provisioned clusters
      • Expert knowledge of workload management (WLM)
      • Materialized views, DDL optimizations, JSON handling
      • AWS Glue-to-Redshift loads
      • Row- and column-level security
  • AWS Glue (ETL, jobs, crawlers, Data Catalog including Iceberg, workflows, DynamicFrames)
  • Experience with batch and streaming workloads
  • Performance Tuning & Cost Optimization
  • IAM, KMS, Secrets Manager, and fine-grained access control (encryption good to have)
  • PySpark, Python
  • Lambda, Amazon S3, Athena, RDS
  • Apache Parquet, JSON, CSV
  • Data Lake Design & Implementation


Good to have:

  • CI/CD and infrastructure as code (Terraform, AWS CDK, or CloudFormation)
  • Data Lineage & Governance
  • Kinesis/Kafka/Glue Streaming/AWS Batch


Key Responsibilities:

  • Collaborate with business stakeholders to analyze data requirements and define ETL & API architecture aligned with business goals.
  • Design and implement scalable ETL workflows using AWS Glue and PySpark for ingesting structured and semi-structured data into the AWS data lake.
  • Develop reusable Glue jobs and crawlers for automated metadata cataloging and data transformations.
  • Optimize Glue job performance using dynamic frame partitioning, job bookmarking, and parallelism tuning.
  • Integrate Glue with other AWS services such as S3, Athena, Redshift, Lambda, and CloudWatch for end-to-end orchestration and monitoring.
  • Lead data lakehouse implementation leveraging Glue with Iceberg for versioned, transactional data storage.
  • Ensure secure access to datasets using fine-grained IAM policies and Lake Formation (good to have).
  • Mentor junior engineers, enforce coding best practices, and participate in code reviews and architectural discussions.
