Job Description

Key Responsibilities

Design, develop, and maintain end-to-end data engineering pipelines on AWS
Build and optimize ETL/ELT pipelines using Python and PySpark
Work with structured and semi-structured data using SQL
Ingest data from multiple sources (APIs, databases, streaming, files)
Perform data cleaning, transformation, and validation
Implement data quality checks and monitoring
Optimize data processing for performance and cost
Collaborate with data scientists, analysts, and stakeholders
Support the entire data engineering lifecycle: ingestion → processing → storage → analytics

Mandatory Technical Skills

AWS: S3, Glue, EMR, Lambda, Redshift, Athena
Programming: Python, PySpark
Databases: Strong SQL (PostgreSQL / MySQL / Redshift)
Data Processing: Batch and streaming data pipelines
ETL Tools: AWS Glue / Spark-based frameworks

Good to Have

Experience with Airflow or AWS Step Functions
Knowledge of Delta Lake / Iceberg / Hudi
Exposure to CI/CD for data pipelines
Experience with data warehousing and analytics

Education

B.E. / B.Tech / M.Tech / MCA or equivalent
