Job Description

We are seeking a highly skilled and motivated Big Data Engineer to join our dynamic team. The ideal candidate will have extensive experience in designing, developing, and optimizing scalable data solutions using the Hadoop ecosystem, with a strong focus on PySpark and Hive. This role is crucial for building robust ETL pipelines, ensuring data quality, and driving performance improvements across our Big Data initiatives.

Key Responsibilities

  • Design, develop, and maintain efficient and scalable Big Data solutions using PySpark, Apache Hive, and Hadoop ecosystem tools (e.g., Sqoop).

  • Demonstrate strong Python programming knowledge across all development work.

  • Implement and optimize ETL (Extract, Transform, Load) processes and data warehousing solutions, including Fact and Dimension tables and Slowly Changing Dimensions (SCD-2); see the sketch after this list.

  • Conduct in-depth data analysis, troubleshoot complex data issues, and ensure the accuracy, reliability, and integrity of data.
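
  For illustration only, here is a minimal sketch of the SCD-2 (expire-and-insert) pattern referenced above, written in PySpark. The dimension layout, column names (customer_id, address, start_date, end_date, is_current), and sample data are hypothetical, not a prescribed implementation.

    # A minimal sketch of an SCD Type 2 update in PySpark. All table layouts
    # and sample data below are hypothetical, illustrating only the
    # expire-and-insert pattern.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("scd2_sketch").getOrCreate()

    # Current state of the dimension: one open (is_current = true) row per key.
    dim = spark.createDataFrame(
        [(1, "Alice", "Old Street", "2023-01-01", None, True)],
        "customer_id INT, name STRING, address STRING, "
        "start_date STRING, end_date STRING, is_current BOOLEAN",
    )

    # Incoming snapshot from the staging layer.
    updates = spark.createDataFrame(
        [(1, "Alice", "New Street")],
        "customer_id INT, name STRING, address STRING",
    )

    today = F.current_date().cast("string")

    # Keys whose tracked attribute (address) differs from the open dimension row.
    changed_keys = (
        dim.filter("is_current").alias("d")
           .join(updates.alias("u"), "customer_id")
           .filter(F.col("d.address") != F.col("u.address"))
           .select("customer_id")
           .distinct()
    )

    # 1) Expire the open versions of changed keys: close end_date, drop the flag.
    to_expire = dim.filter("is_current").join(changed_keys, "customer_id", "left_semi")
    expired = (to_expire
               .withColumn("end_date", today)
               .withColumn("is_current", F.lit(False)))

    # 2) Insert the new versions as the open records.
    new_rows = (
        updates.join(changed_keys, "customer_id", "left_semi")
               .withColumn("start_date", today)
               .withColumn("end_date", F.lit(None).cast("string"))
               .withColumn("is_current", F.lit(True))
    )

    # 3) Carry over everything that was not expired (history plus unchanged rows).
    carried = dim.exceptAll(to_expire)

    result = carried.unionByName(expired).unionByName(new_rows)
    result.show()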
