Job Description
Overview
What You'll Achieve:
- Design, develop, and maintain scalable data processing solutions across on-premises and cloud environments using Python and Apache Spark.
- Optimize and fine-tune Spark jobs for performance, including resource utilization, shuffling, partitioning, and caching, to ensure maximum efficiency in large-scale data environments.
- Design and implement scalable, fault-tolerant data pipelines with end-to-end monitoring, alerting, and logging.
- Leverage AWS cloud services to build and manage data pipelines and distributed processing workloads (2+ years of AWS experience preferred).
- Develop and optimize SQL queries across relational database (RDBMS) and data warehouse systems.
- Apply design patterns and best practices for efficient data modeling, partitioning, and distributed system performance.
- Use Git (or an equivalent tool) for source control and maintain strong unit and integration tests.