Job Description

Role Overview

We are seeking a Senior Data / Streaming Engineer with 5–7 years of experience to design, build, and operate scalable real-time and batch data processing platforms. The role focuses on stream processing, cloud-native data pipelines, and analytics systems running on AWS.

You will work closely with data, platform, and product teams to deliver reliable, high-performance data solutions supporting real-time analytics, monitoring, and downstream consumption.

Key Responsibilities

  • Design, develop, and maintain real-time stream processing applications using Apache Flink / PyFlink and Spark, including state management and event-time processing with watermarks.
  • Build and optimize Python-based data pipelines using libraries such as Pandas, Polars, boto3, and PyArrow for data transformation and integration.
  • Implement and manage Kafka-based streaming architectures (Apache Kafka / AWS MSK), including topic design, partitioning, and consumer/producer optimization.
  • Develop and operate cloud-native data platforms on AWS, leveraging services such as S3, Managed Flink, CloudWatch, MSK, and IAM.
  • Write and optimize SQL-based transformations using Flink SQL, ensuring efficient query execution and scalable data processing.
  • Store, query, and analyze large datasets using ClickHouse, and build Grafana dashboards for observability, analytics, and system monitoring.
  • Orchestrate batch and streaming workflows using Apache Airflow, including DAG design, scheduling, and operational monitoring.
  • Containerize applications using Docker and support deployments on Kubernetes, following best practices for scalability and resilience.
  • Collaborate with DevOps, platform, and analytics teams to improve system reliability, performance, and cost efficiency.
  • Participate in code reviews, technical design discussions, and production support activities.

Required Skills & Experience

  • 5–7 years of professional experience in data engineering, streaming platforms, or distributed systems.
  • Strong hands-on experience with Apache Flink / PyFlink and/or Apache Spark for stream and batch processing.
  • Proficient in Python for data engineering and automation (Pandas, Polars, boto3, PyArrow).
  • Solid experience with Apache Kafka or AWS MSK, including streaming concepts such as partitions, offsets, and consumer groups.
  • Strong understanding of AWS cloud services, particularly S3, MSK, Managed Flink, CloudWatch, and IAM.
  • Advanced SQL skills, including data transformation and query optimization (Flink SQL preferred).
  • Experience with ClickHouse or similar OLAP databases, and Grafana for dashboards and monitoring.
  • Working knowledge of Docker and Kubernetes fundamentals.
  • Experience with Apache Airflow for pipeline orchestration and scheduling.
  • Good understanding of distributed systems, fault tolerance, and performance tuning.

Apply for this Position

Ready to join? Submit your application below.
