Job Description

Job Title: Apprentice Leader – Data Engineering

Experience: 4–7 years

Location: Bangalore (Onsite)

Employment Type: Full-Time

Overview

Our client – one of the largest data science companies – is seeking to hire a Data Engineer (Data Platform & DevOps) to design, build, and operate scalable data and ML platforms on AWS, Azure, or GCP. This role blends strong data engineering fundamentals with DevOps practices, supporting production-grade data pipelines, ML workflow orchestration, and analytics platforms. You’ll collaborate closely with data science, platform, and product teams to enable reliable, automated, and secure data and ML systems.

What You’ll Do

- Build & Operate Data Pipelines: Design, develop, and maintain robust data pipelines for ingestion, transformation, and analytics using cloud-native and open-source technologies.

- ML Workflow Orchestration: Orchestrate end-to-end machine learning workflows, from data preparation through training and deployment, e.g., as Airflow DAGs (a sketch follows this list).

- DevOps & CI/CD for Data & ML: Design and operate CI/CD pipelines using GitHub Actions and Jenkins to automate build, test, and deployment of data pipelines, Airflow DAGs, Databricks jobs, and ML services.

- Artifact & Dependency Management: Manage Python packages, Docker images, and ML artifacts using JFrog Artifactory or similar artifact repositories.

- Cloud Platforms: Design and operate data platforms using Databricks, AWS, Azure, or GCP, along with cloud services such as object storage, managed compute, and data warehouses.

- Infrastructure as Code: Provision and manage infrastructure using Terraform or equivalent IaC tools, following best practices for security, scalability, and cost efficiency.

- Observability & Reliability: Implement monitoring, logging, alerting, and data quality checks for data pipelines, ML workflows, and analytics jobs.

- Security & Governance: Implement IAM, secrets management, and least-privilege access across cloud platforms and CI/CD pipelines.

- Cross-Functional Collaboration: Partner with data scientists, ML engineers, analysts, and platform teams to productionize and scale data and ML workloads.
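
To make the pipeline, orchestration, and data quality responsibilities above concrete, here is a minimal sketch of the kind of workflow this role owns: an Airflow DAG that ingests raw data, transforms it, runs a quality gate, and publishes the result. This is an illustrative sketch only, assuming Airflow 2.4+ and pandas (with pyarrow for Parquet); every dataset name and path in it is a hypothetical placeholder, not a detail of this role.

```python
# Illustrative sketch only (assuming Airflow 2.4+ and pandas);
# all dataset names and paths are hypothetical placeholders.
from datetime import datetime

import pandas as pd
from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def sales_daily_pipeline():
    @task
    def ingest() -> str:
        # In production this would pull from object storage (S3/ADLS/GCS);
        # simulated here so the sketch is self-contained.
        raw = pd.DataFrame(
            {"order_date": ["2024-01-01", "2024-01-01"], "amount": [120.0, 80.0]}
        )
        path = "/tmp/sales_raw.csv"
        raw.to_csv(path, index=False)
        return path

    @task
    def transform(raw_path: str) -> str:
        # Aggregate to one row per day; at scale this step might run
        # as a Databricks or Spark job instead.
        daily = (
            pd.read_csv(raw_path)
            .groupby("order_date", as_index=False)["amount"]
            .sum()
        )
        out_path = "/tmp/sales_daily.parquet"
        daily.to_parquet(out_path)
        return out_path

    @task
    def quality_check(path: str) -> str:
        # A failing check fails the task, which surfaces via monitoring
        # and alerting rather than silently publishing bad data.
        daily = pd.read_parquet(path)
        if daily.empty:
            raise ValueError("quality check failed: no rows produced")
        if (daily["amount"] < 0).any():
            raise ValueError("quality check failed: negative daily totals")
        return path

    @task
    def publish(path: str) -> None:
        # Placeholder for loading into a warehouse or analytics store.
        print(f"publishing {path}")

    publish(quality_check(transform(ingest())))


sales_daily_pipeline()
```

In a CI/CD setup like the one described above, a DAG file such as this would be linted and tested in GitHub Actions or Jenkins, then deployed to the Airflow environment as part of the release workflow.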

What You’ll Bring

- Experience: 4–7 years of experience as a Data Engineer or Platform Engineer working on data-intensive systems with DevOps practices.

- Cloud Platforms: Hands-on experience with one or more cloud platforms: AWS, Azure, or GCP.

- Databricks: Experience building and operating data pipelines and analytics or ML workloads on Databricks, hosted on AWS, Azure, or GCP.

- Data Orchestration: Strong experience orchestrating data and ML pipelines with Apache Airflow, AWS-native services (Glue, Lambda, Step Functions), or Azure Data Factory.

- CI/CD & DevOps: Hands-on experience with GitHub Actions and/or Jenkins, including pipeline automation and release workflows.

- Artifact Management: Experience with JFrog Artifactory (or similar tools) for managing artifacts and dependencies.

- Programming: Strong proficiency in Python; experience with Bash or shell scripting.

- Containers & Packaging: Experience with Docker; familiarity with Kubernetes (EKS, AKS, or GKE) is a plus.

- Data Technologies: Strong SQL skills and understanding of data modeling and analytical data stores.

- Security & Compliance: Knowledge of IAM, secrets management, and secure CI/CD and data pipeline practices (see the secrets-management sketch after this list).

- Collaboration: Strong communication skills and ability to work effectively with cross-functional teams.
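
As a concrete, deliberately small instance of the secrets-management expectation above, the sketch below reads a warehouse credential from AWS Secrets Manager at runtime instead of hard-coding it. AWS and boto3 are assumed purely for illustration; the secret name is a hypothetical placeholder, and Azure Key Vault or GCP Secret Manager would fill the same role on the other clouds.

```python
# Illustrative sketch, assuming boto3 can obtain AWS credentials
# (e.g., from an attached IAM role). The secret name below is a
# hypothetical placeholder.
import json

import boto3


def get_warehouse_credentials(secret_id: str = "prod/warehouse/credentials") -> dict:
    # Fetching at runtime keeps the credential out of source control,
    # container images, and CI logs.
    client = boto3.client("secretsmanager")
    response = client.get_secret_value(SecretId=secret_id)
    return json.loads(response["SecretString"])


if __name__ == "__main__":
    creds = get_warehouse_credentials()
    # The calling role only needs secretsmanager:GetSecretValue on this
    # one secret, which is what least-privilege access looks like here.
    print("fetched credentials for user:", creds.get("username"))
```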

Nice to Have

- Streaming and event-driven architectures (Kafka, Kinesis, Pub/Sub, Event Hubs).
- ML platforms and tooling (MLflow, SageMaker, Azure ML, Vertex AI).
- Big data frameworks (Spark, PySpark).
- Data quality, lineage, and observability tooling.
- Familiarity with Informatica PowerCenter and Cloudera Data Platform.

Preferred Qualifications

- BE/B.Tech or M.Tech degree in Computer Science, Engineering, or related fields.
- Prior experience on data engineering projects in the pharma and clinical domain.
- AWS/Azure/GCP certifications (e.g., Certified Data Engineer or Solutions Architect).

Apply for this Position

Ready to join? Click the button below to submit your application.

Submit Application