Job Description

We are looking for an MLOps Engineer to join our team. In this role, you will bridge the gap between Data Science and Operations. We are not looking for a generalist software engineer; we need someone who has specifically applied DevOps principles to Machine Learning: building CI/CD pipelines for models, managing Kubernetes clusters for training and inference, and speaking the language of Data Scientists (PyTorch, tensors, GPUs).


About SatSure

SatSure is a deep-tech, decision-intelligence company working at the nexus of agriculture, infrastructure, and climate action, creating impact for millions in the developing world. We want to make insights from Earth observation data accessible to all.

Join us to be at the forefront of building a deep tech company from India that solves problems for the globe.


Roles & Responsibilities

  • ML Pipelines: Design and build CI/CD pipelines specifically for ML workflows (training triggers, model versioning, testing) using tools like Jenkins, Bitbucket, or GitHub Actions.
  • Orchestration: Deploy, configure, and optimize Kubernetes clusters to support containerized deep learning applications (managing GPU resources, node scaling).
  • Model Serving: Work with Data Scientists to containerize and deploy PyTorch models using Docker and serving frameworks (KServe, NVIDIA Triton Inference Server).
  • Infrastructure: Manage cloud infrastructure (AWS) for data processing and model storage (S3, ECR, IAM).
  • GitOps: Implement GitOps practices to manage the lifecycle of both infrastructure and ML configurations.
  • Monitoring: Implement monitoring for both system health (CPU/Memory) and Model Drift/Performance using tools like Prometheus, Grafana, or ELK.
  • Automation: Automate repetitive tasks related to dataset management and environment setup using Python.


Qualification

  • 1–3 years of relevant experience in ML Engineering, MLOps, or Platform Engineering roles.
  • Mandatory: Functional understanding of Machine Learning/Deep Learning concepts and the PyTorch framework.
  • Mandatory: Prior experience working with Kubernetes and CI/CD in a production environment.
  • Bachelor’s degree in Computer Science, IT, or a related field; non-IT degrees with relevant experience are also acceptable.


Must-have Skills

  • Core MLOps: Practical experience deploying ML/DL models in production systems. You understand the difference between deploying a web app and deploying a deep learning model.
  • Kubernetes: Strong hands-on experience with K8s (deployments, services, ingress) and preferably experience scheduling GPU workloads.
  • CI/CD & GitOps: Proficiency in building pipelines (Jenkins/Bitbucket) and understanding GitOps workflows (ArgoCD/Flux).
  • ML Fundamentals: Working knowledge of PyTorch and Python. You should be able to read model code, understand training/inference loops, optimize PyTorch models, and debug environment issues (CUDA, dependencies).
  • Containerization: Expert-level Docker skills (multi-stage builds, reducing image sizes for large ML dependencies).
  • Cloud: Experience with AWS services (EC2, S3, ECR).
  • Linux/Scripting: Strong command of Linux internals and shell scripting.


Good-to-have

  • Experience with ML workflow tools such as KServe, Triton Inference Server, and MLflow.
  • Experience profiling and optimizing PyTorch models for production inference on accelerator platforms such as NVIDIA GPUs, TPUs, and AWS Inferentia.
  • Background in processing Geospatial or Remote Sensing data.


Competencies

  • Technical Translator: Ability to understand requirements from Data Scientists and translate them into robust infrastructure components.
  • Debugging: Excellent troubleshooting skills for complex distributed systems (e.g., debugging why a pod crashed during inference).
  • Collaboration: Strong communication skills to work effectively within a cross-functional team.


Benefits

  • Medical Health Cover for you and your family, including unlimited online doctor consultations.
  • Access to mental health experts for you and your family.
  • Dedicated allowances for learning and skill development.
  • Comprehensive leave policy with casual leaves, paid leaves, marriage leaves, and bereavement leaves.
  • Twice-yearly appraisals.


Interview Process

  • Intro call
  • Assessment (Focus on Kubernetes/Docker/ML Deployment)
  • Interview rounds (typically up to 3 rounds)
  • Culture Round / HR round

Apply for this Position

Ready to join? Click the button below to submit your application.