Job Description
About the role:
The MLOps Support Engineer is an operations-first role, focused on ensuring AI/ML systems remain stable, observable, and supportable in production environments. This is not a data science or feature development role.
The primary objective is to maintain continuous performance of ML models and associated pipelines with minimal disruption to both internal and client-facing services. You will provide Tier 1 and Tier 2 support, escalating to Tier 3 Engineering as needed.
What you’ll do:
- Provide Tier 1 / Tier 2 operational support for AI/ML solutions.
- Identify failed jobs, degraded pipelines, or performance anomalies.
- Triage incidents, investigate issues, and coordinate escalation to Tier 3 Engineering.
- Participate in on-call rotas once established.
- Validate that pipelines and jobs complete successfully.
- Monitor data pipeline health, model execution, and basic performance metric...