Job Description
Senior Machine Learning Operations Engineer
Job Summary:
This is a mid-level role on the Innovation Analytics & AI team, responsible for managing the full machine learning lifecycle, including model deployment, versioning, and performance monitoring, to ensure the reliability and scalability of production models. This role collaborates with Solution Architects, data scientists, and cross-functional teams to design and implement robust MLOps pipelines and infrastructure aligned with enterprise architecture. The role applies design patterns and engineering best practices to build maintainable, testable, and extensible systems, while integrating structured and unstructured data sources using tools like Databricks, MLflow, Azure ML, and CI/CD pipelines. This position supports operational excellence, leads innovation efforts, and mentors junior team members to promote modern MLOps practices, continuous delivery, and technical excellence. The role also serves as a subject matter expert, contributing to strategic initiatives and fostering a culture of curiosity and collaboration.
Essential Functions & Responsibilities:
Description:
Model Lifecycle Management & MLOps Pipeline Development:
- Manage the full machine learning lifecycle, including model development, versioning, deployment, and performance monitoring, ensuring reliability and scalability of production models.
- Design and implement MLOps pipelines using tools like Databricks, MLflow, Azure ML, and CI/CD systems to automate testing, tracking, and model rollout.
- Write clean, testable code in Python and R using libraries such as TensorFlow, PyTorch, and scikit-learn, and apply NLP techniques and transformer-based models like GPT in practical scenarios.
- Integrate structured and unstructured data sources to support robust AI/ML workflows in enterprise environments.
Operational Excellence & System Support:
- Identify and resolve recurring system and data issues, contributing to long-term solutions that enhance reliability and reduce incident frequency through root cause analysis and corrective actions.
- Collaborate with developers, data analysts, data scientists, DBAs, and Infrastructure teams to troubleshoot issues, improve data interaction practices, and ensure proper configuration for scalable Data Analytics solutions.
- Monitor system performance, document findings, and contribute to optimization efforts to support high availability and operational efficiency.
Innovation & Technology Adoption:
- Stay at the forefront of industry trends in MLOps, LLMOps, and AI, driving the adoption of cutting-edge technologies and integrating innovative solutions into enterprise AI/ML strategies.
- Participate in the evaluation and integration of modern tools and platforms, supporting strategic initiatives like architecture modernization, test automation, and application security in collaboration with Solution Architects.
- Contribute to internal knowledge-sharing sessions and act as a subject matter expert in modern platforms to foster technical excellence and curiosity.
Mentorship & Team Leadership:
- Mentor junior team members, participate in design reviews, and promote the adoption of modern MLOps tools and practices to support continuous delivery and innovation.
- Share insights and learnings with peers, lead internal learning forums, and encourage cross-functional collaboration to build a culture of technical growth.
- Contribute to backlog prioritization, technical documentation, and the development of scalable MLOps processes to ensure smooth model delivery workflows.
Cross-Functional Collaboration & Strategic Initiatives:
- Partner with Solution Architects and cross-functional teams to plan and deliver strategic initiatives such as application security, architecture modernization, and test automation.
- Support the transition of prototype models into production by working closely with data scientists and ensuring infrastructure readiness and observability for AI/ML systems.
- Build subject matter expertise in modern platforms and tools, influencing broader technology adoption across teams.
Skills:
- Proficiency in Python and R programming (TensorFlow, PyTorch, scikit-learn)
- Expertise with MLOps tools (MLflow, Azure ML, Databricks) and CI/CD pipelines
- Experience with NLP techniques and transformer-based models (e.g., GPT)
- Knowledge of infrastructure readiness, observability, and continuous improvement for AI/ML systems
- Ability to design and implement scalable MLOps pipelines and workflows
- Strong collaboration, mentorship, and leadership skills
- Familiarity with enterprise initiatives (application security, architecture modernization, test automation)
- Commitment to staying current with trends in MLOps, LLMOps, and AI
Experience:
- Minimum Years of Experience Required to Perform Essential Job Functions
- Additional Experience Qualifier (optional) Perform Essential Job Functions
- Experience with MLOps pipelines, model deployment, and performance monitoring in production environments.
Minimum Qualifications:
- Except where required by licensure or regulation, a combination of comparable education and experience may be used to satisfy qualification requirements.
Education:
- Minimum Education Required to Perform Essential Job Functions: 4-year / bachelor's degree
Specific Degree, if required (e.g., Engineering, Juris Doctorate, etc.):
- Bachelor's degree, preferably in Computer Science, Data Science, Machine Learning, or equivalent experience