Job Description

At Oracle Cloud Infrastructure (OCI), we build the future of the cloud for enterprises as a diverse team of fellow creators and inventors. We act with the speed and attitude of a start-up, with the scale and customer focus of the leading enterprise software company in the world.

Values are OCI’s foundation and how we deliver excellence. We strive for equity, inclusion, and respect for all. We are committed to the greater good in our products and our actions. We are constantly learning and taking opportunities to grow our careers and ourselves. We challenge each other to stretch beyond our past to build our future.

You are the builder here. You will be part of a team of really smart, motivated, and diverse people and given the autonomy and support to do your best work. It is a dynamic and flexible workplace where you’ll belong and be encouraged. 

OCI Network Availability is looking for a senior manager to lead a Networking Reliability Engineering team responsible for driving operational excellence within the OCI physical network, supporting AI/ML and GPU workloads in a broadly distributed, multi-tenant cloud environment.

Due to the nature of the space, this role is a “full-service” opportunity: duties range from working day to day with network engineers to build and sustain services and tooling, to working with our partner teams to make it easy for them to manage the OCI network.

Career Level - M3

In this role you will:

  • Attract, develop & manage a team of highly skilled network engineers
  • Define and develop roadmaps to deliver operational efficiencies 
  • Establish & report on a body of metrics that define service availability 
  • Drive strategic technology initiatives to deliver and operate HPC and AI/ML capabilities for our customers
  • Solve difficult problems in distributed systems, infrastructure, and highly available services 
  • Improve and implement OCI network monitoring and automation
  • Collaborate with other Engineering organizations to deliver a highly available service to our customers
  • Participate in the on-call rotation for managers

The right leader for this role will make all the difference for our organization, our product, and our customers. Are you able to provide direction and structure for your teams? Do you enjoy mentoring engineers? Are you able to take feedback and learn from engineers and leaders across a large organization? Do you thrive in a fast-paced environment, and want to be an integral part of a truly great team? Come join us!

Preferred Qualifications:

  • 5+ years of experience in large-scale physical network reliability
  • 3+ years of experience in an engineering and operations management role
  • Strong technical knowledge in cloud networking, high-performance computing, and GPU systems
  • Experience in a technical leadership and management role
  • Experience driving hiring, onboarding new engineers, and ongoing performance management
  • Excellent organizational, verbal, and written communication skills
  • Excellent judgment to influence product roadmap direction, features, and priorities
