Job Description
Scope:
Ensure the cloud platform and services are designed to support and align with our overall business strategies and priorities. Activities will include building out service capabilities to match SLA requirements, capacity modeling for scale and cost, and security management.
Monitor, manage and operate our cloud service. Scale our service with the required monitoring and alerting capabilities, and develop incident management, security and compliance activities and processes.
Manage the Service Operations team to operate with a culture of business and customer centricity by maintaining an SLA for each service, including incident response, problem management, and service upgrades.
Develop and drive, as the primary owner, the communication strategy for internal and external stakeholders (including customers) to convey service health, tracking against SLAs, current and historical incidents, and upcoming events or upgrades.
Ensure all technical procedures are documented, reviewed and updated, and actively contribute to the maintenance of operational standards and policies.
Collaborate with the internal R&D team to automate infrastructure services and system administration tasks wherever possible, and implement a monitoring strategy that provides rapid feedback and diagnostics in the event of a service disruption.
Build relationships with other departments to make sure we provide services with high availability and superior performance for all our customers.
Provide technical leadership, coaching and mentoring for the team you build, fostering a culture of accountability, innovation and team building.
Requirements:
At least 5 years of relevant industry experience in maintaining a high-availability production environment
At least 3 years of experience with service operations and extensive knowledge of cloud infrastructure planning and operations, design and deployment, as well as system life cycle management in support of a SaaS infrastructure
At least 3 years of team management experience in a cloud operations or customer support environment
Solid understanding of networking, VPCs, and monitoring and alerting frameworks and tools
Substantial experience in operating a high-availability cloud infrastructure
Experience with cloud platforms like Azure or AWS
Experience with running distributed systems deployed across multiple geographies around the globe
Knowledge of security practices, tooling and automation
Experience with monitoring tools such as DataDog, New Relic, Grafana, Prometheus
Experience with automation tools such as Ansible, Terraform
Advanced knowledge of at least one scripting language such as Python or PowerShell
Experience with CI/CD tools like Jenkins, Octopus or VSTS
Some experience with relational (SQL) database systems
Area: Gush Dan