Job Description
Responsibilities
Design, develop and deploy a highly available, reliable and scalable cloud infrastructure platform and services. The job scope involves implementing effective infra solutions for cloud networks, storage, virtual machines, infra security and disaster recovery.
Develop and maintain infrastructure using infra-as-code tools like Terraform or Ansible to ensure repeatable, automated and version-controlled deployments.
Build in-system telemetry to analyse and optimise system performance and reliability.
Implement security measures to ensure infrastructure meets organisational security standards and compliance requirements.
Troubleshoot cloud infrastructure incidents to identify root causes and implement resolutions.
Define, implement and track SRE metrics, including SLOs, SLIs and error budgets, to improve cloud systems reliability.
Collaborate with developers and infra stakeholders to understand their needs.
Perform capacity planning to ensure that the infrastructure is scalable for future demands.
Requirements (Minimum Qualifications)
Background in Computer Science, Computer or Electrical Engineering, Information Technology or a STEM qualification with relevant experience.
Knowledge of IT infrastructure (i.e. networks, systems, storage) and infra operations.
Proficient in infra-as-code and scripting tools such as Ansible, Terraform, Linux shell scripts and PowerShell.
Understanding of public cloud services like AWS, Azure or GCP.
System operations and administration experience on enterprise systems.
At least 2 years of relevant working experience, including scripting and programming.
Nice-to-haves
Cloud certifications, such as AWS or GCP certifications.
Familiarity with metrics/logging systems such as Elastic Stack and Prometheus/Grafana.