Job Description
You will be part of a dynamic team responsible for exploring, designing, managing and optimising our on-premises cloud infrastructure platforms and services. You will work with a team of cloud infrastructure engineers who implement cloud networking, storage, virtual machine and infrastructure security solutions. You must have a good understanding of cloud infrastructure technologies, architecture and site reliability engineering (SRE).
Responsibilities
- Design, develop and deploy highly available, reliable and scalable cloud infrastructure platforms and services. The scope includes implementing effective infrastructure solutions for cloud networks, storage, virtual machines, infrastructure security and disaster recovery.
- Develop and maintain infrastructure using infrastructure-as-code tools such as Terraform or Ansible to ensure repeatable, automated and version-controlled deployments.
- Build telemetry into systems to analyse and optimise their performance and reliability.
- Implement security measures to ensure the infrastructure meets the organisation's security standards and compliance requirements.
- Troubleshoot cloud infrastructure incidents to identify root causes and implement resolutions.
- Define, implement and track SRE metrics, including SLOs, SLIs and error budgets, to improve cloud system reliability.
- Collaborate with developers and infra stakeholders to understand their needs.
- Perform capacity planning to ensure that the infrastructure is scalable for future demands.
Requirements (Minimum Qualifications)
- Background in Computer Science, Computer or Electrical Engineering, Information Technology or another STEM qualification with relevant experience.
- Knowledge of IT infrastructure (i.e. networks, systems, storage) and infrastructure operations.
- Proficiency in infrastructure-as-code and scripting tools such as Ansible, Terraform, Linux shell scripts and PowerShell.
- Understanding of public cloud services like AWS, Azure or GCP.
- System operations and administration experience on enterprise systems.
- At least 2 years of relevant working experience, including scripting and programming.
Nice-to-haves
- Cloud certifications, such as those from AWS or GCP.
- Familiarity with metrics and logging systems such as Elastic Stack and Prometheus/Grafana.
Apply for this Position
Ready to join? Click the button below to submit your application.
Submit Application