Job Description

As a Senior Site Reliability Engineer (SRE), you will be responsible for the reliability, scalability, and observability of our DevOps ecosystem. This includes CI/CD systems, Kubernetes clusters, infrastructure automation, and telemetry platforms. You will work closely with development, QA, and operations teams to build resilient systems and continuously improve reliability standards.
Key Responsibilities:
- Own and manage DevOps components and tooling across 100+ production environments.
- Administer, scale, and optimize Kubernetes clusters used for application and infrastructure workloads.
- Implement and maintain observability stacks, including Prometheus, OpenTelemetry (OTel), Elasticsearch, and ClickHouse, for metrics, tracing, and log analytics.
- Ensure high availability of CI/CD pipelines and automate infrastructure provisioning using Terraform and Ansible.
- Build alerting, monitoring, and dashboarding systems to proactively d...

Apply for this Position

Ready to join Qualys? Click the button below to submit your application.