Job Description
Job Summary
We are seeking a skilled Site Reliability Engineer (SRE) to enhance the reliability, scalability, and performance of our systems and applications. The ideal candidate will have strong experience in automation, cloud platforms, observability, incident management, and DevOps practices. This role involves working closely with cross‑functional teams to ensure high availability, continuous improvement, and efficient service delivery.
Key Responsibilities
- Design, build, and maintain automation for infrastructure provisioning and configuration management.
- Implement and manage monitoring, observability, and alerting systems to ensure service reliability.
- Collaborate with development and operations teams to enhance CI/CD pipelines and deployment automation.
- Lead incident response, root‑cause analysis, and continuous improvement initiatives.
- Manage cloud infrastructure, container orchestration...
Apply for this Position
Ready to join TechDoQuest? Submit your application.