Job Description
Site Reliability Engineer, Digital Power
Monitor system health, identify performance bottlenecks, and troubleshoot production issues as they arise. Configure and maintain monitoring tools to track application performance and analyze logs and metrics to identify optimization opportunities.
Implement and manage scalable, reliable systems to meet uptime and performance objectives.
Automate the provisioning, scaling, and deployment of infrastructure using tools such as Kubernetes and Docker.
Serve as a liaison between development and operations teams to ensure applications are built with reliability and scalability in mind.
Conduct root cause analysis after incidents and implement fixes to prevent future occurrences.
Job Requirements
Experience using Linux system commands. Practical experience with containerization technologies (Docker, Kubernetes).
Familiarity with database management systems (SQL) and experience with ...