Job Description
Overview:
Cvent is a leading meetings, events and hospitality technology provider with more than 4,800 employees and nearly 22,000 customers worldwide. Founded in 1999, the company delivers a comprehensive event marketing and management platform for event professionals and offers software solutions to hotels, special event venues and destinations to help them grow their group/MICE and corporate travel business.
The DNA of Cvent is our people, and our culture has an emphasis on fostering intrapreneurship, a system that encourages Cventers to think and act like individual entrepreneurs and empowers them to take action, embrace risk, and make decisions as if they had founded the company themselves. We foster an environment that promotes agility, which means we don’t have the luxury to wait for perfection. At Cvent, we value the diverse perspectives that each individual brings. Whether working with a team of colleagues or with clients, we ensure that we foster a culture that celebrates differences and builds on shared connections.
About the role:
Cvent is looking for a Manager, Site Reliability Engineering to help us scale our systems and ensure the stability, reliability, performance, and rapid deployment of our platform. We build teams that are inclusive, collaborative, and have a strong sense of ownership for the things they build. If you have a passion and track record for solving problems, along with strong leadership skills, this is a great fit for you.
As Manager, SRE, you will demonstrate both emerging and current technologies, methods, and processes, contributing to the evolution of software deployment processes, enhancing security, reducing risk, and improving the overall end-user experience. As part of the Technology R&D Team, you will play an integral part in advancing DevOps maturity and be a part of a new culture of quality and site reliability. You will continually improve our CI/CD tools, processes, and procedures. You will also be responsible for regular reporting to Senior Technology Leaders and providing updates on organizational risk exposure and risk-related issues.
In This Role, You Will:
Set the direction and strategy for your team, and help shape the overall SRE program for the company
Support growth by ensuring a robust, scalable, cloud-first infrastructure
Own site stability, performance and capacity planning
Participate early in the SDLC to ensure reliability is built in from the beginning, and create plans for successful implementations/launches
Foster a learning and ownership culture within the team and the larger Cvent organization
Ensure best engineering practices through automation, infrastructure as code, robust system monitoring, alerting, auto scaling, self-healing, etc.
Manage complex technical projects and a team of SREs
Recruit and develop staff; build a culture of excellence in site reliability and automation
Lead by example – roll up your sleeves by debugging and coding; participate in on-call rotation & occasional travel
Represent the technology perspective and priorities to leadership and other stakeholders by continuously communicating timeline, scope, risks, and technical road map
Here's What You Need:
10+ years of hands-on technical leadership and people management experience
3+ years of demonstrable experience leading site reliability and performance in large-scale, high-traffic environments
Strong leadership, communication and interpersonal skills geared to getting things done
A commitment to developing yourself and the talent within your charge, fostering and creating opportunity for the team
Architect-level understanding of one or more of the major public cloud services (AWS, GCP or Azure), using them to effectively design secure and scalable services
Strong understanding of SRE concepts and the DevOps culture, with a focus on leveraging software engineering tools, methodologies and concepts
In-depth understanding of automation and CI/CD processes, along with excellent reasoning and problem-solving skills
Experience with Unix/Linux environments, with a deep grasp of system internals
Experience working on large-scale distributed systems, including multi-tiered architecture
Strong knowledge of modern platforms like Fargate, Docker, Kubernetes, etc.
Experience working with monitoring tools (Datadog, New Relic, ELK stack, etc.) and database technologies (SQL Server, Postgres and Couchbase preferred)
Validated breadth of understanding and development of solutions based on multiple technologies, including networking, cloud, database, and scripting languages
Experience in prompt engineering, building AI agents, or MCP is a plus