Job Description

We are at a critical inflection point. Our low-code platform is preparing for an immediate scale-up to 3,000,000 concurrent users. We currently operate on a GKE-based architecture with 78 microservices and a MongoDB Atlas backend. We need a Lead Site Reliability Engineer who can transform our current synchronous system into a high-concurrency, asynchronous engine capable of surviving massive traffic spikes without database or compute failure.

Responsibilities

● Decoupled Architecture: Move synchronous API flows onto Google Cloud Pub/Sub so that the queue acts as a shock absorber in front of our MongoDB Atlas M60+ cluster.

● Database Guardrails: Implement and own the "speed limit" for our database. You will configure subscriber-side flow control in Node.js and the Kubernetes Horizontal Pod Autoscaler (HPA) so that we never exceed 10,000 IOPS or 32k connections.

● Resource Isolation: Isolate heavy Puppeteer/Chrome workloads from core platform services using Cloud Run or a dedicated Spot VM node pool...
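
To give candidates a feel for the Database Guardrails work, here is a rough sizing sketch in TypeScript. It turns the 10,000 IOPS and 32k connection ceilings into two numbers: an upper bound on subscriber pods (a candidate for the HPA `maxReplicas`) and a per-pod in-flight message cap (a candidate for the Pub/Sub subscriber's `flowControl.maxMessages`). The helper name `planLimits`, the 80% headroom, the pool size, and the per-message cost figures are all illustrative assumptions, not values from our platform, and the sketch treats one database operation as roughly one I/O, which real workloads will not match exactly.

```typescript
// Hypothetical sizing helper. All names and sample inputs are assumptions
// made for illustration; only the two ceilings come from the posting.

interface DbBudget {
  maxConnections: number; // cluster-wide connection ceiling ("32k")
  maxIops: number;        // cluster-wide IOPS ceiling (10,000)
}

interface WorkerProfile {
  poolSizePerPod: number;           // assumed MongoDB driver maxPoolSize per pod
  opsPerMessage: number;            // assumed DB operations per Pub/Sub message
  messagesPerSecondPerSlot: number; // assumed throughput of one in-flight slot
}

interface Limits {
  maxReplicas: number;       // candidate HPA maxReplicas
  maxMessagesPerPod: number; // candidate flowControl.maxMessages per subscriber
}

function planLimits(
  budget: DbBudget,
  profile: WorkerProfile,
  headroomPercent: number = 80, // keep a safety margin below the hard ceilings
): Limits {
  // Integer arithmetic avoids floating-point surprises in the budgets.
  const usableConnections = Math.floor((budget.maxConnections * headroomPercent) / 100);
  const usableIops = Math.floor((budget.maxIops * headroomPercent) / 100);

  // Connection ceiling bounds how many pods may hold a full pool at once.
  const maxReplicas = Math.floor(usableConnections / profile.poolSizePerPod);

  // IOPS ceiling bounds fleet-wide message throughput, and from that the
  // number of concurrent in-flight slots across all pods.
  const fleetMessagesPerSecond = Math.floor(usableIops / profile.opsPerMessage);
  const fleetSlots = Math.floor(fleetMessagesPerSecond / profile.messagesPerSecondPerSlot);

  // Split the slots evenly, assuming the fleet is scaled out to maxReplicas.
  const maxMessagesPerPod = Math.max(1, Math.floor(fleetSlots / Math.max(1, maxReplicas)));

  return { maxReplicas, maxMessagesPerPod };
}

// Sample inputs (assumed): 100 connections per pod, 2 DB ops per message,
// and each in-flight slot completing about 5 messages per second.
const limits = planLimits(
  { maxConnections: 32_000, maxIops: 10_000 },
  { poolSizePerPod: 100, opsPerMessage: 2, messagesPerSecondPerSlot: 5 },
);
console.log(limits);
```

With these assumed inputs the connection budget, not IOPS, decides the pod ceiling, while IOPS decides how many messages each pod may process at once. In practice the other services sharing the cluster would also need a slice of both budgets.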

Apply for this Position

Ready to join Odixcity Consulting? Submit your application.