Job Description

Position Description:

Senior Platform Engineer (Azure, AKS & Kafka)

Role Overview
We are looking for a highly skilled Platform Engineer to design, automate, and scale our cloud infrastructure. In this role, you will be the backbone of our development ecosystem, managing our Azure Kubernetes Service (AKS) environments, optimizing our Kafka streaming clusters, and ensuring our Hasura/PostgreSQL data layers are performant and secure.

Key Responsibilities
1. Infrastructure & Kubernetes Orchestration

  • Cluster Management: Provision and maintain AKS and ARO environments; optimize resource allocation (CPU/Memory) and drive cloud cost-efficiency.
  • Networking: Manage complex networking requirements, including BGP configurations and secure connectivity.
  • IaC & Automation: Use Terraform/Terramate and GitHub Actions to automate the provisioning of Kafka clusters and Azure resources.
  • Secure Connectivity: Expert at configuring Azure Private Link and Private Endpoints to ensure Kafka and Database traffic remains off the public internet.
  • Advanced AKS Networking: Hands-on experience with Azure CNI for pod networking and managing Internal Load Balancers for high-availability traffic routing.
  • Traffic Governance: Mastery of Network Security Groups (NSGs) and Kubernetes Network Policies (Cilium/Azure NPM) to enforce Zero-Trust security.
  • Hybrid Routing: Deep understanding of VNET Peering and BGP for low-latency connectivity between AKS clusters and Kafka brokers.
  • Load Balancing: Expertise in configuring Azure Application Gateway (WAF) for L7 ingress and Azure Internal Load Balancers for L4 Kafka bootstrap connectivity across Availability Zones.
  • DNS & Service Discovery: Proven experience managing Azure Private DNS Zones for internal name resolution and integrating ExternalDNS with AKS to automate record management.
  • Traffic Management: Ability to manage CoreDNS within Kubernetes to optimize service discovery and prevent lookup latency for high-throughput Kafka producers.

2. Kafka Administration & Data Ops

  • Stream Management: Act as the primary administrator for Kafka, managing brokers, partitions, and ACLs.
  • Performance Tuning: Independently debug, optimize, and implement Kafka solutions based on evolving business needs.
  • Data Layer: Manage and scale PostgreSQL databases and the Hasura GraphQL engine to ensure seamless API delivery.

3. CI/CD & Observability

  • Pipeline Engineering: Design and administer robust Jenkins and GitHub Actions pipelines to support continuous integration and deployment.
  • Enterprise Monitoring: Maintain full-stack visibility using Grafana, Prometheus, OpenSearch, Kibana, and Dynatrace.
  • Troubleshooting: Lead "random investigations" and root-cause analysis to enhance system reliability and performance.


Required Skills & Qualifications

  • Experience: 8+ years in Azure, AKS, and DevOps with deep specialization in Kafka Administration.
  • Cloud Mastery: Strong hands-on experience with Azure Cloud Services; familiarity with GKE is a plus.
  • Tooling: Expert knowledge of Terraform, Jenkins, and GitHub Actions.
  • Observability: Proficiency in Grafana/Prometheus and the ELK/OpenSearch stack. Exposure to Dynatrace is highly valued.
  • Data Savvy: Experience with GraphQL (GQL) and managing relational databases (PostgreSQL).
  • Mindset: Ability to work independently in a fast-paced environment and collaborate effectively with developers and stakeholders.


Good-to-Have

  • Certification: Kafka Certified Engineer.
  • Development: Some knowledge of Java and of building modern REST/OpenAPI APIs.

Skills:

  • Apache Kafka
  • DevOps
  • GitHub
  • Kubernetes
  • Terraform