Job Description

Requirements

Must have:

- Strong foundational knowledge in system architecture or computer architecture, operating systems, and runtime environments
- Hands-on experience with Serverless architectures and cloud-native optimization technologies such as containers, Kubernetes, service orchestration, and autoscaling
- Familiarity with vLLM, SGLang, Ray Serve, etc.
- Understanding of common optimization concepts such as continuous batching, KV-Cache reuse, parallelism, and compression/quantization/distillation
- Proficient in using Profiling/Tracing tools
- Experienced in analyzing and optimizing system-level bottlenecks regarding GPU utilization, memory/bandwidth, Interconnect Fabric, and network/storage paths
- Proficient in at least one system-level language (e.g., C/C++, Go, Rust)
- Proficient in one scripting language (e.g., Python)

Responsibilities:

- Design a unified AI Infra & Serving architecture platform for composite AI ...
