Job Description
Requirements
Must have:
- Strong foundational knowledge in system architecture or computer architecture, operating systems, and runtime environments
- Hands-on experience with Serverless architectures and cloud-native optimization technologies such as containers, Kubernetes, service orchestration, and autoscaling
- Familiarity with vLLM, SGLang, Ray Serve, etc.
- Understanding of common optimization concepts such as continuous batching, KV-Cache reuse, parallelism, and compression/quantization/distillation
- Proficient in using Profiling/Tracing tools
- Experienced in analyzing and optimizing system-level bottlenecks regarding GPU utilization, memory/bandwidth, Interconnect Fabric, and network/storage paths
- Proficient in at least one system-level language (e.g., C/C++, Go, Rust)
- Proficient in one scripting language (e.g., Python)
Responsibilities:
- Design a unified AI Infra & Serving architecture platform for composite AI ...
Apply for this Position
Ready to join Microtech Global Ltd? Click the button below to submit your application.