Job Description

Inference Systems Engineer

Remote

Infrastructure / Serving Systems

$5,651 - $6,469/month USD

Role Overview

As an Inference Systems Engineer, you will own the serving runtime that powers production LLM inference. This is a deeply technical role focused on system performance and stability: optimizing request lifecycle behavior, streaming correctness, batching/scheduling strategy, cache and memory behavior, and runtime execution efficiency. You will ship changes that improve time-to-first-token (TTFT), tail (p95/p99) latency, throughput, and cost efficiency while preserving correctness and reliability under multi-tenant load.
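To make the metrics above concrete: TTFT and tail latency are typically computed from per-request timing samples. The sketch below is a hypothetical illustration (the data and function names are made up, not part of any Noxx system) showing nearest-rank percentile computation over request latencies.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile of a list of latency samples (ms)."""
    ordered = sorted(samples)
    # Nearest-rank method: take the ceil(p/100 * n)-th sample (1-indexed).
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Per-request latencies in milliseconds (made-up data). For TTFT you would
# record the interval from request arrival to the first streamed token;
# the percentile math is identical.
latencies_ms = [120, 95, 310, 105, 98, 870, 140, 101, 115, 99]

p50 = percentile(latencies_ms, 50)  # median
p95 = percentile(latencies_ms, 95)  # tail latency target
```

Note how a single slow request (870 ms here) barely moves the median but dominates p95/p99, which is why tail latency, not average latency, is the usual serving-system objective.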

You will collaborate closely with platform/infrastructure operations, networking, and API/control-plane teams to ensure the serving system behaves predictably in production and can be debugged quickly when incidents occur. This role is for engineers who can reason about the entire inference pipeline, validate...

Apply for this Position

Ready to join Noxx? Submit your application.
