Program Manager

Yantran LLC

📍 Austin, Texas, United States

Full-time | other-general | Posted February 06, 2026


Job Description

We are primarily looking for an experienced ML Engineer who is ready to take on this role. The candidate should have a strong background in ML and be able to handle the tasks and responsibilities that come with the position.
ML Infrastructure Performance Engineer
Focus:
This role focuses on the "serving plane." The engineer will integrate high-speed inference runtimes with streaming loaders and take ownership of the performance benchmarking mandate.
Key Responsibilities:
Integrate SGLang with the Run:ai Model Streamer to enable concurrent tensor streaming directly to GPU memory, reducing model "cold start" times.
Optimize SGLang's backend runtime, leveraging features like RadixAttention for prefix caching and compressed finite-state machines for faster decoding.
Design and execute rigorous performance benchmarking suites to identify bottlenecks in the inference stack and provide code-level "fixes" to improve t...
