Job Description
**Introduction**
At IBM, work is more than a job - it’s a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you’ve never thought possible. Are you ready to lead in this new era of technology and solve some of the world’s most challenging problems? If so, let’s talk.
**Your role and responsibilities**
We are looking for a highly motivated PhD or MSc student to join our team for a summer internship focused on cost-efficient serving of large-scale AI inference workloads.
The internship will explore advanced routing strategies and KV-cache-aware optimizations in distributed inference systems, with an emphasis on improving performance, scalability, and GPU cost efficiency.
**What you will work on**
* Designing and evaluating routing algorithms to optimize inference latency, throughput, and cost
* Investiga...