Job Description
About the role
At SciSpace (formerly Typeset), we're using language models to automate and streamline research workflows from start to finish. We're already making waves in the industry, with 4.5 million users on board as of January 2024. Our users love us too, with a 40% month-over-month (MoM) retention rate, and 10% of them use our app more than once a week. We're growing by more than 50% every month, thanks to our users spreading the word (see for yourself on Twitter). With almost weekly feature launches since our inception, we're constantly pushing the boundaries of what's possible. Our team of experts in design, front-end, full-stack engineering, and machine learning is already in place, but we're always on the lookout for new talent to help us take things to the next level. Our user base is highly engaged and always eager to provide feedback, making SciSpace one of the most advanced applications of language models out there.
What you’ll do
- Build agentic workflows for research tasks (e.g., literature review assistance, claim checking, summarization with citations, writing assistance) using tool-use, orchestration, and memory patterns. (Experience with LangGraph is a big plus.)
- Design and improve RAG systems end-to-end for scholarly content (papers, citations, metadata, figures/tables where relevant): chunking strategies, embeddings, retrieval/reranking, query rewriting, context selection, grounding, attribution/citations, and fallbacks.
- Develop evaluation strategies and harnesses for GenAI features: offline/online evals, golden sets, rubric-based evaluation, human-in-the-loop workflows, LLM-as-judge (with calibration), and regression testing for reliability.
- Experiment with fine-tuning when needed (SFT, preference tuning), and contribute to RLHF-style approaches or reward-model-driven iteration. (RLHF experience is a big plus.)
- Collaborate closely with product and engineering to translate user pain points into measurable improvements, ship iteratively, and make tradeoffs with clarity.
- Communicate findings clearly via docs, experiment reports, design reviews, and cross-functional discussions.
What we’re looking for (must-have)
- 3+ years of hands-on experience in applied ML/GenAI (industry or research labs).
- Strong ML fundamentals, with deep practical knowledge of RAG pipelines and their key tradeoffs (latency, quality, cost, grounding, recall vs precision, hallucination control).
- Thorough understanding of LLMs and how they work: tokenization, decoding, prompting behavior, context limits, instruction tuning, failure modes, and mitigation strategies.
- Strong academic background (publications, thesis work, or research track record) and familiarity with how researchers work—citations, evidence standards, paper structure, literature review workflows, and scholarly rigor.
- Proven ability to build and iterate quickly: strong PoC mindset with a bias toward measurable outcomes.
- Experience building evals for GenAI products (automatic + human evaluation), including dataset construction, metrics, and continuous evaluation/regression.
- Strong prompting skills (prompt design, structured prompting, tool-use prompting, rubric prompts for evaluation).
- Excellent communication skills—can explain complex model behavior and tradeoffs to technical and non-technical stakeholders.
- Strong product mindset: cares about user workflows, measurable impact, and shipping reliable systems.
Big pluses
- Hands-on experience with LangGraph and agent architectures (planning, tool calling, state machines, multi-agent patterns).
- LLM fine-tuning experience, especially RLHF / preference optimization (or adjacent approaches like reward modeling, DPO-style methods).
- Past publications in top conferences.
- Experience with information retrieval / NLP in research-heavy domains (scientific/biomedical/technical corpora).
- Strong understanding of backend engineering and system design fundamentals.
CTC: 29-44 LPA