
Hi, I'm Gaurav, an Applied Scientist with 8+ years of experience building NLP and LLM systems at scale. Right now I'm working on ad relevance: benchmarking frontier LLMs, distilling calibrated judgments into low-latency production models, and wiring up multi-agent orchestrators that compress multi-day scientific deep-dives into a couple of hours of supervised analysis. Before that I was a founding scientist on a content moderation and AI safety team, where I grew the science org from 1 to 5 and shipped hate speech detection, social engineering detection, and Responsible AI red-teaming workstreams for products used by hundreds of millions of people. I've filed two patents with the USPTO along the way.

These days I spend most of my time thinking about LLM evaluation, human-in-the-loop systems, and agentic patterns, along with the messy practical tradeoffs (latency, cost, hallucinations, context bloat) that decide whether any of it actually ships.

Outside of work I'm a part-time musician, runner, and recovering electrical engineer. Thanks for stopping by. Drop me a note if you want to talk shop: beyond deep learning (pun intended), I'm always up for a deep conversation with practitioners in the field.
Posts
Understanding Attention: A Code-First Journey Through Transformers
The 10% You Should Never Automate
When Should You Build an AI Agent? A Practical Decision Framework
Mistral 7B on consumer hardware
Finding the right words
Paper Review - Embers of Autoregression
Multi-label text classification
Library version mismatches declared not safe
Mining word collocations
Science Talk: Generative LLMs
Projects
Skill Quality Coach
Data Science: Analyzing crime stats in Seattle and San Francisco