Gaurav Gada

Applied Scientist, (Part-Time) Musician, Electrical Engineer, Runner

The Amazon Spheres, Seattle, WA

Hi, I'm Gaurav — an Applied Scientist with 8+ years building NLP and LLM systems at scale. Right now I'm working on ad relevance: benchmarking frontier LLMs, distilling calibrated judgments into low-latency production models, and wiring up multi-agent orchestrators that compress multi-day scientific deep-dives into a couple of hours of supervised analysis.

Before that I was a founding scientist on a content moderation and AI safety team, where I grew the science org from 1 to 5 and shipped hate speech detection, social engineering detection, and Responsible AI red-teaming workstreams for products used by hundreds of millions of people. I've filed two patents with the USPTO along the way.

These days I spend most of my time thinking about LLM evaluation, human-in-the-loop systems, and agentic patterns — and the messy practical tradeoffs (latency, cost, hallucinations, context bloat) that decide whether any of it actually ships.

Outside of work I'm a part-time musician, runner, and recovering electrical engineer. Thanks for stopping by — drop a note if you want to talk shop. Apart from deep learning (pun intended), I'm always up for a deep conversation with practitioners in the field.

Posts

Understanding Attention: A Code-First Journey Through Transformers

Build attention mechanisms from scratch in PyTorch. We'll start with raw tensors and work up progressively to multi-head attention, explaining every reshape, transpose, and dimension along the way.

The 10% You Should Never Automate

Everyone's asking what AI can do. The better question is what you shouldn't let it do. Frameworks for deciding what to automate and what to protect.

When Should You Build an AI Agent? A Practical Decision Framework

Practical framework to determine when AI agents make sense for your use case. Learn when to build agents and when simpler approaches like prompt engineering or RAG work better.

Mistral 7B on consumer hardware

Run Mistral 7B locally on a Mac with Ollama for fast seed data generation. Learn CLI setup, prompt formatting, and downstream parsing to generate thousands of samples on consumer hardware.

Finding the right words

Understand how LLMs choose words during generation. Learn temperature, top-k, and top-p sampling strategies to balance coherence, diversity, and task-appropriateness in generated text.

Paper Review - Embers of Autoregression

Critical review of LLM limitations in low-probability situations. Explores why AI practitioners should understand autoregressive training pressures before deploying LLMs for tasks requiring precise reasoning or uncommon patterns.

Multi-label text classification

Learn to build a multi-label text classifier using DistilBERT with imbalanced classes. Covers binary cross-entropy loss, multi-hot encoding, and practical implementation strategies for handling multiple labels.

Library version mismatches declared not safe

Critical lessons on matching Python package versions between model development and inference. Learn about safetensors format advantages and why version mismatches cause production failures.

Mining word collocations

Extract common bigrams and trigrams from text using Gensim and NPMI scoring. Learn to mine jargon, phrases, and collocations from customer reviews, feedback, and text corpora.

Science Talk: Generative LLMs

Comprehensive introduction to generative LLMs covering basics, training processes, and real-world applications. Slides from a talk delivered to 70+ attendees.

Subscribe

All the latest posts directly in your inbox.