Writing

Notes on LLM scaling, distributed training, and building research environments and teams.

June 3, 2026
Research Problems in Pretraining

A practitioner's map of the pretraining landscape: scaling laws, principled parameterization, optimizer frontiers, and the open mysteries that keep the field honest.

April 15, 2026
Your Org Has the Same Scaling Problem as a Badly Tuned Training Run

AI raised individual throughput, but coordination overhead stayed fixed. For many product-engineering orgs, the bottleneck flipped from compute to communication.

May 20, 2025
On Assessing the Value of a Project

A simple framework for picking the most impactful work, with practical tips for estimating chance of success, effect size, and applicable scenarios.

December 2, 2024
Determining Model Size and Training Horizon through Scaling Laws

A framework for optimizing model size and training token count from scaling-law coefficients, accounting for inference cost, data repetition, and system efficiency.

July 24, 2018
GluonNLP — Deep Learning Toolkit for Natural Language Processing

Introducing GluonNLP, an open-source toolkit that tackles the reproducibility crisis in NLP research with stable APIs, reusable components, and centralized resources.
