Scaling laws describe how AI systems improve as you spend more compute, data, and time. They’re broad, empirical patterns—not strict formulas—that help you plan capability gains and budget trade-offs. In modern AI, these patterns show up across the whole lifecycle: when models study (pre-training), practice (post-training), and think at answer time (inference).
Big ideas:
More resources → better quality, but with diminishing returns (a toy sketch follows this list).
Data quality and signal quality matter as much as quantity.
You balance three dials—study, practice, thinking—to hit your accuracy, latency, and cost targets.
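To make the diminishing-returns idea concrete, here is a minimal power-law sketch. The constants and the exponent are made-up placeholders for illustration, not fitted values from any real model family; the point is only the shape of the curve, where each extra 10x of compute buys a smaller drop in loss.

```python
# Illustrative only: a toy power-law loss curve showing diminishing returns.
# The constants a, b, and floor are made-up placeholders, not fitted values.

def estimated_loss(compute_flops: float,
                   a: float = 300.0,
                   b: float = 0.10,
                   floor: float = 1.7) -> float:
    """Toy power law: loss falls as compute grows, but never below a floor."""
    return a * compute_flops ** (-b) + floor

previous = None
for flops in (1e20, 1e21, 1e22, 1e23):
    loss = estimated_loss(flops)
    gain = "" if previous is None else f" (gain {previous - loss:.2f})"
    print(f"{flops:.0e} FLOPs -> loss ~ {loss:.2f}{gain}")
    previous = loss
```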
1) Pre-Training (Study Time)
The model learns general skills from vast data.
What you scale: training tokens, model size (parameters), training steps/compute.
What improves: broad competence (language, coding, vision) and downstream sample efficiency.
Trade-offs: bigger isn’t always better if the model is under-trained or fed noisy, duplicated data. Match training tokens to model size so you don’t “starve” the model.
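One rough way to sanity-check that balance is a Chinchilla-style rule of thumb of roughly 20 training tokens per parameter. Treat the ratio as an assumed starting point, not a law; the right value depends on your data, architecture, and compute budget.

```python
# Rough data-to-model balance check, assuming a Chinchilla-style rule of
# thumb of ~20 training tokens per parameter. The ratio is an assumption
# for illustration; the right value depends on your setup.

TOKENS_PER_PARAM = 20  # assumed rule of thumb, not a universal constant

def tokens_needed(n_params: float) -> float:
    """Approximate training tokens to avoid 'starving' a model of this size."""
    return TOKENS_PER_PARAM * n_params

def is_undertrained(n_params: float, tokens_available: float) -> bool:
    """True if the planned data budget falls short of the rule of thumb."""
    return tokens_available < tokens_needed(n_params)

# Example: a 7B-parameter model with 100B tokens available.
print(tokens_needed(7e9))          # ~1.4e11 tokens suggested
print(is_undertrained(7e9, 1e11))  # True -> use a smaller model or more data
```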
2) Post-Training (Practice Time)
You shape the base model for specific tasks and preferences.
What you scale: supervised examples, feedback rounds (RLHF/RLAIF), distillation passes, evaluator/verifier quality.
What improves: task accuracy, alignment, safety, instruction-following.
Trade-offs: risk of reward hacking or overfitting; keep feedback diverse and high-signal, and keep evals honest by holding them out of the training loop.
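A minimal sketch of that “honest eval” habit: score the model both on data the feedback loop touched and on a held-out set it never saw, and watch the gap. The score_fn, datasets, and threshold below are hypothetical placeholders, not a standard API.

```python
# Minimal sketch of an honest-eval check during post-training. score_fn,
# the datasets, and the threshold are hypothetical placeholders.

from typing import Callable, Sequence, Tuple

def eval_gap(model: object,
             score_fn: Callable[[object, Sequence], float],
             feedback_set: Sequence,
             held_out_set: Sequence) -> Tuple[float, float, float]:
    """Score the model on data it practiced on vs. data it never saw."""
    on_feedback = score_fn(model, feedback_set)
    on_held_out = score_fn(model, held_out_set)
    return on_feedback, on_held_out, on_feedback - on_held_out

def looks_like_overfitting(gap: float, threshold: float = 0.05) -> bool:
    """A widening feedback/held-out gap hints at overfitting or reward
    hacking; the threshold is arbitrary and should be tuned per task."""
    return gap > threshold

# Stand-in usage with a fake metric (mean of per-example scores).
fake_score = lambda model, data: sum(data) / len(data)
_, _, gap = eval_gap(None, fake_score, [0.92, 0.95], [0.71, 0.74])
print(round(gap, 2), looks_like_overfitting(gap))  # 0.21 True
```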
3) Inference (Thinking Time)
At answer time, let the model reason and use tools before responding.
What you scale: steps of reasoning (Chain/Tree/Graph-of-Thought), number of samples (self-consistency), tool calls (search, code, calculators).
What improves: reliability on hard problems, grounded answers, fewer mistakes.
Trade-offs: higher latency and cost per query; returns taper, so cap “thinking” to your SLA.
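A sketch of capping “thinking” to a budget, using self-consistency as the example: sample several answers, stop when the time budget runs out or the answers already agree, and return the majority vote. The generate_answer callable and the budget values are placeholders, not a specific library’s API.

```python
# Budget-capped self-consistency (majority vote), as a sketch.
# generate_answer is a placeholder for your model call; budgets are examples.

import time
from collections import Counter
from typing import Callable

def self_consistent_answer(generate_answer: Callable[[str], str],
                           prompt: str,
                           max_samples: int = 8,
                           max_seconds: float = 10.0,
                           early_agreement: int = 3) -> str:
    """Sample up to max_samples answers within max_seconds and return the
    most common one. Stops early once an answer has early_agreement votes."""
    deadline = time.monotonic() + max_seconds
    votes = Counter()
    for _ in range(max_samples):
        if time.monotonic() >= deadline:
            break  # respect the latency budget (your SLA cap)
        votes[generate_answer(prompt)] += 1
        answer, count = votes.most_common(1)[0]
        if count >= early_agreement:
            return answer  # answers already agree; stop spending
    return votes.most_common(1)[0][0] if votes else ""

# Example with a stand-in generator that always answers "42".
print(self_consistent_answer(lambda p: "42", "What is 6 x 7?"))
```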