This Week's Learning Nuggets (W25-2025)
As always, these notes capture my current understanding. Fully expect future‑me to refactor or flat‑out contradict some of this later.
1. LLM Quantization Moves the Pareto Frontier
Reading: ParetoQ scaling laws in extremely low‑bit LLM quantisation
Quick gist: ParetoQ shows that 2‑ and even 1.58‑bit weight‑only models can match 4‑bit accuracy if you fine‑tune for ~10 % of the pre‑training budget. That nudges the size-accuracy Pareto curve up and left.
Edge‑deployments are one of the few ways we’ll run private LLMs without melting graphics‑processing units (GPUs). Cheaper inference ≠ free robustness, though. Math/code reasoning still dips unless you sprinkle back a few higher‑precision layers or run a tiny Low‑Rank Adaptation (LoRA).
Mental‑note: Weight‑only is the low‑hanging fruit. Activation quantization is the next cliff.
2 · Safety Cases Are the New Driver’s License
Reading: Waymo — Safe to Deploy
Waymo published a 12‑point “absence of unreasonable risk” checklist that every new software build must pass before a driver‑out rollout.
Design‑time proof > rear‑view metrics. One buggy binary ships to a fleet, not a single driver; correlated risk demands aviation‑style process.
Pragmatic and moat‑building: only deep‑pocket teams can fund the Monte‑Carlo + closed‑course coverage they require. Smaller autonomous‑vehicle (AV) startups need shared scenario libraries or they’re frozen out.
Sticky insight: a human crash is tragic but local; a software crash is global at machine speed.
3 · World‑Models ≥ Pixels
Reading: Meta — Our new model helps AI think before it acts
V‑JEPA 2 masks chunks of video and predicts their latent embeddings, not RGB pixels. This dodges the usual blurry‑future plague and gives you a cheap “physics sandbox.”
Pre‑trained on a million hours of internet video + 6 h of robot fine‑tune → a six‑degrees‑of‑freedom (6‑DoF) arm can pick & stack unseen objects out‑of‑the‑box.
Potential crossover: dash‑cam pre‑train ➟ fine‑tune on fleet logs. Metric grounding and long‑horizon drift still unsolved, but I smell a playbook for smaller robotics teams.
4 · When Control Plane ≠ Blast Shield
Reading: Google Cloud outage incident OW5I3… + ByteByteGo breakdown
One malformed quota‑policy row hit an un‑handled null pointer in Service Control—Google’s inline API auth/quota gate in front of every API.
Instant Spanner replication → global crash‑loop in < 2 min. The real downtime (3 h) came from a self‑inflicted thundering‑herd of restarts starving Spanner.
Take‑aways:
Config deserves the same staged rollout as code.
Critical inline services need a fail‑open brown‑out mode for read‑only requests.
Randomised back‑off is not “nice to have”; it’s a distributed denial‑of‑service (DDoS) antidote.
End of notes; happy to hear what resonated / what I botched!