Benchmarks
First Apple M4 Max NPU benchmarks: TFLOPS-per-watt analysis
First inference benchmarks on the M4 Max's 38 TOPS NPU: ViT-L/16 throughput, INT8 quantization impact, and TFLOPS per watt vs the RTX 4090.
By Lukas Berg ·
Established 2023 · Updated continuously
ML Systems Review is an independent engineering publication covering production machine learning systems — architecture case studies, benchmarks, and long-form investigations into how AI products actually work. No sponsorship, no affiliate links, no marketing copy.
ML Ecosystem
Kernel generator, KV cache rearrangement, Metal/CUDA backend unification — a 2.1x throughput delta on 70B quantized models.
By Priya Ramachandran ·
Model Architecture
Reading notes on the DeepSeek-V3.5 release: MoE routing updates, efficiency gains, and which contributions hold up versus rebadged 3.1.
By Dr. Marcus Brennan ·
ML Ecosystem
Transformers 5.0, Spaces v2, revised Inference Endpoints pricing, Diffusers consolidation.
By Priya Ramachandran ·
Case Study
By Dr. Nadia Volkov ·
Distributed Systems
How Figma's multiplayer engine keeps hundreds of concurrent editors in sync with CRDTs, plus the operational-transform alternative they rejected.
By Priya Ramachandran ·
Distributed Systems
BEAM scaling walls, NIF interop, and the selective Rust migration pattern Discord used for their hot-path services.
By Priya Ramachandran ·
Architecture case studies, reproducible benchmarks, MLOps and reliability, and post-mortems of production ML failures. Topics where the engineering matters more than the model.
Five engineers and researchers with graduate degrees from Stanford, CMU, Berkeley, and Oxford, and more than a decade of combined production ML experience across startups, mid-sized tech companies, and consulting. Every article is reviewed for technical accuracy before publication.
MLSR is founder-funded. We take no sponsorships, affiliate commissions, or paid placements. See our editorial standards for the full policy.