Built EdgeRunner in pure Swift and Metal from scratch over a weekend. On Qwen3-0.6B Q8_0, it hits 212 tok/s on M3 Max, beating llama.cpp by 16%. Here's what worked.
Mar 31, 2026 · deep dive · 5 min read