// Popular Articles
Unsloth sweeps 22/22: Gemma 4 26B-A4B GGUFs are now SOTA
An independent benchmark ranked 80 GGUF quantizations of Google's new Gemma 4 26B-A4B across 6 uploaders. Unsloth's Dynamic 2.0 GGUFs placed #1 in every single one of the 22 tested quant sizes on mean KL divergence — the cleanest sweep we've seen in open-model quantization.
FlashDrive: a reasoning VLA for self-driving cars that runs in real time, 716ms down to 159ms with zero accuracy loss
Z Lab has just announced FlashDrive, a co-design framework that cuts Vision-Language-Action model latency from 716ms to 159ms on an RTX PRO 6000 (up to 5.7× on an RTX 4090) while preserving accuracy. It combines four techniques: streaming inference, DFlash speculative reasoning, adaptive-step flow matching, and ParoQuant W4A8.