All articles

// Popular Articles

#gb10

#442 2025-10-04

200 tok/s, 49W: Qwen3.6-27B-FP8 Runs Flagship Coding on a Single DGX Spark

A day after Alibaba shipped Qwen3.6-27B, engineer Mitko Vasilev posted a number that should make every indie AI builder look twice: 200 tokens/sec peak, 136 tok/s average, 256k context, 10 concurrent agents — on one NVIDIA GB10 drawing just 49 watts. Here is what the stack is doing and why the tok/s-per-watt curve just bent.

qwen3-6 dgx-spark gb10

6 min read