How Fast Can a $4,699 Desktop Run LLMs?
11 models from 8B to 123B parameters, benchmarked across inference, coding, context scaling, and vision on NVIDIA's GB10 Blackwell architecture.
About This Benchmark
Nova Bench is the first comprehensive benchmark suite built specifically for the NVIDIA DGX Spark — a $4,699 desktop supercomputer powered by the GB10 Blackwell Superchip with 128GB of unified LPDDR5x memory.
All benchmarks were run on a single DGX Spark unit using Ollama 0.18.3 as the inference framework. Models range from 8B to 123B parameters, covering general inference, code generation, context scaling, and vision tasks.
Hardware
The DGX Spark is built on the GB10 Grace Blackwell Superchip: 20 Arm cores (10x Cortex-X925 + 10x Cortex-A725), 128GB of unified memory shared between CPU and GPU via NVLink-C2C, and a 4TB NVMe SSD. The GPU delivers up to 1 petaFLOP of FP4 performance (NVIDIA's peak figure, which assumes sparsity).
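To get a rough sense of which model sizes fit in 128GB of unified memory, a weights-only estimate is a useful first check. The sketch below is back-of-the-envelope arithmetic, not part of the benchmark suite: the bit widths are assumed values for illustration (the quantization actually used per model is not stated here), and the estimate ignores KV cache, activations, and runtime overhead.

```python
def weight_footprint_gb(params_billions: float, bits_per_weight: float) -> float:
    """Weights-only size in GB (10^9 bytes); ignores KV cache and runtime overhead."""
    return params_billions * bits_per_weight / 8


UNIFIED_MEMORY_GB = 128  # unified memory size from the hardware description

# Smallest and largest model sizes in the benchmark, at assumed bit widths.
for params in (8, 123):
    for bits in (4, 8, 16):
        size = weight_footprint_gb(params, bits)
        verdict = "fits" if size < UNIFIED_MEMORY_GB else "does not fit"
        print(f"{params}B @ {bits}-bit: {size:.1f} GB weights ({verdict} in {UNIFIED_MEMORY_GB} GB)")
```

Under these assumptions a 123B model needs about 61.5 GB at 4 bits per weight and 246 GB at 16 bits, so the largest models are only practical on this machine in quantized form.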
Methodology
For general inference, every model received the same prompt: "Explain quantum computing in exactly 200 words". Coding benchmarks used a palindromic substring problem. Context scaling compared a 19-token prompt against a 130-token complex business analysis prompt. All results are reproducible: raw data is available on GitHub and Hugging Face.
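As an illustration of how a single general-inference measurement could be taken, here is a minimal sketch against Ollama's local REST API. It is not the project's own benchmark script. It assumes Ollama is serving on its default port (11434) and relies on the timing fields Ollama returns from `/api/generate` (`eval_count`, `eval_duration`, `prompt_eval_count`, `prompt_eval_duration`, with durations in nanoseconds). The helper names `generate` and `throughput` are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local endpoint
PROMPT = "Explain quantum computing in exactly 200 words"


def generate(model: str, prompt: str = PROMPT, url: str = OLLAMA_URL) -> dict:
    """Send one non-streaming request and return the parsed JSON response."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    req = urllib.request.Request(url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def throughput(response: dict) -> dict:
    """Convert Ollama's token counts and nanosecond timings into tokens per second."""
    def rate(count_key: str, duration_key: str) -> float:
        count = response.get(count_key, 0)
        duration_ns = response.get(duration_key, 0)
        return count / (duration_ns / 1e9) if duration_ns else 0.0

    return {
        "prompt_tokens_per_s": rate("prompt_eval_count", "prompt_eval_duration"),
        "generation_tokens_per_s": rate("eval_count", "eval_duration"),
    }

# Usage (requires a running Ollama server and a pulled model; the tag is a placeholder):
#   print(throughput(generate("<model-tag>")))
```

Reporting prompt and generation throughput separately matters for the context-scaling comparison, since a longer prompt mainly changes the prompt-evaluation phase.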
Reproduce These Results
All benchmark data, scripts, and methodology are open source. Clone the repo and run the scripts on your own hardware.
Dataset available at: huggingface.co/datasets/G3nadh/dgx-spark-benchmarks