NVIDIA DGX Spark Still Missing Its Core Feature After Six Months

April 4, 2026

Image: NVIDIA

When NVIDIA announced the DGX Spark, the pitch was specific: Blackwell GPU architecture combined with NVFP4 precision support, in a local AI workstation you could own outright. NVFP4 - NVIDIA's 4-bit floating point format, a method of compressing AI model weights so they fit in less memory and run faster on Blackwell hardware - was central to why the product made sense at its price point.

Six months after launch, NVFP4 still has not shipped. Users who bought the hardware are running models at FP8 precision instead. That is not a minor gap.

What Quantization Actually Does

Quantization is how you fit large AI models onto hardware with limited memory. AI model weights - the billions of numerical parameters that determine how a model responds - are stored in high-precision formats during training, typically FP16 (16 bits per value). Quantization compresses those to 8 bits (FP8) or 4 bits (FP4), trading a small amount of accuracy for dramatically lower memory requirements and faster computation.

A model that needs 80GB of memory at FP16 needs roughly 40GB at FP8 and around 20GB at FP4, because weight memory scales directly with the number of bits per value. On hardware with a fixed memory ceiling, that difference determines which models you can run at all.
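The arithmetic is easy to check. The sketch below is a rough estimate that counts weight storage only. It ignores activations, the KV cache, and any per-block scale factors a real 4-bit format carries. The function name and the 40-billion-parameter model are illustrative assumptions, not figures from NVIDIA.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights alone, in GB (10^9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Hypothetical 40-billion-parameter model (an assumption for illustration).
for name, bits in [("FP16", 16), ("FP8", 8), ("FP4", 4)]:
    print(f"{name}: {weight_memory_gb(40, bits):.0f} GB")
# FP16: 80 GB
# FP8: 40 GB
# FP4: 20 GB
```

Each halving of precision halves the footprint, so the step from FP8 to FP4 is what would free up the second half of the memory budget.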

NVFP4 is NVIDIA's hardware-optimized 4-bit format, built specifically to take advantage of Blackwell's matrix computation units. It was supposed to be the efficiency layer that made the DGX Spark's bandwidth limitations - a consequence of the hardware's physical design - acceptable: fewer bits per weight means less data to move from memory for each generated token. Without it, nothing offsets those limitations.
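Why a smaller format offsets limited bandwidth can be sketched with a common back-of-the-envelope bound. It assumes a dense model generating one token at a time, with every weight read from memory once per token, so the throughput ceiling is bandwidth divided by weight size. The 200 GB/s figure and the 40-billion-parameter model are placeholders for illustration, not DGX Spark specifications, and real throughput will be lower than this ceiling.

```python
def max_decode_tokens_per_s(bandwidth_gb_s: float,
                            params_billions: float,
                            bits_per_weight: int) -> float:
    """Upper bound on tokens/s when decoding is limited by memory bandwidth.

    Assumes all weights are read once per generated token (dense model,
    batch size 1) and ignores KV-cache traffic and compute time.
    """
    weight_gb = params_billions * bits_per_weight / 8
    return bandwidth_gb_s / weight_gb


BANDWIDTH_GB_S = 200.0  # hypothetical placeholder, not a measured value
for name, bits in [("FP8", 8), ("FP4", 4)]:
    ceiling = max_decode_tokens_per_s(BANDWIDTH_GB_S, 40, bits)
    print(f"{name}: at most {ceiling:.0f} tokens/s")
# FP8: at most 5 tokens/s
# FP4: at most 10 tokens/s
```

Under these assumptions the bandwidth stays the same, but halving the bits per weight doubles the ceiling.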

What Buyers Are Actually Getting

The DGX Spark targets developers who want to run large language models locally: no API costs, no data leaving the machine, professional NVIDIA tooling around the whole setup. The advertised value was Blackwell hardware operating at full efficiency.

What buyers are actually getting is Blackwell hardware at FP8 efficiency, where every weight takes roughly twice the bits it would under NVFP4. Fewer models fit in memory. Throughput is lower. The bandwidth constraints that NVFP4 was supposed to offset are still present, uncompensated.

NVIDIA has not provided a timeline for when NVFP4 support will arrive. If you are considering a DGX Spark purchase, wait. The hardware is not broken - it just is not delivering what made it worth buying. Once NVFP4 ships and independent benchmarks follow, the purchase decision becomes much clearer. Right now you would be paying a premium for a promise that is already six months overdue.
