NVIDIA, SLMs, and why small might just be the future of AI (again)
NVIDIA is betting big on Small Language Models (SLMs), and at Pieces, we've been building for this future all along. Learn how nano models, local inference, and smarter AI architecture are reshaping the landscape.
NVIDIA’s recent paper, “Small Language Models are the Future of Agentic AI,” dropped in June 2025 and immediately sparked conversations across X.

But if you’ve been following the space, or us, you’ll know we highlighted this trend in February 2025:
👉 Why companies are turning to small language models?
Now that the trend is catching on, let’s break it down:
NVIDIA argues that Small Language Models (SLMs, <10B parameters) can rival or outperform larger models on narrow, repetitive tasks like intent classification, tool use, and structured response generation. They’re more efficient, faster to deploy, and significantly cheaper.
Instead of relying solely on monolithic LLMs (like GPT-4), NVIDIA suggests a heterogeneous architecture: let SLMs do the heavy lifting for routine subtasks and use LLMs selectively for broader context or deep reasoning.
Sound familiar?
That’s exactly the future we’ve been building toward at Pieces, with an even more radical angle: Nano Language Models (≤100M parameters) that run locally, on-device.
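To make the heterogeneous idea concrete, here is a minimal routing sketch. It is only an illustration under stated assumptions, not NVIDIA's or Pieces' implementation: the `Task` type, the task labels, and the `fake_slm`/`fake_llm` stand-ins are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable

# Task kinds the article lists as good fits for small models.
# The string labels themselves are hypothetical.
ROUTINE_TASKS = {"intent_classification", "tool_use", "structured_response"}


@dataclass
class Task:
    kind: str    # hypothetical label, e.g. "tool_use" or "deep_reasoning"
    prompt: str


def route(task: Task, slm: Callable[[str], str], llm: Callable[[str], str]) -> str:
    """Send routine subtasks to the small model and everything else to the large one."""
    handler = slm if task.kind in ROUTINE_TASKS else llm
    return handler(task.prompt)


# Stand-in model callables; a real system would call local or hosted inference here.
def fake_slm(prompt: str) -> str:
    return f"[slm] {prompt}"


def fake_llm(prompt: str) -> str:
    return f"[llm] {prompt}"


if __name__ == "__main__":
    print(route(Task("tool_use", "look up the weather"), fake_slm, fake_llm))
    print(route(Task("deep_reasoning", "plan a migration"), fake_slm, fake_llm))
```

A static lookup like this is the simplest possible policy. In practice the routing decision could also depend on confidence scores or context length, which this sketch does not model.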
Nano > LLM? Let’s talk costs
| Model Tier | Energy per Token | Source |
| --- | --- | --- |
| Nano (≤100M) | 0.001 J (1 mJ) | GreenAI, Intel CPU tests |
| Cloud LLM (70B) | 3–4 J | MIT SuperCloud benchmarks |
| Frontier LLM (~1T) | 8–12 J | LLM Tracker survey |
| SLMs (1–34B) | 0.04–2.0 J | Interpolated from MIT & Pieces labs |
For a real-world comparison (per developer, per year)
| Category | Cloud GPT‑4o | Pieces Nano Models (Local) | Savings |
| --- | --- | --- | --- |
| Energy Usage | 3.65 kWh | 0.001 kWh | ↓ ~3.65 kWh |
| CO₂ Emissions | 1.46 kg | 0.0004 kg | ↓ ~1.46 kg CO₂ |
| Electricity Cost | $0.47 | $0.00013 | ↓ ~$0.47 |
| API Cost | $91.25 | $0 | ↓ $91.25 |
| Total Annual Cost | $91.72 | ~$0.00013 | ↓ ~$91.72 |
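The arithmetic behind this comparison can be sketched in a few lines. The article does not state its inputs, so every constant below is an assumption back-calculated from the table: roughly 10,000 tokens per day for one developer, 3.6 J per token for the cloud model, 1 mJ per token for the nano model, 0.4 kg CO₂ per kWh, $0.13 per kWh, and $25 per million API tokens.

```python
JOULES_PER_KWH = 3.6e6  # physical constant: 1 kWh = 3.6 MJ

# Assumptions back-calculated from the table, not stated in the article.
TOKENS_PER_DAY = 10_000
DAYS_PER_YEAR = 365
CO2_KG_PER_KWH = 0.4
USD_PER_KWH = 0.13
API_USD_PER_MILLION_TOKENS = 25.0


def annual_footprint(joules_per_token: float, api_usd_per_million: float = 0.0) -> dict:
    """Estimate one developer's yearly energy, emissions, and cost for a model tier."""
    tokens = TOKENS_PER_DAY * DAYS_PER_YEAR
    kwh = tokens * joules_per_token / JOULES_PER_KWH
    electricity_usd = kwh * USD_PER_KWH
    api_usd = tokens / 1e6 * api_usd_per_million
    return {
        "tokens": tokens,
        "kwh": kwh,
        "co2_kg": kwh * CO2_KG_PER_KWH,
        "electricity_usd": electricity_usd,
        "api_usd": api_usd,
        "total_usd": electricity_usd + api_usd,
    }


cloud = annual_footprint(3.6, API_USD_PER_MILLION_TOKENS)  # assumed 3.6 J/token
nano = annual_footprint(0.001)                             # assumed 1 mJ/token, no API fee

if __name__ == "__main__":
    for name, row in (("cloud", cloud), ("nano", nano)):
        print(name, {key: round(value, 5) for key, value in row.items()})
```

Under these assumed inputs the sketch reproduces the table's figures to rounding, which suggests the API fee, not electricity, accounts for almost all of the cost difference.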
For teams? Multiply that
| Team Size | Tokens/year | API $ Saved |
| --- | --- | --- |
| 50 Devs | 0.18 B | $4,560 |
| 500 Devs | 1.8 B | $45,600 |
| 5,000 Devs | 18 B | $456,000 |
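The team figures scale linearly with headcount. A rough sketch, assuming about 3.65 million tokens per developer per year and $25 per million API tokens (both inferred from the tables rather than stated in the article):

```python
# Assumptions inferred from the tables, not stated in the article.
TOKENS_PER_DEV_PER_YEAR = 3_650_000   # about 10,000 tokens/day * 365 days
API_USD_PER_MILLION_TOKENS = 25.0


def team_savings(devs: int) -> tuple[float, float]:
    """Return (billions of tokens per year, API dollars avoided per year)."""
    tokens = devs * TOKENS_PER_DEV_PER_YEAR
    return tokens / 1e9, tokens / 1e6 * API_USD_PER_MILLION_TOKENS


if __name__ == "__main__":
    for devs in (50, 500, 5000):
        billions, usd = team_savings(devs)
        print(f"{devs:>5} devs: {billions:.2f} B tokens/year, ${usd:,.0f} saved")
```

With these inputs, 50 developers come out at about 0.18 B tokens and roughly $4,562, so the table's $4,560 appears to be rounded.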
Some teams wait to adopt. Some are first in line

Looks like Sam won’t be alone: more teams are about to step into the fast-fashion era of SaaS.