Aug 8, 2025

NVIDIA, SLMs, and why small might just be the future of AI (again)

NVIDIA is betting big on Small Language Models (SLMs), and at Pieces, we've been building for this future all along. Learn how nano models, local inference, and smarter AI architecture are reshaping the landscape.

NVIDIA’s recent paper, “Small Language Models are the Future of Agentic AI,” dropped in June 2025 and immediately sparked conversations across X.

But if you’ve been following the space, or us, you’ll know we highlighted this trend in February 2025:


👉 Why companies are turning to small language models?

Now that the idea is catching on, let’s break it down:

NVIDIA argues that Small Language Models (SLMs, <10B parameters) can rival or outperform larger models on narrow, repetitive tasks like intent classification, tool use, and structured response generation. They’re more efficient, faster to deploy, and significantly cheaper.

Instead of relying solely on monolithic LLMs (like GPT-4), NVIDIA suggests a heterogeneous architecture: let SLMs do the heavy lifting for routine subtasks and use LLMs selectively for broader context or deep reasoning.
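To make the heterogeneous idea concrete, here is a minimal routing sketch. It is not NVIDIA's implementation or ours. The task categories come from the examples mentioned above, while the `Task` class, the `pick_model` function, and the "slm"/"llm" labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical task categories, based on the narrow, repetitive tasks named in the text.
ROUTINE_TASKS = {"intent_classification", "tool_use", "structured_response"}


@dataclass
class Task:
    kind: str    # what sort of work this is
    prompt: str  # the text that would be sent to the model


def pick_model(task: Task) -> str:
    """Send routine subtasks to a small model; fall back to a large model otherwise."""
    if task.kind in ROUTINE_TASKS:
        return "slm"
    return "llm"


tasks = [
    Task("intent_classification", "Is this a refund request?"),
    Task("tool_use", "Call the calendar API for next Tuesday."),
    Task("open_ended_reasoning", "Plan a migration across three services."),
]

routes = [pick_model(t) for t in tasks]
for task, route in zip(tasks, routes):
    print(f"{task.kind} -> {route}")
```

A real system would likely decide the route with a classifier or a confidence score rather than a fixed set of labels, but the shape is the same: the small model is the default, and the large model is the exception.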

Sound familiar?

That’s exactly the future we’ve been building toward at Pieces, with an even more radical angle: Nano Language Models (≤100M parameters) that run locally, on-device. 


Nano > LLM? Let’s talk costs

| Model Tier | Energy per Token | Source |
| --- | --- | --- |
| Nano (≤100M) | 1 mJ | GreenAI, Intel CPU tests |
| Cloud LLM (70B) | 3–4 J | MIT SuperCloud benchmarks |
| Frontier LLM (~1T) | 8–12 J | LLM Tracker survey |
| SLMs (1–34B) | 0.04–2.0 J | Interpolated from MIT & Pieces labs |


For a real-world comparison

| Category | Cloud GPT‑4o | Pieces Nano Models (Local) | Savings |
| --- | --- | --- | --- |
| Energy Usage | 3.65 kWh | 0.001 kWh | ↓ ~3.65 kWh |
| CO₂ Emissions | 1.46 kg | 0.0004 kg | ↓ ~1.46 kg CO₂ |
| Electricity Cost | $0.47 | $0.00013 | ↓ ~$0.47 |
| API Cost | $91.25 | $0 | ↓ $91.25 |
| Total Annual Cost | $91.72 | ~$0.00013 | ↓ ~$91.72 |
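The comparison does not spell out its inputs, so the sketch below works backwards from the figures. Every constant in it is an assumption inferred from the table, not a published number: 10,000 tokens per developer per day, 3.6 J per token for the cloud model, 1 mJ per token for the nano model, 0.4 kg of CO₂ per kWh, $0.13 per kWh of electricity, and $25 per million tokens of API usage. With those inputs, the arithmetic lands on the same values to within rounding.

```python
# All constants are assumptions back-derived from the comparison; none are official figures.
TOKENS_PER_DAY = 10_000            # assumed usage per developer
CLOUD_J_PER_TOKEN = 3.6            # assumed, within the 3-4 J range quoted for a cloud LLM
NANO_J_PER_TOKEN = 0.001           # 1 mJ, as quoted for nano models
CO2_KG_PER_KWH = 0.4               # assumed grid carbon intensity
USD_PER_KWH = 0.13                 # assumed electricity price
API_USD_PER_MILLION_TOKENS = 25.0  # assumed blended API price

JOULES_PER_KWH = 3.6e6


def annual_footprint(joules_per_token: float, api_usd_per_million: float) -> dict:
    """Yearly energy, emissions, and cost for one developer at the assumed usage."""
    tokens = TOKENS_PER_DAY * 365
    kwh = tokens * joules_per_token / JOULES_PER_KWH
    electricity = kwh * USD_PER_KWH
    api = tokens / 1e6 * api_usd_per_million
    return {
        "kwh": kwh,
        "co2_kg": kwh * CO2_KG_PER_KWH,
        "electricity_usd": electricity,
        "api_usd": api,
        "total_usd": electricity + api,
    }


cloud = annual_footprint(CLOUD_J_PER_TOKEN, API_USD_PER_MILLION_TOKENS)
nano = annual_footprint(NANO_J_PER_TOKEN, 0.0)  # local inference, so no API bill

for name, row in (("cloud", cloud), ("nano", nano)):
    print(name, {key: round(value, 5) for key, value in row.items()})
```

Changing any one assumption (heavier usage, a cheaper API tier, a cleaner grid) shifts the absolute numbers, but the gap of several orders of magnitude in energy per token is what drives the result.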


For teams? Multiply that

| Team Size | Tokens/year | API $ Saved |
| --- | --- | --- |
| 50 Devs | 0.18 B | $4,560 |
| 500 Devs | 1.8 B | $45,600 |
| 5,000 Devs | 18 B | $456,000 |
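The team figures look like straightforward multiplication of the per-developer numbers. The sketch below assumes $91.25 in API cost per developer per year (from the comparison earlier) and 3.65 million tokens per developer per year (an inferred value, roughly 10,000 tokens a day). It also assumes the savings are rounded to three significant figures, which is a guess at how the published totals were produced. The function names are hypothetical.

```python
import math

# Assumptions: per-developer values inferred from the article's figures.
API_USD_PER_DEV_PER_YEAR = 91.25
TOKENS_PER_DEV_PER_YEAR = 3_650_000


def round_sig(value: float, digits: int = 3) -> float:
    """Round a positive number to the given count of significant figures."""
    exponent = math.floor(math.log10(value))
    return round(value, digits - 1 - exponent)


def team_tokens_billions(devs: int) -> float:
    """Total tokens per year for the team, in billions."""
    return devs * TOKENS_PER_DEV_PER_YEAR / 1e9


def team_api_savings(devs: int) -> float:
    """Yearly API spend avoided by the team, rounded to three significant figures."""
    return round_sig(devs * API_USD_PER_DEV_PER_YEAR)


for devs in (50, 500, 5_000):
    tokens = team_tokens_billions(devs)
    saved = team_api_savings(devs)
    print(f"{devs:>5} devs: {tokens:.2g} B tokens/year, ${saved:,.0f} saved")
```

The relationship is linear, so any team size can be plugged in. The unrounded product for 50 developers is $4,562.50.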


Some teams wait to adopt. Some are first in line 

Looks like Sam won’t be alone: more teams are about to step into the fast fashion era of SaaS.
