Aug 8, 2025

NVIDIA, SLMs, and why small might just be the future of AI (again)

NVIDIA is betting big on Small Language Models (SLMs), and at Pieces, we've been building for this future all along. Learn how nano models, local inference, and smarter AI architecture are reshaping the landscape.

NVIDIA’s recent paper, “Small Language Models are the Future of Agentic AI,” dropped in June 2025 and immediately sparked conversations across X.

But if you’ve been following the space, or us, you’ll know we highlighted this trend in February 2025:


👉 Why companies are turning to small language models?

Now that the trend is catching on, let’s break it down:

NVIDIA argues that Small Language Models (SLMs, <10B parameters) can rival or outperform larger models on narrow, repetitive tasks like intent classification, tool use, and structured response generation. They’re more efficient, faster to deploy, and significantly cheaper.

Instead of relying solely on monolithic LLMs (like GPT-4), NVIDIA suggests a heterogeneous architecture: let SLMs do the heavy lifting for routine subtasks and use LLMs selectively for broader context or deep reasoning.

Sound familiar?

That’s exactly the future we’ve been building toward at Pieces, with an even more radical angle: Nano Language Models (≤100M parameters) that run locally, on-device. 
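
To make the heterogeneous idea concrete, here is a minimal routing sketch. It is not NVIDIA's or Pieces' implementation: the `Request` type, the `route` function, the task labels, and the stand-in models are all hypothetical names chosen for illustration. The only thing taken from the article is the split itself: routine subtasks go to the small model, everything else goes to the large one.

```python
from dataclasses import dataclass
from typing import Callable

# Task types the article lists as good fits for small models.
# The string labels themselves are hypothetical.
ROUTINE_TASKS = {"intent_classification", "tool_use", "structured_response"}


@dataclass
class Request:
    task_type: str
    prompt: str


def route(request: Request,
          slm: Callable[[str], str],
          llm: Callable[[str], str]) -> str:
    """Send routine subtasks to the small model, everything else to the large one."""
    handler = slm if request.task_type in ROUTINE_TASKS else llm
    return handler(request.prompt)


# Stand-in models so the sketch runs without any real backend.
def fake_slm(prompt: str) -> str:
    return f"[slm] {prompt}"


def fake_llm(prompt: str) -> str:
    return f"[llm] {prompt}"


if __name__ == "__main__":
    print(route(Request("intent_classification", "book a flight"), fake_slm, fake_llm))
    print(route(Request("open_ended_reasoning", "plan a migration"), fake_slm, fake_llm))
```

In a real system the routing decision would likely come from a classifier or a confidence score rather than a fixed set of labels; the fixed set just keeps the sketch short.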


Nano > LLM? Let’s talk costs

| Model Tier | Energy per Token | Source |
| --- | --- | --- |
| Nano (≤100M) | 1 mJ | GreenAI, Intel CPU tests |
| Cloud LLM (70B) | 3–4 J | MIT SuperCloud benchmarks |
| Frontier LLM (~1T) | 8–12 J | LLM Tracker survey |
| SLMs (1–34B) | 0.04–2.0 J | Interpolated from MIT & Pieces labs |


For a real-world comparison

| Category | Cloud GPT‑4o | Pieces Nano Models (Local) | Savings |
| --- | --- | --- | --- |
| Energy Usage | 3.65 kWh | 0.001 kWh | ↓ ~3.65 kWh |
| CO₂ Emissions | 1.46 kg | 0.0004 kg | ↓ ~1.46 kg CO₂ |
| Electricity Cost | $0.47 | $0.00013 | ↓ ~$0.47 |
| API Cost | $91.25 | $0 | ↓ $91.25 |
| Total Annual Cost | $91.72 | ~$0.00013 | ↓ ~$91.72 |
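
If you want to check the arithmetic behind a comparison like this, the sketch below does it in a few lines. The inputs are assumptions, not figures stated in the article: roughly 10,000 tokens per day (3.65M per year), 3.6 J per token for the cloud model, 1 mJ per token for the nano model, 0.4 kg CO₂ per kWh, $0.13 per kWh, and $25 per million tokens of API usage. They were back-derived because they happen to reproduce the numbers in the comparison above; the function name `annual_footprint` is also hypothetical.

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 3.6 million joules


def annual_footprint(tokens_per_year, joules_per_token, api_usd_per_million_tokens,
                     kg_co2_per_kwh=0.4, usd_per_kwh=0.13):
    """Yearly energy, emissions, and cost for one usage profile.

    All default rates are assumptions back-derived from the comparison table.
    """
    energy_kwh = tokens_per_year * joules_per_token / JOULES_PER_KWH
    electricity_usd = energy_kwh * usd_per_kwh
    api_usd = tokens_per_year / 1e6 * api_usd_per_million_tokens
    return {
        "energy_kwh": energy_kwh,
        "co2_kg": energy_kwh * kg_co2_per_kwh,
        "electricity_usd": electricity_usd,
        "api_usd": api_usd,
        "total_usd": electricity_usd + api_usd,
    }


TOKENS_PER_YEAR = 10_000 * 365  # assumption: 10k tokens per day

# Assumed per-token energy and API pricing for each column of the table.
cloud = annual_footprint(TOKENS_PER_YEAR, joules_per_token=3.6,
                         api_usd_per_million_tokens=25.0)
nano = annual_footprint(TOKENS_PER_YEAR, joules_per_token=0.001,
                        api_usd_per_million_tokens=0.0)

if __name__ == "__main__":
    for name, result in (("cloud", cloud), ("nano", nano)):
        print(name, {key: round(value, 5) for key, value in result.items()})
```

Under these assumptions the cloud column comes out at about 3.65 kWh and $91.72 per year, matching the table to the rounding shown. Swap in your own token volume and rates to see how the gap moves.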


For teams? Multiply that

| Team Size | Tokens/year | API $ Saved |
| --- | --- | --- |
| 50 Devs | 0.18 B | $4,560 |
| 500 Devs | 1.8 B | $45,600 |
| 5,000 Devs | 18 B | $456,000 |
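
The team numbers are a straight linear scale-up, which a short sketch makes easy to verify. It assumes each developer avoids the $91.25 annual API cost from the comparison above and generates about 3.6M tokens per year (0.18 B divided by 50 developers); the per-developer token figure is inferred, not stated, and `team_savings` is a hypothetical helper name.

```python
API_USD_PER_DEV_PER_YEAR = 91.25      # API cost row of the comparison table
TOKENS_PER_DEV_PER_YEAR = 3_600_000   # assumption: 0.18 B tokens / 50 devs


def team_savings(devs):
    """Return (tokens per year in billions, API dollars avoided per year)."""
    tokens_billions = devs * TOKENS_PER_DEV_PER_YEAR / 1e9
    api_usd_saved = devs * API_USD_PER_DEV_PER_YEAR
    return tokens_billions, api_usd_saved


if __name__ == "__main__":
    for devs in (50, 500, 5000):
        tokens_billions, api_usd_saved = team_savings(devs)
        print(f"{devs:>5} devs: {tokens_billions:.2f} B tokens/year, "
              f"${api_usd_saved:,.2f} API spend avoided")
```

With these inputs, 50 developers come out at $4,562.50, so the table's $4,560 looks like the same figure rounded to three significant digits.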


Some teams wait to adopt. Some are first in line 

Looks like Sam won’t be alone: more teams are about to step into the fast-fashion era of SaaS.
