We just released the Pieces MCP server, making it possible to interact with Pieces Long-Term Memory from any MCP-compatible client, including GitHub Copilot and Cursor (very cool, I know).
Now, you can ask your tools to do things based on your actual work history. Imagine saying:
"Based on the conversations I had yesterday with Laurin, update my package manifest to use the latest versions."
Your MCP client will call the Pieces MCP server, retrieve memories of that conversation, and use the client’s agent to apply the update directly inside your project.

What the Pieces MCP Server Does
If you aren't up-to-speed on MCP, check out our blog post called “What the heck is MCP and why is everyone talking about it?”.
The goal of the Pieces MCP server is simple. It lets you bring Pieces Long-Term Memory into whatever tools you already use. Instead of switching tabs, windows, or contexts, your memories become part of your workflow.
Essentially, you can keep Pieces running in the background and connect the MCP Server to your favorite tools. This way, you still use the AI tools/LLMs you know and love, but now they’ll respond with all of the great context captured in Pieces LTM.
You can ask any question in Cursor or GitHub Copilot chat, and Pieces will respond with the right context. Those memories are then processed by the client’s own LLM, giving you powerful agentic workflows built on your actual activity history.
How can I actually use it though?
Here’s how you can try it yourself:
Make sure Pieces is updated to the latest version.
Grab your local MCP server URL from the Pieces menu bar.
Add it to your MCP client (Cursor, GitHub Copilot, etc.).
Ask Pieces a time-based or source-aware question.
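
Optionally, before you point a client at it, you can sanity-check that the server is reachable. The sketch below is just an illustration using the open-source MCP Python SDK (pip install mcp) over SSE: it connects and lists the tools the server advertises. The URL is a placeholder — replace it with the one you copied from the Pieces menu bar.

```python
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client

# Placeholder: paste the actual URL shown in the Pieces menu bar.
PIECES_MCP_URL = "http://localhost:<port>/<path-from-pieces-menu>"


async def main() -> None:
    # Open the SSE transport, then run an MCP session on top of it.
    async with sse_client(PIECES_MCP_URL) as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()
            tools = await session.list_tools()
            # Print the tools the Pieces MCP server exposes to clients.
            for tool in tools.tools:
                print(tool.name, "-", tool.description)


if __name__ == "__main__":
    asyncio.run(main())
```

If that prints the Pieces tool listing, your MCP client should be able to use the same URL.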

For detailed tutorials, check out our setup guides:
Pieces MCP + Cline (coming soon)
And, if you prefer to watch videos, check out the official walkthroughs here:
Pieces MCP for GitHub Copilot:
Pieces MCP for Cursor:
Compatibility
We chose SSE (Server-Sent Events) for transport because it's simple, efficient, and already supported by PiecesOS. Unlike stdio-based MCP servers that require installing Node or other dependencies, the Pieces MCP server runs natively from your existing setup.
SSE works great with:
Cursor
GitHub Copilot
Windsurf
Cline
Goose
Note: Claude Desktop doesn’t support SSE yet. If you’re using Claude, you can bridge with an open-source gateway like lightconetech/mcp-gateway.
What kind of questions can you ask?
Try questions like:
"What was I working on yesterday?"
“Based on the meeting notes with Judson, update the README with the action items he mentioned.”
“Using the feedback from yesterday’s PR review, refactor the helper methods in utils.py.”
“Summarize the create signup page GitHub issue I was just reading. Implement this issue in this project.”

If your MCP client supports tool calling, it’ll decide when to call Pieces. You can also force it by saying, “Ask Pieces to…”
Under the hood
Now, if you’re like me, you’re probably thinking “yeah, yeah, this is cool, but what’s REALLY going on?”
Here’s what happens:
Your MCP client sends the prompt and tool metadata to its LLM.
The LLM determines that the ask_pieces_ltm tool should be used.
The client sends the tool request to the Pieces MCP server.
Pieces returns relevant memories, and the client uses its LLM to generate a response.
Pieces doesn’t do any of the LLM processing itself; it just sends back context for the client to work with. This keeps it modular and secure.
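
If you want to poke at that flow yourself, here’s a rough sketch of steps 3 and 4 done by hand with the MCP Python SDK, standing in for what your client does automatically. The URL is again a placeholder for the one in the Pieces menu bar, and the question argument name is an assumption — real clients read the tool’s actual input schema from the metadata it advertises.

```python
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client

# Placeholder: paste the actual URL shown in the Pieces menu bar.
PIECES_MCP_URL = "http://localhost:<port>/<path-from-pieces-menu>"


async def main() -> None:
    async with sse_client(PIECES_MCP_URL) as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()

            # Step 3: send the tool request to the Pieces MCP server.
            # The "question" argument name is illustrative; check the tool's
            # input schema from list_tools() for the real parameters.
            result = await session.call_tool(
                "ask_pieces_ltm",
                {"question": "What was I working on yesterday?"},
            )

            # Step 4: Pieces returns relevant memories as content blocks.
            # In a real client this text goes back to the LLM, not to stdout.
            for block in result.content:
                if block.type == "text":
                    print(block.text)


if __name__ == "__main__":
    asyncio.run(main())
```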
Supported features
Access from any MCP client: Works with tools like Cursor and GitHub Copilot (and soon Cline, codename goose, and many more!).
Time and source-based prompts: Ask about what you were doing last week, or filter by a specific app like Slack, Chrome, or VS Code.
Combine with other agents: Use the client’s LLM to make informed, context-aware modifications to your codebase or the active file.
Cost and token considerations
Each MCP tool call adds some token overhead. Your client LLM has to:
Include all tool descriptions in the first prompt.
Process the memory output in a second prompt.
If you aren't actively using Pieces, you can disable it to reduce token usage, then re-enable it when needed.
Try it out yourself
Install Pieces, open your MCP client of choice, and start asking questions with context. Whether you’re trying to recall what happened last Tuesday or use that info to generate code, Pieces is ready to help.
Let us know how you end up using Pieces MCP on X, LinkedIn, Bluesky, or Discord.
Happy coding!

