LLM of your choice

Use your choice of multiple LLMs

Choose from a range of LLMs that work best for you and comply with your organization’s AI governance rules. Use cloud models like GPT-4o, Gemini, or Claude, or on-device models like Llama 3 or Granite, accelerated by your GPU. Easily switch models mid-conversation if your needs change.

Try Pieces for free

Trusted by teams everywhere

Choose from 23 different LLMs

Pieces supports 23 different LLMs, both cloud and on-device, so you can choose the LLM that works best for you. We support the most popular cloud models, including Claude 3.5 Sonnet, OpenAI GPT-4o, and Gemini Pro 1.5, as well as the top local models such as Llama 3, Granite, and Gemma. As new models come out, we bring them to Pieces as quickly as we can.

Run LLMs on device

Pieces supports multiple on-device LLMs, powered by Ollama to make the most of your hardware, including NVIDIA GPUs and Apple Silicon-powered Macs. Use these on-device LLMs to chat with Pieces when you are offline, or in security- or privacy-focused environments.
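For context on what "powered by Ollama" means in practice: Ollama exposes local models through a REST API on your machine. Here is a minimal sketch, assuming a default Ollama install listening on `localhost:11434` and a pulled `llama3` model (both assumptions, not something Pieces requires you to set up yourself):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local endpoint


def build_chat_payload(model, messages):
    """Build the JSON body that Ollama's /api/chat endpoint expects."""
    return {"model": model, "messages": messages, "stream": False}


def chat(model, messages):
    """Send a chat request to a locally running Ollama server."""
    body = json.dumps(build_chat_payload(model, messages)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


# Example payload for a local Llama 3 model -- nothing leaves your machine:
payload = build_chat_payload(
    "llama3", [{"role": "user", "content": "Explain this stack trace."}]
)
```

Because the request never leaves localhost, this is what makes on-device chat viable offline or in privacy-sensitive environments.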

Switch LLM mid-conversation

Pieces allows you to change the LLM in use mid-conversation as your needs change. Started a chat with Claude, but need to go offline for a flight? Switch to a local LLM and continue the chat with all your conversation history and context intact. Don’t like the response from Gemini for your use case? Switch to GPT-4o and compare the responses.

Comply with your organization’s AI governance rules

AI governance is gaining importance as organizations want to limit how much of their IP, customer data, or other content is shared with AI models. Because Pieces supports so many cloud and on-device LLMs, it is easy to align your developer needs with the requirements of your organization. Bring-your-own-model support, with models deployed to your own infrastructure, is coming soon.

  • You need to try pieces out if you write code and feel that you need a true second brain, where you can basically store any function or code you've ever written and can use it again and again and again.

    Henry Rausch

    Quality Engineer @ FIC America Corp

  • Everyone's got a copilot. You're inverted, you've rotated the whole thing. It's not a vertical copilot, it's a horizontal one.

    Scott Hanselman

    VP of Developer Community @ Microsoft

  • Pieces Copilot has become much more efficient for any developer to ask any question and get a particular result. The LLMs in Pieces are sensitive to programming, so I think that gives better results.

    Ayush Kumar

    Data Analyst @ Accenture

  • I was playing around with live context, and just wow, I’m speechless. I mean, this is not just a coding assistant anymore, it’s a mentor that knows literally everything, a guardian angel.

    Domagoj Lalk

    CTO & Co-Founder @ Sparroww Inc.

1 million+ saved materials

17 million+ associated points of context

5 million+ copilot messages

Dive into the Pieces technical documentation to explore everything our platform offers

Explore

Learn how to optimize your workflow with Long-Term Memory, on-device AI, and switching between LLMs

Find solutions to common issues

Access additional tools, SDKs, and APIs for advanced integration

DOWNLOAD FOR FREE

Select the right LLM to fit your workflow


Frequently asked questions

What are the system requirements for running local models?

Different models have different requirements, with larger models needing more VRAM/RAM. See our local model documentation for our recommended minimum system specifications.


Do I need to provide an API key for cloud models?


What is the best model to use?
