Exploring Microsoft’s Phi-3-Mini and its integration with tools like Ollama and Pieces
Discover Microsoft’s Phi-3-Mini and its seamless integration with tools like Ollama and Pieces, enhancing AI-driven workflows.
As a developer who has worked with various Large Language Models (LLMs), I've seen firsthand how the right model can transform a coding workflow. Yet in a world dominated by large models, we often overlook small language models and their capabilities.
Recently, Microsoft unveiled Phi-4, a new addition to their small language model series that delivers enhanced completion quality at a smaller model size.
Phi-4 builds upon the groundwork established by Phi-3 and its variants.
While our primary focus in this article will be on phi-3-mini, it’s encouraging to see Microsoft continuously innovating.
If you find phi-3-mini intriguing, you might want to keep Phi-4 in mind as a potential upgrade for even better capabilities.
For more details, check out Microsoft’s official announcement.
In this blog, I'll guide you through what makes phi-3-mini an awesome choice.
We'll explore its core strengths, how to run it via Ollama, and how to integrate it with tools like Pieces.
By the end, you'll have a clear understanding of why phi-3-mini – especially variants like phi-3-mini-4k-instruct or phi-3-mini-128k-instruct – should be in your toolbox as an AI practitioner.
What is Phi-3-mini?
Phi-3-mini is a 3.8 billion-parameter language model developed by Microsoft as part of the Phi-3 series.
Designed for efficiency, it delivers performance comparable to larger models like GPT-3.5 while being optimized for devices with limited computational resources, such as notebooks and smartphones.
Phi-3-mini excels at reasoning and coding, making it ideal for offline applications and systems that don't require a long context window or heavy compute.
What sets Phi-3 apart as an LLM?
Phi-3 itself is a notable evolution in small language models, building upon the successes of its predecessor, Phi-2.
Phi-3's improvements aren't just incremental; the inclusion of extended context windows, such as in phi-3-mini-128k-instruct, fundamentally changes how you can provide information to the model and broadens its use cases.
With phi-3-mini's context window able to handle extensive documentation or multiple related files, you can maintain coherence in code suggestions even as complexity grows.
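If you want to try the extended context window locally, Ollama publishes tagged variants of the model. Here's a minimal sketch, assuming the phi3:mini-128k tag (check ollama list or the Ollama model library for the tags your version actually offers):

```bash
# Pull and run the 128k-context variant of phi-3-mini via Ollama.
# The tag name is an assumption; verify it in the Ollama model library.
ollama pull phi3:mini-128k
ollama run phi3:mini-128k
```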
What is phi-3 good for?
From my experience, what makes the Phi-3 family particularly strong is its ability to handle vast, nuanced prompts effectively. For its size, it performs remarkably well on language, reasoning, coding, and math benchmarks.
By training on high-quality synthetic data and employing an architecture that supports extended context (for example, the phi-3-mini-128k-instruct and phi-3-mini-4k-instruct variants), these models deliver more accurate, context-aware code completions and reasoning.
Can you generate code with Phi-3 Mini?
Phi-3-mini emerges from Microsoft’s Phi-3 family of models, designed to excel in both text and code generation.
This “mini” variant channels the strengths of Phi-3 into a slimmer package, ensuring you retain robust performance without excessive resource demands.
Whether you're working within a constrained environment or looking to integrate smart code assistance into smaller workflows, phi-3-mini is engineered to adapt – think on-device Android integrations or containerized services.
We've incorporated parts of the Phi-2 model into Pieces for on-device ML; more on that in a moment 😉
How to run phi-3-mini
Getting started with phi-3-mini is relatively straightforward.
You can download phi-3-mini directly from sources like Hugging Face; check the model's hardware requirements first to ensure compatibility.
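For example, here's a minimal sketch of pulling the instruct checkpoint with the Hugging Face CLI (the repository ID matches Microsoft's published model card; verify it there before downloading):

```bash
# Requires the Hugging Face CLI: pip install -U "huggingface_hub[cli]"
# Download the 4k-context instruct checkpoint to a local directory.
huggingface-cli download microsoft/Phi-3-mini-4k-instruct --local-dir ./phi-3-mini
```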
Most teams find that phi-3-mini's compact footprint translates into easier deployments, whether on local machines or modest cloud setups. For organizations needing full-scale enterprise solutions, consider deploying Phi-3.5-mini on Azure.
As highlighted in Azure's blog post, managed deployment brings production-grade stability, security, and the convenience of scaling up or down as your project requires.
Integrating Phi mini models with Pieces
At Pieces, we’ve previously showcased what’s possible by building copilots with earlier Phi models.
In our copilot integration with Phi-2, we demonstrated how to connect an LLM’s capabilities to the Pieces Client seamlessly.
The same strategy applies to phi-3-mini: Pieces can store and manage a library of snippets, supply real-time context, and make feeding that context into phi-3-mini effortless.
This synergy accelerates tasks like code generation, refactoring, and maintaining a consistent coding style across your team.
The Pieces CLI is an awesome place to experiment with some of these capabilities.
A practical example with Ollama and Pieces
To illustrate phi-3-mini in action, I recommend exploring its integration with Ollama and Pieces CLI.
Ollama allows you to interact with LLMs locally, providing a straightforward environment for experimentation.
Pieces complements this by managing and organizing your code snippets, ensuring that whenever you return to phi-3-mini for refinement or a fresh snippet, everything is at your fingertips.
Prerequisites:
Ollama installed and configured.
The phi-3 model (or a phi-3-mini variant) available locally – see the pull command below.
Pieces CLI installed and authenticated.
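To satisfy the model prerequisite, pull it through Ollama. phi3 is the name Ollama's library uses for the 3.8B mini variant; confirm the exact tag in your version:

```bash
# Fetch the phi-3-mini weights into Ollama's local model store.
ollama pull phi3
```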
Steps for generating code with Ollama
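Ask the model for the code you need straight from the terminal. A sketch with a hypothetical prompt (the CSV-parsing task is only an example):

```bash
# Run phi-3-mini with a one-shot prompt and print its reply.
ollama run phi3 "Write a Python function that parses a CSV file and returns the rows as a list of dictionaries."
```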
Expect a code snippet that does just this. Review it to ensure it meets your needs, and try a different prompt if you want a better outcome.
Store snippet in Pieces
Add the returned code to Pieces for easy retrieval later.
Make sure Pieces CLI is installed and properly configured on your machine.
Copy the code from the terminal; the Pieces CLI can read your clipboard to store it.
Run the Pieces create command in the terminal; it will pick up the copied code and store it in the Pieces platform, where you can reuse it later from anywhere.
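With the snippet on your clipboard, saving it is a single command (behavior may vary slightly between CLI versions):

```bash
# Save the current clipboard contents as a new snippet in Pieces.
pieces create
```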
Want to see your saved snippet? Use the command below, or simply organize it in the Pieces for Developers Desktop App.
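A minimal sketch for listing stored snippets – I'm assuming a list subcommand here; run pieces --help to confirm what your CLI version offers:

```bash
# List the snippets saved in Pieces.
pieces list
```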
Refine as needed
Retrieve the snippet, refine it with phi-3-mini through Ollama by adjusting the prompt, and store the improved version back in Pieces.
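A refinement pass might look like this (the prompt and placeholder are purely illustrative):

```bash
# Feed the stored snippet back to phi-3-mini for a refinement pass.
ollama run phi3 "Add type hints and error handling to this function: <paste your snippet here>"
```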
Over time, you build a curated library of production-ready snippets, shaped by your specific needs.
What is the downside of Phi-3-mini?
Though the Phi-3 model excels at many things, it still struggles with context window overflow: when the context overflows, the model replies with gibberish.
This has been reported by the community and is likely to be fixed in a future ONNX release.
Wrapping it up
Phi-3-mini demonstrates that smaller doesn't mean weaker.
By delivering strong code generation and reasoning capabilities, handling extensive contexts gracefully, and offering adaptability through fine-tuning, phi-3-mini positions itself as an invaluable tool in modern development pipelines.
When combined with tools like Ollama and Pieces, it can significantly enhance productivity, consistency, and the overall developer experience.
If you’re looking to streamline code generation, maintain context across complex projects, and integrate seamlessly with existing workflows, phi-3-mini deserves serious consideration.
With Microsoft’s introduction of Phi-4, the Phi series continues to push the boundaries of what smaller language models can achieve.
As you integrate phi-3-mini into your workflow and potentially fine-tune it for specific tasks, know that there’s a natural progression toward even more capable models like Phi-4.
This ensures you’re not just investing in a single model, but in an evolving ecosystem that grows with your needs.