Your practical developer’s guide to LLM prompt engineering
Prompt engineering involves crafting specific questions for LLMs to receive targeted responses, tailored to the audience and desired format.
In the 2006 James Bond movie Casino Royale, Bond is seated at the poker table and is offered a drink. First he asks for a dry martini, but then quickly corrects himself and asks for:
3 measures of Gordon's
1 of vodka
Half a measure of Kina Lillet
Shake it over ice, then add a thin slice of lemon peel
Now apart from making me thirsty and giving me a desire to spend time back in the novels of Ian Fleming, there is a point to this. If Bond had stuck with ‘dry martini’, he’d have ended up with a standard martini, maybe with gin, maybe with vodka. Instead, by providing more detailed instructions, he enjoyed a Vesper, a far superior choice. This is very similar to how we interact with large language models, or LLMs. The clearer the instructions we provide the LLM, the better and more aligned with our needs the output will be.
This process is called Prompt Engineering, and this post will cover some of the basics of prompt engineering to help you get the answers you need from AI developer tools like Pieces.
What is prompt engineering?
Seeing as prompt engineering is related to LLMs, let’s ask an LLM to define prompt engineering:
Using the Pieces Copilot, I asked Claude 3.5 Sonnet the following question: “Give me a 100 word summary of prompt engineering aimed at a non-technical user of tools like ChatGPT”
And got back a 103-word answer that began: “Prompt engineering is the art of crafting effective instructions…”
To summarize – prompt engineering is coming up with an effective question to ask the LLM so that you get the answer you need in the format you want. In this question, I not only asked for a summary of prompt engineering, but also specified the word limit and the target audience. This ‘engineered’ a short, 103-word response targeted at a non-technical user. If I had asked for a longer summary aimed at a developer, the response would be different.
As an example for developers, if you ask an LLM to give you a class to represent a user, it will probably give you a class in a popular language like Python or JavaScript, with a wide range of fields. Not helpful if you want a `User` class in C# that just has `Name`, `Email`, and `PhoneNumber` fields. To get a better response you need to specify the language and the required fields.
Let’s now dig into what makes a good prompt, and some techniques for getting the most out of your conversations with an LLM.
What are the components of a good LLM prompt?
The summary of prompt engineering by Claude mentioned above starts with “Prompt engineering is the art of crafting effective instructions”. The term art is used, and whilst there is a certain amount of art to crafting prompts based on experience, it is more science than art. There are some well-defined components that make up an effective prompt.
Context
Context is the additional information that the LLM needs to understand your question, beyond what it has already been trained on. By adding more context, you can keep the user question itself shorter and get a more relevant answer.
For a developer-focused AI tool, the context you need probably comes from existing files or folders of code – this is how you can ask questions about an existing project. There are two ways to add context to your prompt: either directly inside the chat, or by leveraging the features of the tool you are using to pull in files or folders.
To add the context directly to the chat, you add the code inline as part of your question. For example, you might paste a small snippet and ask something like:
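```
What does this C# method do, and how can I make it more readable?

public int Process(int[] values)
{
    int r = 0;
    foreach (var v in values)
        if (v % 2 == 0) r += v;
    return r;
}
```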
This provides the code that the LLM needs. This is ideal for small snippets of code, but less easy to do when you are dealing with large blocks of code or even multiple files. This is where a good AI developer tool helps, allowing you to set the context using individual files, or folders of code (or even the Pieces Long-Term memory across everything you are doing on your developer machine). The more relevant context you have, the better the answer.
The one consideration to bear in mind: while more relevant context gives a better answer, more irrelevant context gives a worse one. LLMs have a limited context window – the maximum amount of input you can send to the LLM. If the context is too large, the LLM won’t be able to process your prompt, and you will need to rely on your chat tool to strip the context down to just what is relevant.
You can narrow down the context by using relevancy detection, so that you can pass in a folder of code and have just the relevant parts sent to the LLM – but you should still guide this. If you pass in multiple projects as context, the tool will struggle to find just the relevant context to send, and the LLM will struggle to give you a good answer to a project-related question. It is better to use the smallest context that is relevant to the question you are asking.
User question
The user question is the core of what you want to ask the LLM.
For example, you can ask a question relying on the LLM’s training data and any context added, or define a problem and ask for a solution. When you are planning the question, aim to make it single purpose, short, and specific.
Single purpose
Ask for a single thing. LLMs are great when given a single task to do, but don’t do so well when asked for multiple things at once. They will try, but the answer will contain a mixture of all the tasks, so it will lack detail. You will get a better response by asking multiple questions, one per task. As well as a limited context window for inputs, LLMs also have a smaller output window – the more disparate topics that have to fit into the output window, the smaller the answer for each will be.
For example, if you want to add comments to one class and refactor another, you would do these as two separate prompts, not one that combines both.
Specific
Be specific with what you want from the LLM. If you are vague, the answer probably won’t align with your needs. “Give me a user class” will give you a class in whatever the LLM thinks is the most likely programming language. “Give me a user class in C#” will give you a class in C#. “Give me a user class in C# with UserId, Name, and PhoneNumber properties. Add equality operators, make the UserId read-only, and use a primary constructor” will give you a very precise class definition with just the properties you need, equality, and a primary constructor that sets the read-only `UserId`.
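That last prompt might produce something like this (a sketch assuming C# 12 primary constructor syntax – the exact output will vary by model):

```csharp
// A possible response: a class with a primary constructor,
// a read-only UserId, and equality operators.
public class User(int userId, string name, string phoneNumber) : IEquatable<User>
{
    public int UserId { get; } = userId;
    public string Name { get; set; } = name;
    public string PhoneNumber { get; set; } = phoneNumber;

    public bool Equals(User? other) =>
        other is not null &&
        UserId == other.UserId &&
        Name == other.Name &&
        PhoneNumber == other.PhoneNumber;

    public override bool Equals(object? obj) => Equals(obj as User);

    public override int GetHashCode() => HashCode.Combine(UserId, Name, PhoneNumber);

    public static bool operator ==(User? left, User? right) =>
        left?.Equals(right) ?? right is null;

    public static bool operator !=(User? left, User? right) => !(left == right);
}
```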
Short
LLMs work better with concise prompts. The clearer the instruction, the better the answer. Short, concise prompts reduce the chance of the LLM being misdirected by irrelevant information in the prompt.
Output guidance
LLMs are trained on a large amount of data, and can often come up with a good response with minimal guidance. Sometimes, however, the format of the response is not what you want. In this case, the best approach is to provide examples of the output that you want and pass these in the prompt as guidance.
Zero-shot prompting
Zero-shot prompting relies on the LLM to decide how to output the response based on how it’s been trained. This is ideal for situations like generating code where the output will be formatted based on the huge range of code that the model is trained on.
For example, if you wanted to generate some unit tests for the `User` class mentioned earlier, you could add the `User` class as context, and ask the LLM something like:
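```
Create unit tests for the User class
```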
The output might not be exactly as you would write the code, but it will be good enough to get started.
Few-shot prompting
Few-shot prompting is where you give the LLM a number of examples of how you want the output, and the LLM can use these to create the output in the format you want, interpolating ideas from the examples. This is great for situations where you need a specific format for the output, such as generating data. As long as the examples have a consistent structure to them, the LLM can use these.
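For example, to generate test data for a `User` class with an id, name, email, and phone number, you could give the LLM a couple of sample rows to copy (hypothetical examples – the names and numbers are made up):

```
Create 100 rows of test data for the User class. Use these examples for the format:

var user1 = new User { UserId = 1, Name = "Ava Jones", Email = "ava.jones@example.com", PhoneNumber = "555-0101" };
var user2 = new User { UserId = 2, Name = "Ben Carter", Email = "ben.carter@example.com", PhoneNumber = "555-0102" };
```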
This is enough to guide the LLM to create rows with incrementing user Ids (1, 2, and so on up to 100), stored in variables with names that match the Id (user1, user2, and so on up to user100), with random names that are also used in the email addresses.
Techniques for prompt engineering
There are multiple techniques to consider to help you get the most out of the prompts you use with an LLM. Here are a couple of the more important ones.
Prompt chaining
Thinking back to our James Bond example from earlier, in Casino Royale after Bond orders the Vesper martini, another player at the table says “I’ll have one of those”, with someone else following up with “So will I”. The waiter has the conversation history from Bond’s order to know that “I’ll have one of those” means the player wants a Vesper, and “So will I” means the next player wants the same as the previous order, which is also a Vesper.
This is the same in the conversations we have with AI chat tools. When we ask a question, that question and its response become the context for the next question. This allows us to iterate on a prompt by chaining prompts together and giving the LLM follow-up instructions to correct its answer.
As a simple example, imagine you want a `User` class in C# that just has `Name`, `Email`, and `PhoneNumber` fields.
You can start by asking the LLM:
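```
Give me a user class
```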
The answer might be something like:
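```python
# A likely default for an unqualified prompt: a Python class
class User:
    def __init__(self, name, email, phone_number, address):
        self.name = name
        self.email = email
        self.phone_number = phone_number
        self.address = address
```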
Obviously this is in the wrong programming language. You can correct the LLM instead of asking a full new question:
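```
Rewrite this in C#
```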
And this will give something like:
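```csharp
public class User
{
    public string Name { get; set; }
    public string Email { get; set; }
    public string PhoneNumber { get; set; }
    public string Address { get; set; }
}
```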
The follow-up prompt doesn’t restate the original question, but instead relies on the conversation history to provide relevant context. You can then follow up again:
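```
Only include Name, Email, and PhoneNumber fields
```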
And this gives something like:
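```csharp
public class User
{
    public string Name { get; set; }
    public string Email { get; set; }
    public string PhoneNumber { get; set; }
}
```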
Now this is a very simple example, but it shows the point – you can rely on the conversation history to tell the LLM to ‘correct’ its previous response. As you chain the prompts you will narrow down on the answer you need. You can also use prompt chaining to get additional information that is relevant to the conversation history. For example, after asking for a `User` class, you can chain prompts to ask for code to save this to a database or add unit tests, relying on the context of the previous answer that defined the class.
Single vs multiple conversations
You can have multiple concurrent conversations with the AI, so when do you re-use a conversation for your next prompt, and when do you create a new one? The answer is, of course, it depends. With each separate conversation, you can have a separate chain of prompts.
I like to divide conversations by context – if there is nothing in the existing conversation that is a relevant context for my next question, then I start a new conversation.
If I have one conversation about Project A and need to ask a question about Project B then that is a new conversation. Any time I need to add context from a different project, that is a new conversation.
If I’m researching, then each topic is a new conversation. If I’m working on well-defined tasks such as Jira tickets or GitHub issues, then each task is a different conversation.
The upside to this is that it keeps each conversation focused. The downside is that I have a lot of conversations, so finding earlier discussions when task switching can be hard (top tip – name your conversations with a relevant name, such as the Jira ticket number or project).
You also don’t want your conversations to be too long – as mentioned above, LLMs have limits on the context window size, and under the hood LLMs are actually stateless, so the conversation history is implemented by the AI tool, which passes the history as context with each prompt.
If the history is too long, the chat tool will need to send just what it thinks is relevant from the history each time. The larger the history, the harder it is to ensure that all the relevant history is passed.
This stateless nature of LLMs is why tools like Pieces allow you to switch LLM mid-conversation. As the history needs to be passed each time, with Pieces you can start a conversation with Claude, then hop to Llama if you lose internet access (such as on a plane), and then back to ChatGPT for a different set of answers.
Chain-of-thought prompting
Chain-of-thought prompting involves providing the LLM with multi-step instructions to solve a problem, rather than asking for a single answer. These steps align with the kind of steps a human would chain together to solve the same problem. This is a very powerful way to get code created when you have some idea upfront of the algorithm you want to use, or if you need a specific output. For example, you might ask something like:
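```
Create unit tests for the User class:
1. First, consider the core functionality of the class and identify what needs to be tested.
2. Next, create unit tests for this functionality.
3. Finally, consider edge cases such as null or empty values, and add unit tests for these.
```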
This prompt gives three defined instructions that guide the LLM.
First, it asks the LLM to consider the core functionality of the class to define what things need to be included in the tests.
Next, it asks the LLM to create the unit tests.
Finally, it guides the LLM to consider edge cases. By giving these specific instructions to the LLM, the LLM will give a better answer than simply asking “Create unit tests for this class”.
You can also try zero-shot chain-of-thought prompting, where instead of providing a sequence of steps, you end your question with the sentence “Let’s think step by step”. This can be enough for the LLM to define the steps itself. For example:
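```
Create unit tests for the User class. Let's think step by step.
```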
Working with LLMs
LLMs are like most tools – only as powerful as their user.
Give a carpenter a hammer and chisel and you will get a thing of beauty; give me a hammer and chisel and you’ll get a ride to the emergency room to sew my finger back on. To get the most out of an LLM, you need to understand how to engineer a good prompt.
The trick is to add the right context, be clear and concise with your questions, and give guidance on the output that you want.
Then add advanced techniques like prompt chaining and chain-of-thought prompting, structure your conversations in the best way, and you’ll soon be crafting the prompts you need to get the answers you want.