Prompt engineering is a crucial technique for improving the performance of large language models (LLMs), allowing users to customize the model’s behavior to specific tasks or domains. By designing effective prompts, users can leverage the full potential of LLMs and achieve state-of-the-art results across a wide range of natural language processing applications.

In May 2023, OpenAI helped set up a great introductory course on prompt engineering, certainly worth going through for anyone interested. Find the course HERE.

This article first gives a short introduction to prompt engineering and then quickly dives into the deep end: how to optimize prompts for the best results.

What is prompt engineering?

You receive answers from Large Language Models (LLMs) by asking questions (prompts). Although some LLMs such as ChatGPT have an ‘alignment’ layer that reshapes prompts so the model interprets them better, this does not always work well and can have adverse effects. Ideally you want to work directly with the model to have full control. This brings a different challenge: ‘garbage in -> garbage out’.

To use LLMs optimally, a user needs to become skilled at prompt engineering. This requires understanding how LLMs work and having a precise idea of what is truly being asked (and what is expected in return). It is also important to stay skeptical about the received answers, as shown in Figure 1.

Figure 1: Logical reasoning questions can be hard for LLMs. It’s not impossible, but it requires careful prompt engineering to get consistently accurate answers.

In this article, examples are generated using ChatGPT, but serious prompt engineering is done through the API or the OpenAI Playground, which give direct access to the models with detailed settings adjustments. See THIS GitHub repository for an introduction to the technical aspects of setting up a model.

Advanced Techniques For Prompt Engineering

This article assumes you know how to access the LLM of your preference and ask it basic questions. OpenAI gives tips to improve results; here are the ones I found most effective:

  1. Use the latest models

For best results, we generally recommend using the latest, most capable models. As of November 2022, the best options are the “text-davinci-003” model for text generation, and the “code-davinci-002” model for code generation.

2. Be specific, descriptive and describe in detail the desired context, outcome, length, format, style, etc

Instead of a simple prompt such as ‘write a poem about OpenAI’ try:

Write a short inspiring poem about OpenAI, focusing on the recent DALL-E product launch (DALL-E is a text to image ML model) in the style of a {famous poet}

3. Articulate the desired output format through examples

Instead of asking a question such as ‘extract the four types of entities out of the text’, try:

Extract the important entities mentioned in the text below. First extract all company names, then extract all people names, then extract specific topics which fit the content and finally extract general overarching themes

Desired format:
Company names: <comma_separated_list_of_company_names>
People names: -||-
Specific topics: -||-
General themes: -||-

Text: {text}
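As a minimal sketch, the output-format example above can be turned into a reusable prompt template in Python. The `build_entity_prompt` helper and the sample text are illustrative, not part of any library:

```python
def build_entity_prompt(text: str) -> str:
    """Build an extraction prompt that spells out the desired output format."""
    return (
        "Extract the important entities mentioned in the text below. "
        "First extract all company names, then extract all people names, "
        "then extract specific topics which fit the content and finally "
        "extract general overarching themes\n"
        "\n"
        "Desired format:\n"
        "Company names: <comma_separated_list_of_company_names>\n"
        "People names: -||-\n"
        "Specific topics: -||-\n"
        "General themes: -||-\n"
        "\n"
        f"Text: {text}"
    )

prompt = build_entity_prompt("OpenAI launched DALL-E, a text-to-image model.")
print(prompt)
```

Keeping the format specification in one place makes it easy to reuse the same template across many input texts.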

Let’s dive into the deep end and describe four advanced techniques to get the best results.

  • Few-shot prompts
  • Chain-of-Thought (CoT) prompting
  • Self-Consistency
  • Knowledge Generation Prompting

Few-Shot prompts

Few-shot prompting means asking a few questions (shots) to prime the model to generate better answers for similar but more complex questions. If you ask a complex logic question directly, the model will likely get it wrong. If instead you first ask a few more fundamental logic questions, the chance that it answers the complex logic question correctly later on increases significantly.

Figure 2: Giving a model a few example questions can increase accuracy on future questions.

Applying few-shot learning we can fix the mistake GPT made earlier on in this article, as seen in Figure 3:

Figure 3: After asking a few simple logic questions the model can now answer this logic question correctly.
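The priming step can be sketched as a prompt builder that prepends solved example pairs before the real question. `few_shot_prompt` and the example questions here are illustrative; the actual model call is left out:

```python
def few_shot_prompt(examples, question):
    """Prepend a few solved Q/A pairs (the 'shots') before the real question."""
    shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return f"{shots}\n\nQ: {question}\nA:"

# Simple logic questions that prime the model for a harder one.
examples = [
    ("All cats are animals. Tom is a cat. Is Tom an animal?", "Yes."),
    ("No fish can fly. A salmon is a fish. Can a salmon fly?", "No."),
]
print(few_shot_prompt(
    examples,
    "All roses are plants. Plants need water. Do roses need water?",
))
```

The trailing "A:" invites the model to continue the established question/answer pattern rather than improvise its own format.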

Chain-of-Thought (CoT) prompting

Responses can be improved further (in combination with few-shot prompts) by asking the model to share its chain of thought. This gives insight into how it arrives at its results.

Figure 4: Quite an extensive chain of thought, as would be expected for such a complex question. It gives a clear insight into how the answer was attained.
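One simple way to elicit a chain of thought in a plain completion-style prompt is to append a reasoning cue; the trigger phrase below is the well-known "Let's think step by step", and the helper name is illustrative:

```python
def cot_prompt(question: str) -> str:
    """Append a reasoning cue so the model writes out its steps before answering."""
    return f"{question}\n\nLet's think step by step."

print(cot_prompt(
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 "
    "more than the ball. How much does the ball cost?"
))
```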


Self-Consistency

As THIS paper describes, self-consistency first samples a diverse set of reasoning paths instead of only taking the greedy one, and then selects the most consistent answer by marginalizing out the sampled reasoning paths.

Figure 5: Instead of directly choosing the most ‘greedy’ path, as is standard, it reviews multiple paths and picks the one that gives the most consistent answer.

Applying this concept in your prompt design simply means running the same question multiple times and seeing what the most common output is. This process can be automated: run important prompts x times and choose the most consistent answer as the correct one.
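That automation can be sketched as follows. `query_model` is a placeholder for whatever sampling call you use (run with a temperature above 0 so each run can follow a different reasoning path); the majority vote itself is plain Python:

```python
import random
from collections import Counter

def most_consistent(answers):
    """Majority vote: return the answer that appears most often."""
    return Counter(answers).most_common(1)[0][0]

def self_consistent_answer(query_model, prompt, n=5):
    """Sample the model n times and keep the most common answer."""
    return most_consistent([query_model(prompt) for _ in range(n)])

# Stand-in for a real LLM call: a noisy "model" that is right 80% of the time.
def query_model(prompt):
    return "42" if random.random() < 0.8 else "41"

print(self_consistent_answer(query_model, "What is 6 * 7?", n=9))
```

With a real API this costs n times as many tokens per question, so it is best reserved for prompts where accuracy matters more than cost.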

Knowledge Generation Prompting

The ‘generated knowledge’ approach to prompt design uses a pre-trained language model to first generate facts or examples related to the task at hand, which are then used to construct the prompt with which the model completes the task or solves the problem. This approach has been shown to improve the performance of language models on a variety of natural language processing tasks.

Figure 6: This example shows that first generating some facts improves the resulting output.
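A two-stage sketch of this approach is shown below. Both prompt templates and the hard-coded `facts` string are illustrative assumptions; in practice the facts would come from a first model call, and the combined prompt would go to a second:

```python
def knowledge_prompt(question: str) -> str:
    """Stage 1: ask the model to generate relevant facts first."""
    return (
        "Generate three factual statements relevant to the question below.\n\n"
        f"Question: {question}\nFacts:"
    )

def answer_prompt(question: str, facts: str) -> str:
    """Stage 2: feed the generated facts back in and ask for the answer."""
    return (
        f"Facts:\n{facts}\n\n"
        "Using the facts above, answer the question.\n"
        f"Question: {question}\nAnswer:"
    )

# In a real pipeline this string would be the model's stage-1 output.
facts = "Golf is scored by counting strokes. The lowest total score wins."
print(answer_prompt("Is a higher score better in golf?", facts))
```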

These are some of the best and easiest-to-incorporate improvements to prompt writing, and they yield significant results. For further learning I advise the OpenAI website.
