Introduction
This article explores an advanced prompt engineering technique known as in-context learning, a method for obtaining precise results from Large Language Models (LLMs) without the need for extensive model training. This cost-effective approach leverages the LLM’s existing capabilities with minimal data collection, making it well suited to specific use cases.
In-context learning gained popularity through the original GPT-3 paper, which introduced it as a method for language models to learn tasks from just a few examples. The language model (LM) is given a prompt containing a series of input-output pairs that exemplify a particular task, and a test input is appended after these examples. The LM then makes its prediction by conditioning on this prompt alone, generating the next tokens without any update to its weights.
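To make the mechanics concrete, here is a minimal sketch of how such a prompt can be assembled programmatically. The function name, the instruction text, and the example pairs below are illustrative choices rather than part of any particular framework; any equivalent formatting works.

```python
# Minimal sketch: assembling a few-shot (in-context learning) prompt.
# build_few_shot_prompt, the instruction text, and the example pairs are
# illustrative, not taken from any specific library.

def build_few_shot_prompt(instruction, examples, test_input):
    """Concatenate an instruction, labeled input-output pairs, and a test
    input into a single prompt string for a large language model."""
    lines = [instruction, ""]
    for customer_text, label in examples:
        lines.append(f'Customer: "{customer_text}"')
        lines.append(f"Chatbot Classification = {label}")
    # Append the test input with an empty label; the model predicts the next
    # tokens (the classification) by conditioning on everything above it.
    lines.append(f'Customer: "{test_input}"')
    lines.append("Chatbot Classification =")
    return "\n".join(lines)


if __name__ == "__main__":
    examples = [
        ("I am interested in learning more", "Interested"),
        ("No thanks", "Not_Interested"),
    ]
    prompt = build_few_shot_prompt(
        "Classify each customer message as Interested or Not_Interested.",
        examples,
        "I am in the middle of something. Can we talk some other time?",
    )
    print(prompt)
```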
Note: Sections of text shown in italics and green are prompts that can be entered verbatim in a chat application such as ChatGPT, Bard, or others.
Prompt provided including labeled examples | Response from GenAI system |
You are a chatbot. Customers chat with you. Here are some examples of conversation samples and the classification to assign. Your job is to assign one of the two classifications: Interested or Not_Interested. Customer: “I am interested in learning more” Chatbot Classification = Interested Customer: “Sure, tell me more” Chatbot Classification = Interested Customer: “I would love to know more” Chatbot Classification = Interested Customer: “I am not interested” Chatbot Classification = Not_Interested Customer: “I am in the middle of something. Can we talk some other time?” Chatbot Classification = Not_Interested Customer: “I already have someone helping me” Chatbot Classification = Not_Interested Customer: “I am doing my own management” Chatbot Classification = Not_Interested Customer: “No thanks” Chatbot Classification = Not_Interested | |
Customer: “dont waste my time” Chatbot Classification = Customer: “do you have nothing better to do?” Chatbot Classification = Customer: “Today is not a good day for me as I have to walk my dog and run some errands. Maybe some other day we can try and connect.” Chatbot Classification = Customer: “This is such good timing. I just won the lottery last week and I have been searching and going to church for something.” Chatbot Classification = Customer: “The price of milk is going up all the time. Now even the chickens are on strike and wont lay eggs.” Chatbot Classification = | Based on the provided examples and the classifications assigned to them, here are the classifications for the remaining customer responses: Customer: “dont waste my time” Customer: “do you have nothing better to do?” Customer: “Today is not a good day for me as I have to walk my dog and run some errands. Maybe some other day we can try and connect.” Customer: “This is such good timing. I just won the lottery last week and I have been searching and going to church for something.” Customer: “The price of milk is going up all the time. Now even the chickens are on strike and wont lay eggs.” |
The highlighted items in the prompt do not have a classification; in the response, the GenAI system fills those classifications in. Based on the above example, you can see how a few labeled examples are enough for the LLM to learn the task in context. The model then infers the category whenever a new input is given.
Now, if I submit the following prompt as a follow-on, the system responds accurately.
Customer: “My family is taking up a lot of time.”
Classification =
Completion from ChatGPT:
Classification = Not_Interested
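The same follow-on classification can be done programmatically. The sketch below assumes the OpenAI Python SDK and uses a placeholder model name; substitute whichever provider, client library, and model you actually use.

```python
# Hedged sketch: sending the few-shot prompt to a chat model via the OpenAI
# Python SDK. The model name is a placeholder, and the expected output is the
# completion reported above; adjust for your own provider.
from openai import OpenAI

FEW_SHOT_PROMPT = """You are a chatbot. Customers chat with you. Your job is to assign one of the two classifications: Interested or Not_Interested.
Customer: "I am interested in learning more" Chatbot Classification = Interested
Customer: "Sure, tell me more" Chatbot Classification = Interested
Customer: "I am not interested" Chatbot Classification = Not_Interested
Customer: "No thanks" Chatbot Classification = Not_Interested
Customer: "My family is taking up a lot of time."
Classification ="""

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4",  # placeholder; use any capable chat model
    messages=[{"role": "user", "content": FEW_SHOT_PROMPT}],
    temperature=0,  # keep the classification deterministic
)
print(response.choices[0].message.content)  # expected: Not_Interested
```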
In-context learning compared to traditional supervised learning
Let us now discuss the benefits of in-context learning compared to traditional supervised machine learning.
In-context learning in language models like GPT-4 has several advantages over traditional supervised learning classifiers:
Versatility: It enables models to tackle diverse tasks without task-specific training, unlike classifiers that require specific training datasets.
Reduced Data Needs: Only a few examples are needed for guidance, as opposed to large, labeled datasets required by traditional classifiers.
Quick Adaptation: Language models can swiftly adjust to new tasks with minimal examples, whereas classifiers often need retraining or fine-tuning.
Lower Training Costs: In-context learning avoids the computational expense of retraining for each new task, making it more efficient than classifiers that demand extensive computational resources.
Handling Complex Language: These models excel in understanding and generating nuanced language, outperforming classifiers in tasks involving natural language processing.
Continuous Learning: Language models can continuously integrate new information without explicit retraining, unlike classifiers that may become outdated.
Easy Customization: They can be easily tailored for specific tasks using prompts, offering more flexibility than classifiers that might need complete retraining for customization.
While in-context learning is versatile and efficient, its performance depends on the quality of the provided examples, and its outputs may be less predictable than those of a specialized classifier. The sketch below contrasts the two approaches.
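For contrast, here is a rough sketch of the traditional supervised route using scikit-learn. The tiny training set below stands in for what would normally be hundreds or thousands of labeled rows, which is exactly the overhead the points above avoid.

```python
# Contrast sketch: a conventional text classifier built with scikit-learn.
# It needs a labeled dataset and an explicit training step before it can
# classify anything, and with this little data its predictions are unreliable.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# In a real project this would be a large, curated, task-specific dataset.
train_texts = [
    "I am interested in learning more",
    "Sure, tell me more",
    "I am not interested",
    "No thanks",
]
train_labels = ["Interested", "Interested", "Not_Interested", "Not_Interested"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)  # task-specific training step

# Unseen wording that shares no vocabulary with the training set is likely to
# be misclassified, which is where the data appetite of this approach shows.
print(model.predict(["dont waste my time"]))
```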
Terminology
Classification | As exemplified above, by providing in-context learning examples for two or more classes, we can easily get the GenAI system to behave like a typical machine learning classifier. A classifier takes an input and assigns it one of the available categories. In the example above, it assigns either Interested or Not_Interested. |
In-context learning | The newer of the two terms compared to few-shot learning. Both refer to the technique of providing a large language model (LLM) with a small number of examples of the desired behavior, from which the model is able to generalize to new examples. |
Few-shot learning | The older of the two terms. See the in-context learning entry. |
In-context learning and few-shot learning refer to essentially the same technique: providing a large language model (LLM) with a small number of examples of the desired behavior so that it can generalize to new examples. In-context learning is the more recent term, used to emphasize that the examples are provided within the context of the prompt, in contrast to traditional few-shot learning, where the examples are provided separately from the prompt.
Here is a table that summarizes the key similarities and differences between in-context learning and few-shot learning:
Feature | In-context learning | Few-shot learning |
Examples provided | In the context of the prompt | Separately from the prompt |
Model weights updated | No | No |
Flexibility | High | High |
Computational expense | Low | Low |
In general, the terms in-context learning and few-shot learning can be used interchangeably. Both terms refer to the same powerful technique for adapting LLMs to a wide variety of tasks.
Summary
In-context learning stands out for its cost-effectiveness and efficiency, eliminating the need to train your own custom models. Impressive performance can be achieved using LLMs with minimal data collection.
We presented a practical example of in-context learning applied to classifying customer responses as ‘Interested’ or ‘Not_Interested’. The example showcases the model’s ability to accurately classify new inputs based on a few in-context examples. A simple solution like this can be used, for example, to analyze call transcripts from outbound sales calls.
The article also clarifies the relationship between in-context learning and few-shot learning, emphasizing their similarities and subtle differences. While both techniques rely on a small number of examples to guide LLMs, in-context learning specifically integrates these examples within the prompt, enhancing relevance and contextual understanding.
Terms to remember
Advanced Prompting Techniques, In-context Learning, Few-shot Learning, Input-Output Pairs, Test Input, Token Prediction, Conditioning, Classification, Model Weights
Introduction to Generative AI Series
Part 1 – The Magic of Generative AI: Explained for Everyday Innovators
Part 2 – Your How-To Guide for Effective Writing with Language Models
Part 3 – Precision Crafted: Mastering the Art of Prompt Engineering for Pinpoint Results
Part 4 (this article) – Precision Crafted Prompts: Mastering Incontext Learning
Part 5 – Zero-Shot Learning: Bye-bye to Custom ML Models?