The size of the in-context window, measured in tokens, varies depending on the LLM, and the window spans the input prompt and the generated response combined. For example, GPT-4 was released in two variants: a base model with an 8,192-token context window and a GPT-4-32K variant with a 32,768-token window. The 32K variant can therefore consider up to 32,768 tokens of context when generating a response, four times what the base model can hold.
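To check whether a prompt will fit within a model's window, you can count its tokens before sending it. Below is a minimal sketch using the tiktoken tokenizer; the 8,192-token limit and the response reservation are assumptions chosen to match the base GPT-4 figure above, not values read from any API.

```python
import tiktoken

CONTEXT_WINDOW = 8192  # assumed limit, matching the base GPT-4 model above

def fits_in_window(prompt: str, reserved_for_response: int = 1024) -> bool:
    """Check whether a prompt leaves room for a response within the window."""
    enc = tiktoken.encoding_for_model("gpt-4")
    prompt_tokens = len(enc.encode(prompt))
    # The window covers prompt and response together, so reserve
    # some tokens for the model's output.
    return prompt_tokens + reserved_for_response <= CONTEXT_WINDOW

print(fits_in_window("Summarize the following article: ..."))
```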
A larger in-context window lets an LLM ground its output in more material at once, producing more informative and complete responses. For example, when asked to summarize a long article, a model with a large window can hold the entire article in context and produce a summary that reflects all of its main points, rather than only the portion that fits. When the article exceeds the window, it must be processed in pieces, as sketched below.
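A common workaround for text that exceeds the window is to split it into chunks that each fit, summarize each chunk, and then summarize the partial summaries. The sketch below assumes the tiktoken tokenizer again; the chunk size is illustrative, and `summarize` is a hypothetical placeholder for whatever function sends a prompt to your LLM and returns its text response.

```python
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4")

def chunk_by_tokens(text: str, max_tokens: int = 6000) -> list[str]:
    """Split text into pieces that each fit within max_tokens."""
    tokens = enc.encode(text)
    return [
        enc.decode(tokens[i : i + max_tokens])
        for i in range(0, len(tokens), max_tokens)
    ]

def summarize_long_article(article: str, summarize) -> str:
    """Summarize each chunk, then combine the partial summaries.

    `summarize` is a placeholder callable that sends one prompt to an
    LLM and returns its text response.
    """
    partials = [summarize(f"Summarize:\n\n{c}") for c in chunk_by_tokens(article)]
    return summarize("Combine these partial summaries:\n\n" + "\n\n".join(partials))
```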
See Also: Context