
Autoregressive models

Also known as decoder-only models. See entry: Decoder-only models.

Autoregressive models are predictive models used in statistics and machine learning: they predict future values in a sequence based on past values. They are called “autoregressive” because they regress (model) the variable of interest on its own prior values. In machine learning, and particularly in natural language processing, an autoregressive model predicts the next element of a sequence, such as the next word in a sentence (technically, the next token), based on the elements that came before it.
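The statistical idea of regressing a value on its own prior values can be sketched in a few lines. This is a toy AR(2)-style prediction with made-up coefficients, purely for illustration; `ar_predict` and the weights are hypothetical, not part of any library.

```python
# A minimal sketch of an autoregressive prediction: the next value is a
# weighted sum of the most recent values. Coefficients are invented for
# illustration, not fitted to data.

def ar_predict(history, coefficients):
    """Predict the next value from the last len(coefficients) values."""
    p = len(coefficients)
    recent = history[-p:]
    # Regress the next value on its own prior values,
    # pairing the first coefficient with the most recent value.
    return sum(c * x for c, x in zip(coefficients, reversed(recent)))

series = [1.0, 1.2, 1.5, 1.9]
coeffs = [0.6, 0.3]  # hypothetical weights on the last two values
next_value = ar_predict(series, coeffs)  # 0.6*1.9 + 0.3*1.5 = 1.59
```

In practice the coefficients would be estimated from historical data (for example by least squares), but the prediction step itself is exactly this weighted lookback.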

GPT (Generative Pre-trained Transformer) is an example of an autoregressive model. It generates text by predicting the next word in a sequence based on the words that precede it. Each new word is predicted from all previously generated words, which is what makes the model autoregressive. This method allows GPT to generate coherent and contextually relevant text sequences.

A concrete example illustrating the autoregressive nature of GPT can be seen in text completion tasks. Suppose you give GPT the beginning of a sentence: “The quick brown fox”. GPT will predict the next word in the sequence (e.g., “jumps”) based on the given words. It will continue generating words one after another (“over the lazy dog”), each time considering the entire sequence of words generated so far to predict the next word. This process is inherently autoregressive, as each prediction depends on the previous sequence of words.
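The generation loop described above can be sketched as follows. Here `predict_next_word` is a toy stand-in for a real language model (which would score all tokens in its vocabulary); only the loop structure, where each prediction is fed back in as input, is the point.

```python
# A minimal sketch of autoregressive generation. `predict_next_word` is a
# hypothetical stand-in for a trained model's next-token prediction.

def predict_next_word(context):
    # Toy lookup keyed on the last word, standing in for a real model.
    continuations = {
        "fox": "jumps", "jumps": "over", "over": "the",
        "the": "lazy", "lazy": "dog",
    }
    return continuations.get(context[-1], "<end>")

def generate(prompt, max_words=5):
    words = prompt.split()
    for _ in range(max_words):
        # Each prediction conditions on the entire sequence so far,
        # and the predicted word is appended before the next step.
        next_word = predict_next_word(words)
        if next_word == "<end>":
            break
        words.append(next_word)
    return " ".join(words)

print(generate("The quick brown fox"))
# → The quick brown fox jumps over the lazy dog
```

A real model replaces the lookup table with a neural network, but the feedback loop, where output becomes input, is identical.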

A natural question that arises is, “How can an autoregressive model like GPT handle question-answering tasks, typically associated with encoder-decoder models?” The answer lies in the adaptability of GPT models. While they are general-purpose language models, they can undergo fine-tuning for specific tasks, including question-answering. This fine-tuning process involves additional training on a targeted dataset, such as a question-answer collection, significantly improving the model’s capability to produce precise and context-relevant answers.
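One common way to prepare a question-answer collection for such fine-tuning is to format each pair as a single text the model learns to continue. This is a hedged sketch of that data-preparation step only; the `Question:`/`Answer:` template and `to_training_example` helper are illustrative assumptions, not a specific library's API.

```python
# A minimal sketch of formatting Q&A pairs for fine-tuning an
# autoregressive model. The template below is a hypothetical choice;
# actual fine-tuning pipelines define their own formats.

qa_pairs = [
    ("What is an autoregressive model?",
     "A model that predicts the next element from prior elements."),
]

def to_training_example(question, answer):
    # The model is trained to continue the question with the answer,
    # so answering becomes ordinary next-token prediction.
    return f"Question: {question}\nAnswer: {answer}"

examples = [to_training_example(q, a) for q, a in qa_pairs]
```

Framed this way, question-answering requires no architectural change: the decoder-only model simply learns that text beginning `Question: ...` should continue with `Answer: ...`.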

See Also: Decoder-only models, Encoder-Decoder Models
