AI & Generative Media

Context Window

Also known as: Context Length, Context Size, Token Limit

The maximum amount of text an LLM can process at once—the model's working memory that limits how much it can 'see' in a conversation.

The context window is the maximum amount of text, measured in tokens, that a language model can consider at once when generating a response. For most models this budget covers both the prompt and the generated output.

Size Matters

Model             Context Window
---------------   --------------
GPT-3 (2020)      4K tokens
GPT-4 (2023)      8K–128K tokens
Claude 3 (2024)   200K tokens
Gemini 1.5        1M+ tokens

Implications

Larger windows enable:

  • Analyzing entire codebases
  • Processing long documents
  • Maintaining longer conversations
  • Including more context for better answers

Limitations:

  • Attention quality may degrade over very long contexts
  • Compute costs grow with length, since self-attention scales quadratically in the number of tokens (see the sketch after this list)
  • “Lost in the middle”: models tend to recall information at the start and end of the context better than information buried in the middle
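The compute point is easy to check numerically: self-attention compares every token with every other token, so the score matrix has n² entries, and doubling the context roughly quadruples the attention compute. A minimal back-of-the-envelope sketch in Python (the token counts are illustrative, not tied to any particular model):

```python
# Back-of-the-envelope: self-attention builds an n x n score matrix,
# so attention compute grows quadratically in context length n.
for n in (4_096, 8_192, 16_384, 131_072):
    print(f"{n:>8} tokens -> {n * n:>22,} score entries per layer per head")
```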

Tokens vs. Words

Roughly: 1 token ≈ 0.75 words in English. A 100K-token context window therefore holds about 75,000 words, roughly the length of a short novel.
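To count tokens exactly rather than estimate, you need a tokenizer. A minimal sketch, assuming the tiktoken library (OpenAI's open-source tokenizer); other model families use different tokenizers, so counts will differ:

```python
import tiktoken

# cl100k_base is the encoding used by GPT-4-era OpenAI models.
enc = tiktoken.get_encoding("cl100k_base")

text = "The context window limits how much a model can see at once."
tokens = enc.encode(text)

print(len(text.split()), "words ->", len(tokens), "tokens")
# The ~0.75 words-per-token rule predicts 12 words ≈ 16 tokens;
# the exact count depends on the tokenizer's vocabulary.
```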

Strategies

When context exceeds the window:

  • Summarization: compress earlier turns or documents before re-inserting them
  • Retrieval-Augmented Generation (RAG): fetch only the most relevant passages at query time
  • Chunking: split the input and process it in parts (see the sketch after this list)
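As a concrete illustration of the chunking strategy, here is a minimal sketch. It is a toy under stated assumptions: it approximates token counts with the ~0.75 words-per-token rule of thumb above (a real pipeline would count actual tokens) and overlaps adjacent chunks so content isn't cut mid-thought at a boundary:

```python
def chunk_words(text: str, max_tokens: int = 4_096, overlap_tokens: int = 200) -> list[str]:
    """Split text into word-based chunks that fit a given token budget."""
    # Approximate tokens via the ~0.75 words-per-token rule of thumb.
    max_words = int(max_tokens * 0.75)
    overlap_words = int(overlap_tokens * 0.75)
    words = text.split()
    chunks, start = [], 0
    while start < len(words):
        end = min(start + max_words, len(words))
        chunks.append(" ".join(words[start:end]))
        if end == len(words):
            break
        start = end - overlap_words  # overlap preserves context across chunks
    return chunks
```

Each chunk can then be summarized or embedded separately, with the per-chunk results merged in a final pass.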