Context Window
Also known as: Context Length, Context Size, Token Limit
The maximum amount of text an LLM can process at once: the model's working memory, which limits how much it can 'see' in a conversation.
A context window is the maximum amount of text, measured in tokens, that a language model can consider at once when generating a response. It covers both the input (prompt, conversation history, attached documents) and the output the model generates.
Size Matters
| Model | Context Window |
|---|---|
| GPT-3 (2020) | 2K tokens |
| GPT-4 (2023) | 8K-128K tokens |
| Claude 3 (2024) | 200K tokens |
| Gemini 1.5 (2024) | 1M+ tokens |
Implications
Larger windows enable:
- Analyzing entire codebases
- Processing long documents
- Maintaining longer conversations
- Including more context for better answers
Limitations:
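- Answer quality may degrade as the context grows longer
- Compute and memory costs grow with length (quadratically for standard self-attention)
- “Lost in the middle” phenomenon: models tend to use information at the start and end of a long context more reliably than information in the middle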
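As a rough illustration of the compute limitation, here is a minimal sketch assuming standard full self-attention, where every token attends to every other token. Under that assumption the number of pairwise attention scores grows with the square of the context length. This is a simplification: many production models use optimized or sparse attention variants, and the function name `attention_pairs` is illustrative only.

```python
def attention_pairs(context_tokens: int) -> int:
    """Count token-to-token attention scores (per head, per layer),
    assuming full, unmasked self-attention."""
    return context_tokens * context_tokens


# Doubling the context length quadruples the number of attention scores.
for n in (4_000, 8_000, 128_000):
    print(f"{n:>7} tokens -> {attention_pairs(n):,} attention scores")
```

The absolute numbers matter less than the ratio: going from 4K to 8K tokens multiplies this term by four, not two.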
Tokens vs. Words
Roughly: 1 token ≈ 0.75 words (English). A 100K-token context window holds ~75,000 words, about the length of a novel. The exact ratio depends on the tokenizer and the language.
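The rule of thumb above can be turned into a quick estimate. The following is a minimal sketch that assumes the 0.75 words-per-token ratio from this entry; the function names are illustrative, and real token counts depend on the specific tokenizer and text.

```python
WORDS_PER_TOKEN = 0.75  # rule-of-thumb ratio for English text; an approximation


def estimate_words(tokens: int) -> int:
    """Approximate how many English words fit in a given number of tokens."""
    return round(tokens * WORDS_PER_TOKEN)


def estimate_tokens(text: str) -> int:
    """Approximate a text's token count from its whitespace-separated word count."""
    words = len(text.split())
    return round(words / WORDS_PER_TOKEN)


print(estimate_words(100_000))  # 75000
```

For budgeting against a real model's limit, counting with that model's own tokenizer is more reliable than this estimate.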
Strategies
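When the input exceeds the context window:
- Summarization: condense earlier material so it takes up fewer tokens
- Retrieval-Augmented Generation (RAG): retrieve only the passages relevant to the current query
- Chunking: split the input and process it in parts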
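The chunking strategy can be sketched as follows. This is a minimal illustration, not a definitive implementation: it splits on whitespace-separated words as a stand-in for tokens, and the function name `chunk_words` and its `overlap` parameter are assumptions made for this example. Overlapping chunks are one common way to avoid cutting context at a chunk boundary.

```python
def chunk_words(text: str, max_words: int, overlap: int = 0) -> list[str]:
    """Split text into chunks of at most max_words words.

    Consecutive chunks share `overlap` words. Words are used here as a
    rough proxy for tokens.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be in the range [0, max_words)")

    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once a chunk has reached the end of the text.
        if start + max_words >= len(words):
            break
    return chunks


print(chunk_words("a b c d e", max_words=3, overlap=1))  # ['a b c', 'c d e']
```

Each chunk can then be processed separately, for example summarized on its own, with the partial results combined in a final pass.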