Context Window
Also known as: Context Length, Context Size, Token Limit
The maximum amount of text an LLM can process at once; in effect, the model's working memory, which limits how much it can 'see' in a conversation.
Context window is the maximum amount of text (measured in tokens) that a language model can consider at once when generating a response. For most models, this budget covers both the input prompt and the generated output.
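Before sending a request, applications often check that a prompt will fit. A minimal sketch of such a pre-flight check in Python, assuming the common rule of thumb of roughly 4 characters per token for English; the window size and output reserve below are illustrative values, not fixed API limits:

```python
# Rough pre-flight check: will this prompt fit in the context window?
# The ~4 characters/token heuristic and both constants are illustrative
# assumptions for English text, not exact or model-specific values.

CONTEXT_WINDOW = 128_000      # hypothetical window size, in tokens
RESERVED_FOR_OUTPUT = 4_000   # leave room for the generated response

def estimate_tokens(text: str) -> int:
    """Crude token estimate: ~4 characters per token for English."""
    return len(text) // 4

def fits_in_window(prompt: str) -> bool:
    """True if the prompt leaves enough budget for the model's reply."""
    return estimate_tokens(prompt) + RESERVED_FOR_OUTPUT <= CONTEXT_WINDOW

print(fits_in_window("Summarize the following document: ..."))  # True
```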
Size Matters
| Model | Context Window |
|---|---|
| GPT-3 (2020) | 2K tokens |
| GPT-4 (2023) | 8K-128K tokens |
| Claude 3 (2024) | 200K tokens |
| Gemini 1.5 | 1M+ tokens |
Implications
Larger windows enable:
- Analyzing entire codebases
- Processing long documents
- Maintaining longer conversations
- Including more context for better answers
Limitations:
- Attention quality may degrade over very long contexts
- Compute costs grow with length (self-attention scales roughly quadratically with sequence length)
- The “lost in the middle” phenomenon: models tend to recall information near the start and end of the context more reliably than information buried in the middle
Tokens vs. Words
Roughly, 1 token ≈ 0.75 words in English, so a 100K-token context window holds about 75,000 words: the length of a short novel.
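For exact counts, use a tokenizer rather than a heuristic. A minimal sketch, assuming the open-source tiktoken library and its cl100k_base encoding (used by several OpenAI models; other model families tokenize differently, so counts will vary):

```python
# Exact token counting with a real tokenizer (pip install tiktoken).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

text = "Context windows are measured in tokens, not words."
tokens = enc.encode(text)

print(len(text.split()), "words")  # 8 words
print(len(tokens), "tokens")       # token count runs a bit higher than words
```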
Strategies
When context exceeds the window:
- Summarization
- Retrieval-Augmented Generation (RAG)
- Chunking and processing in parts (see the sketch below)
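A common chunking pattern splits text on token boundaries with some overlap, so that information near chunk edges is not lost. A minimal sketch, again assuming the tiktoken library; the chunk size and overlap are arbitrary illustrative values:

```python
# Token-based chunking with overlap, so a document larger than the
# context window can be processed in parts (pip install tiktoken).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def chunk_text(text: str, chunk_tokens: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into token-sized chunks that overlap at the edges."""
    tokens = enc.encode(text)
    chunks = []
    step = chunk_tokens - overlap
    for start in range(0, len(tokens), step):
        chunks.append(enc.decode(tokens[start:start + chunk_tokens]))
        if start + chunk_tokens >= len(tokens):
            break
    return chunks

long_text = "word " * 5000  # stand-in for a document longer than one chunk
chunks = chunk_text(long_text)
print(len(chunks), "chunks")  # each chunk can be summarized or embedded for RAG
```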