AI & Generative Media

Text-to-Video

Also known as: Video Generation, AI Video, T2V

AI systems that generate video content from written descriptions, extending generative AI from static images to motion.

Current Systems

  • Sora (OpenAI): High-fidelity, longer clips
  • Runway Gen-3: Creative and commercial use
  • Pika: Consumer-focused generation
  • Kling (Kuaishou): Model from the Chinese short-video company Kuaishou
  • Veo (Google): Google DeepMind model, first shown as a research preview

Capabilities

  • Generate scenes from descriptions
  • Animate still images
  • Extend existing clips
  • Style transfer on video
  • Character consistency across shots

Limitations

  • Physical and anatomical inconsistencies (implausible gravity, malformed hands)
  • Weak temporal coherence (objects appearing/disappearing between frames)
  • Length constraints (clips typically last seconds, not minutes)
  • Compute-intensive generation
  • Limited control precision (hard to specify exact composition, motion, or timing from a prompt)
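
The length and compute limitations are related, and a rough back-of-the-envelope sketch can show why. The snippet below only counts frames and raw RGB values for a clip; the frame rate (24 fps) and resolution (1280x720) are illustrative assumptions, not figures from any of the systems listed above, and `clip_size` is a hypothetical helper. Real systems often work on a compressed representation rather than raw pixels, so this shows how output size scales with duration, not actual generation cost.

```python
def clip_size(duration_s: float, fps: int = 24,
              width: int = 1280, height: int = 720) -> dict:
    """Count frames and raw RGB values for a clip.

    fps, width, and height are assumed defaults chosen for illustration.
    """
    frames = round(duration_s * fps)
    values_per_frame = width * height * 3  # three color channels per pixel
    return {
        "frames": frames,
        "values_per_frame": values_per_frame,
        "total_values": frames * values_per_frame,
    }


if __name__ == "__main__":
    short_clip = clip_size(5)    # a few seconds
    long_clip = clip_size(60)    # one minute
    print(short_clip["frames"], long_clip["frames"])
    print(long_clip["total_values"] / short_clip["total_values"])
```

Under these assumptions the amount of output grows linearly with duration, and every added frame must also stay consistent with all the frames before it, which is one reason longer clips are harder to generate.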

Implications

  • Democratizes video production
  • Threatens stock footage industry
  • Raises deepfake concerns
  • Transforms creative workflows
  • Leaves the copyright status of outputs unsettled
