P108

2.5 Storyboard

P113

ID	Year	Name	Note	Tags	Link
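84	2024	Learning Long-form Video Prior via Generative Pre-Training	Uses GPT to generate structured information about long-form video content, which supports downstream video generation and understanding tasks.	structured information, dataset	dataset link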
61	2023	Xie et al., “VisorGPT: Learning Visual Prior via Generative Pre-Training”	Learns a visual prior (e.g., object locations and shapes) by discretizing annotations such as bounding boxes into sequences and training a GPT-style model on them		link
	2023	Lin et al., “VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning”	Uses a storyboard as the condition for video generation. ✅ ControlNet: the text is converted into a pixel image.


	Dysen-VDM (Fei et al.): storyboard through scene graphs. “Empowering Dynamics-aware Text-to-Video Diffusion with Large Language Models,” arXiv 2023.
	DirecT2V (Hong et al.): storyboard through bounding boxes. “Large Language Models are Frame-level Directors for Zero-shot Text-to-Video Generation,” arXiv 2023.
	Free-Bloom (Huang et al.): storyboard through detailed text prompts. “Free-Bloom: Zero-Shot Text-to-Video Generator with LLM Director and LDM Animator,” NeurIPS 2023.
	LLM-Grounded Video Diffusion Models (Lian et al.): storyboard through foreground bounding boxes. “LLM-grounded Video Diffusion Models,” arXiv 2023.
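
To make the "storyboard through bounding boxes" idea concrete, here is a minimal sketch of how a sparse layout plan could be expanded into one box per frame. It is not taken from any of the papers listed above: the names `Box`, `lerp_box`, and `expand_keyframes` are hypothetical, and linear interpolation between keyframes is an assumption chosen for illustration. In a real system the keyframe boxes would come from an LLM planner and the per-frame boxes would condition a video diffusion model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in normalized [0, 1] image coordinates (hypothetical type)."""
    x0: float
    y0: float
    x1: float
    y1: float


def lerp_box(a: Box, b: Box, t: float) -> Box:
    """Linearly blend two boxes; t=0 returns a, t=1 returns b."""
    def mix(p: float, q: float) -> float:
        return (1.0 - t) * p + t * q
    return Box(mix(a.x0, b.x0), mix(a.y0, b.y0), mix(a.x1, b.x1), mix(a.y1, b.y1))


def expand_keyframes(keyframes: dict[int, Box], num_frames: int) -> list[Box]:
    """Turn sparse keyframe boxes into a dense per-frame layout.

    Frames before the first keyframe or after the last one hold the
    nearest keyframe box; frames in between are linearly interpolated.
    """
    if not keyframes:
        raise ValueError("need at least one keyframe")
    idxs = sorted(keyframes)
    boxes = []
    for f in range(num_frames):
        # Nearest keyframe at or before f, and at or after f.
        prev = max((i for i in idxs if i <= f), default=idxs[0])
        nxt = min((i for i in idxs if i >= f), default=idxs[-1])
        if prev >= nxt:
            # On a keyframe, or outside the keyframe range: hold the box.
            boxes.append(keyframes[nxt if f <= idxs[0] else prev])
        else:
            t = (f - prev) / (nxt - prev)
            boxes.append(lerp_box(keyframes[prev], keyframes[nxt], t))
    return boxes


# Assumed planner output: an object moves from the top-left toward the center.
plan = {0: Box(0.0, 0.0, 0.2, 0.2), 4: Box(0.4, 0.4, 0.6, 0.6)}
layout = expand_keyframes(plan, num_frames=5)
```

The scene-graph and detailed-text-prompt variants would replace `Box` with a different per-frame structure, but the same "sparse plan, dense per-frame condition" shape applies.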

P104

✅ Generate movie-level videos rather than clips lasting only a few seconds.

P106

✅ Text → structured intermediate script → video
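
The text → script → video flow can be sketched as follows. This is only an illustration under stated assumptions: the JSON schema (`scenes`, `description`, `seconds`), the names `Scene`, `parse_script`, and `frame_ranges`, and the default of 8 frames per second are all hypothetical, and `EXAMPLE_SCRIPT` stands in for what an LLM planner might return. The video generator itself is left out; the sketch stops at the point where each scene has a prompt and a frame range to be rendered.

```python
import json
from dataclasses import dataclass


@dataclass
class Scene:
    """One entry of the intermediate script (hypothetical schema)."""
    description: str
    seconds: float


def parse_script(raw: str) -> list[Scene]:
    """Parse the planner's JSON output into a list of scenes."""
    data = json.loads(raw)
    scenes = [Scene(s["description"], float(s["seconds"])) for s in data["scenes"]]
    if not scenes:
        raise ValueError("script contains no scenes")
    return scenes


def frame_ranges(scenes: list[Scene], fps: int = 8) -> list[tuple[int, int]]:
    """Assign each scene a half-open [start, end) frame range, back to back."""
    ranges = []
    start = 0
    for scene in scenes:
        # Every scene gets at least one frame, even if its duration rounds to zero.
        n = max(1, round(scene.seconds * fps))
        ranges.append((start, start + n))
        start += n
    return ranges


# Stand-in for an LLM planner's answer to a prompt such as "a cat's morning".
EXAMPLE_SCRIPT = """
{"scenes": [
  {"description": "a cat wakes up on a sofa", "seconds": 2},
  {"description": "the cat jumps onto a table", "seconds": 1.5}
]}
"""

scenes = parse_script(EXAMPLE_SCRIPT)
for scene, (start, end) in zip(scenes, frame_ranges(scenes)):
    # A real pipeline would call a scene-conditioned video generator here.
    print(f"frames {start}-{end - 1}: {scene.description}")
```

Keeping the script as explicit, validated data is what lets a multi-scene video be planned once and then rendered scene by scene.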

This article is from CaterpillarStudyGroup. Please credit the source when reposting.

https://caterpillarstudygroup.github.io/ImportantArticles/