P108

2.5 Storyboard

P113

IDYearNameNoteTagsLink
842024Learning Long-form Video Prior via Generative Pre-Training利用GPT生成长视频内容的结构化信息,用于帮助下游的视频生成/理解任务。结构化信息,数据集dataset
link
612023Xie et al., “VisorGPT: Learning Visual Prior via Generative Pre-Training,”A “diffusion over diffusion” architecture for very long video generationlink
2023Lin et al., “VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning,”Use storyboard as condition to generate video
✅ Control Net,把文本转为 Pixel 图片。
Dysen-VDM (Fei et al.)
Storyboard through scene graphs
“Empowering Dynamics-aware Text-to-Video Diffusion with Large Language Models,” arXiv 2023.
DirectT2V (Hong et al.)
Storyboard through bounding boxes
“Large Language Models are Frame-level Directors for Zero-shot Text-to-Video Generation,” arXiv 2023.
Free-Bloom (Huang et al.)
Storyboard through detailed text prompts
“Free-Bloom: Zero-Shot Text-to-Video Generator with LLM Director and LDM Animator,” NeurIPS 2023.
LLM-Grounded Video Diffusion Models (Lian et al.)
Storyboard through foreground bounding boxes
“LLM-grounded Video Diffusion Models,” arXiv 2023.

P104

✅ 生成电影级别的视频,而不是几秒钟的视频。

P106

✅ 文本 → 结构化的中间脚本 → 视频


本文出自CaterpillarStudyGroup,转载请注明出处。

https://caterpillarstudygroup.github.io/ImportantArticles/