3.3 Controlled Edifng (depth/pose/point/ControlNet)

✅ 已有一段视频,通过 guidance 或文本描述,修改视频。

P189

P190

Depth Control

Depth estimating network

IDYearNameNoteTagsLink
2022Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer✅ 深变信息 Encode 成 latent code, 与 noise concat 到一起。
1222023Structure and Content-Guided Video Synthesis with Diffusion ModelsTransfer the style of a video using text prompts given a “driving video”,以多种形式在预训练图像扩散模型中融入时序混合层进行扩展Gen-1, Framewise, depth-guided
1232023Pix2Video: Video Editing using Image DiffusionFramewise depth-guided video editing

P199

ControlNet / Multiple Control

也是control net 形式,但用到更多控制条件。

IDYearNameNoteTagsLink
1242023ControlVideo: Training-free Controllable Text-to-Video Generation
2023VideoControlNet: A Motion-Guided Video-to-Video Translation Framework by Using Diffusion Model with ControlNetOptical flow-guided video editing; I, P, B frames in video compression
✅ 内容一致性,适用于 style transfer, 但需要对物体有较大编辑力度时不适用(例如编辑物体形状)。
2023CCEdit: Creative and Controllable Video Editing via Diffusion Models
2023VideoComposer: Compositional Video Synthesis with Motion ControllabilityImage-, sketch-, motion-, depth-, mask-controlled video editing
✅ 每个 condition 进来,都过一个 STC-Encoder, 然后把不同 condition fuse 到一起,输入到 U-Net.
Spako-Temporal Condikon encoder (STC-encoder): a unified input interface for condikons

2023Control-A-Video: Controllable Text-to-Video Generagon with Diffusion Models通过边缘图或深度图等序列化控制信号生成视频,并提出两种运动自适应噪声初始化策略
2024Vmc: Video motion customization using temporal attention adaption for text-to-video diffusion models.轨迹控制
2023MagicProp: Diffusion-based Video Editing via Motion-aware Appearance Propagation
2023Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance
2023MagicEdit: High-Fidelity and Temporally Coherent Video Editing
2023EVE: Efficient zero-shot text-based Video Editing with Depth Map Guidance and Temporal Consistency Constraints

P225

Point-Control

IDYearNameNoteTagsLink
982023VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence

P226


本文出自CaterpillarStudyGroup,转载请注明出处。

https://caterpillarstudygroup.github.io/ImportantArticles/