A deep learning framework for character motion synthesis and editing

What is the core problem?

This paper pioneered deep-learning-based motion generation.

Method overview

Step 1: Building the Motion Manifold

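Input	Output	Method
Motion data, T*D1	Latent data, T*D2 (256)	1D convolution (along the time axis) based encoder
Latent data, T*D2 (256)	Reconstructed motion data, T*D1	1D convolution (along the time axis) based decoder
Motion data, reconstructed motion data, network parameters	Loss	L2 loss + L1 regularization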

Key innovation: max pooling. The experiments show that max pooling noticeably improves the results.
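The two ingredients named above, pooling along the time axis and an L2 reconstruction loss with L1 regularization, can be sketched in plain Python. This is a minimal illustration rather than the paper's implementation: the pooling width of 2, the weight `alpha`, the choice to average the squared error, and the function names are all assumptions made for the example.

```python
def max_pool_time(frames, width=2):
    """Max-pool a T x D sequence along the time axis only."""
    pooled = []
    for start in range(0, len(frames) - width + 1, width):
        window = frames[start:start + width]
        # Take the per-channel maximum over the frames in this window.
        pooled.append([max(channel) for channel in zip(*window)])
    return pooled


def autoencoder_loss(motion, reconstruction, weights, alpha=0.1):
    """Mean squared reconstruction error plus an L1 penalty on the weights.

    alpha is a hypothetical regularization strength chosen for illustration.
    """
    squared_error = sum(
        (x - y) ** 2
        for frame, recon in zip(motion, reconstruction)
        for x, y in zip(frame, recon)
    )
    n_values = len(motion) * len(motion[0])
    l1_penalty = sum(abs(w) for w in weights)
    return squared_error / n_values + alpha * l1_penalty


latent = [[0.1, 0.9], [0.4, 0.2], [0.7, 0.3], [0.5, 0.8]]  # T=4, D=2
pooled = max_pool_time(latent)  # two frames remain, one per window
```

Pooling only over time keeps the channel dimension intact while halving the number of frames, so the decoder has to undo the pooling when reconstructing the motion.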

---
title: Building the Motion Manifold
---
flowchart LR
    Input[("Motion data")]
    Encoder
    Latent(["Latent Code"])
    Decoder
    Output(["Output"])
    Loss(["Loss"])

    Input-->Encoder-->Latent-->Decoder-->Output
    Input & Output --> Loss

    
    
    Loss e1@-->Encoder 
    Loss e2@-->Decoder

    e1@{ animation: fast }
    e2@{ animation: fast }

Step 2: Mapping High Level Parameters to Human Motions

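Input	Output	Method
High-level control parameters (e.g. trajectory), frequency	Foot-contact information, T * 4	Sine function
High-level control parameters (e.g. trajectory), foot-contact information, T * 4	Latent data, T*D2 (256)	Convolution-based network
Latent data, T*D2 (256)	Reconstructed motion data, T*D1	1D convolution (along the time axis) based decoder, fixed
Reconstructed motion data, GT	Loss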
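The first row of the table turns a frequency into foot-contact labels through a sine function. One plausible reading, sketched below, is to threshold the sine wave into a 0/1 square wave per contact channel. The `phase` and `threshold` parameters, the function names, and the way the four channels are given separate phases are assumptions for illustration and are not taken from the notes.

```python
import math


def foot_contact_square_wave(num_frames, frequency, phase, threshold=0.0):
    """Return one 0/1 contact label per frame by thresholding a sine wave."""
    labels = []
    for t in range(num_frames):
        value = math.sin(frequency * t + phase)
        # The foot counts as "in contact" while the wave is above the threshold.
        labels.append(1 if value > threshold else 0)
    return labels


def contact_signals(num_frames, frequency, phases, threshold=0.0):
    """Stack one square wave per contact channel into a T x len(phases) list."""
    channels = [
        foot_contact_square_wave(num_frames, frequency, p, threshold)
        for p in phases
    ]
    return [list(frame) for frame in zip(*channels)]


# Four hypothetical channels: two in phase with each other, two half a cycle later.
phases = [math.pi / 4, math.pi / 4, math.pi / 4 + math.pi, math.pi / 4 + math.pi]
contacts = contact_signals(8, math.pi / 2, phases)  # shape T x 4
```

Raising `threshold` shortens the share of each cycle spent in contact, which is one simple way to vary step duration without changing the frequency.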

---
title: Mapping High Level Parameters to Human Motions
---
flowchart LR
    Input[("Control input")]
    NN["input curve T -> parameters"]
    Wave(["Wave Parameters"])
    SquareWaves["square waves"]
    F(["Foot contact"])
    FF["Feedforward Network"]
    Latent(["Latent Code"])
    Decoder
    Output(["Output"])
    GT[("GT")]
    Loss(["Loss"])

    Input-->NN-->Wave-->SquareWaves-->F
    Input & F --> FF --> Latent-->Decoder-->Output
    Output & GT --> Loss

    Loss e1@-->FF 
    e1@{ animation: fast }

Applications

Applying Constraints in Hidden Unit Space

---
title: Applying Constraints in Hidden Unit Space
---
flowchart LR
    Init(["Initial Latent Motion"])
    Decoder
    Output(["Output"])
    Constrain[("Constraints")]
    Loss(["Loss"])

    Init-->Decoder-->Output
    Output & Constrain --> Loss

    Loss e1@-->Init 
    e1@{ animation: fast }
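The flowchart above sends the loss back to the initial latent motion rather than to the decoder, meaning the decoder stays frozen and only the latent code is updated. A minimal sketch of that loop follows, with a toy linear decoder standing in for the real convolutional one. The decoder weights, the target, the learning rate, and the step count are all hypothetical values chosen so the example runs.

```python
def decode(latent, weights):
    """Toy fixed linear decoder: output[i] = sum_j weights[i][j] * latent[j]."""
    return [sum(w * h for w, h in zip(row, latent)) for row in weights]


def constraint_loss(output, target):
    """Squared distance between the decoded output and the constraint target."""
    return sum((o - t) ** 2 for o, t in zip(output, target))


def optimize_latent(latent, weights, target, learning_rate=0.1, steps=200):
    """Gradient descent on the latent code only; decoder weights never change."""
    latent = list(latent)
    for _ in range(steps):
        residual = [o - t for o, t in zip(decode(latent, weights), target)]
        # d(loss)/d(latent[j]) = 2 * sum_i weights[i][j] * residual[i]
        grad = [
            2.0 * sum(weights[i][j] * residual[i] for i in range(len(weights)))
            for j in range(len(latent))
        ]
        latent = [h - learning_rate * g for h, g in zip(latent, grad)]
    return latent


weights = [[1.0, 0.5], [0.0, 1.0]]  # hypothetical frozen decoder
target = [1.0, 2.0]                 # hypothetical constraint on the output
initial = [0.0, 0.0]
edited = optimize_latent(initial, weights, target)
```

Because the decoder is never modified, the edited result is always something the decoder can produce, which is what keeps the constrained motion on the learned manifold.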

Motion Stylization in Hidden Unit Space

---
title: Motion Stylization in Hidden Unit Space
---
flowchart LR
    Init(["Initial Latent Motion"])
    Style[("Style condition")]
    Content[("Content condition")]
    LatentContent(["Latent content condition"])
    LatentStyle(["Latent style condition"])
    GMStyle(["Gram matrix of the input style"])
    GMMotion(["Gram matrix of the current motion's style"])
    LossContent(["Content loss"])
    LossStyle(["Style loss"])
    Loss(["Loss"])

    Encoder1["Encoder"]
    Encoder2["Encoder"]
    GM1["Compute Gram matrix"]
    GM2["Compute Gram matrix"]

    Content --> Encoder1 --> LatentContent
    Style --> Encoder2 --> LatentStyle
    Init & LatentContent --> LossContent 
    Init --> GM1 --> GMMotion--> LossStyle 
    LatentStyle --> GM2 --> GMStyle --> LossStyle 
    LossContent & LossStyle --> Loss

    Loss e1@-->Init 
    e1@{ animation: fast }
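The style branch of the flowchart compares Gram matrices instead of latent codes directly. A minimal sketch of the two loss terms is given below, assuming the Gram matrix is the time-averaged outer product of the latent vectors and that both terms are plain squared differences. The weights `content_weight` and `style_weight` and all function names are hypothetical.

```python
def gram_matrix(latent):
    """Average outer product of the latent vectors over time (D x D)."""
    num_frames = len(latent)
    dim = len(latent[0])
    return [
        [sum(frame[i] * frame[j] for frame in latent) / num_frames
         for j in range(dim)]
        for i in range(dim)
    ]


def content_loss(latent, content_latent):
    """Frame-by-frame squared difference between two latent sequences."""
    return sum(
        (a - b) ** 2
        for frame, ref in zip(latent, content_latent)
        for a, b in zip(frame, ref)
    )


def style_loss(latent, style_latent):
    """Squared difference between the Gram matrices of two latent sequences."""
    g_motion = gram_matrix(latent)
    g_style = gram_matrix(style_latent)
    return sum(
        (a - b) ** 2
        for row_m, row_s in zip(g_motion, g_style)
        for a, b in zip(row_m, row_s)
    )


def stylization_loss(latent, content_latent, style_latent,
                     content_weight=1.0, style_weight=1.0):
    """Weighted sum of the content and style terms (weights are illustrative)."""
    return (content_weight * content_loss(latent, content_latent)
            + style_weight * style_loss(latent, style_latent))


example = [[1.0, 0.0], [0.0, 2.0]]
example_gram = gram_matrix(example)
```

Summing over time discards frame order, so the style term only sees which latent channels tend to be active together, while the content term still pins down the timing.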

Experiments

Datasets

CMU
Self-captured data + retargeting

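Data preprocessing pipeline

Temporal normalization: downsample everything to 60 FPS so that timing is consistent
Spatial representation conversion: convert the joint-angle representation to 3D joint positions (local coordinate system)
Coordinate frame construction: take the ground projection of the root joint as the origin and compute the forward direction (Z axis) from the shoulder and hip vectors
Kinematic feature augmentation: add global velocity (in the XZ plane), rotational velocity (around the Y axis), and foot-contact labels
Data normalization: subtract the mean and divide by the standard deviation (pose, velocity, and contact labels are handled separately)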
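Two of the steps above, estimating the forward direction and standardizing a feature channel, can be sketched as follows. The sketch assumes a right-handed, Y-up coordinate system and assumes the sideways axis is built by adding the shoulder and hip left-minus-right vectors. The joint arguments and function names are hypothetical.

```python
import math


def cross(a, b):
    """Cross product of two 3D vectors."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def forward_direction(left_shoulder, right_shoulder, left_hip, right_hip):
    """Unit facing direction on the ground plane, assuming Y is up."""
    # Sideways axis: average the shoulder and hip left-minus-right vectors.
    across = [
        (ls - rs) + (lh - rh)
        for ls, rs, lh, rh in zip(left_shoulder, right_shoulder, left_hip, right_hip)
    ]
    # Crossing with the up axis gives a horizontal vector facing forward.
    forward = cross(across, [0.0, 1.0, 0.0])
    length = math.sqrt(sum(c * c for c in forward))
    return [c / length for c in forward]


def standardize(values):
    """Subtract the mean and divide by the standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0.0:
        std = 1.0  # constant channel: avoid division by zero
    return [(v - mean) / std for v in values]


facing = forward_direction(
    [1.0, 1.5, 0.0], [-1.0, 1.5, 0.0],  # shoulders
    [0.5, 1.0, 0.0], [-0.5, 1.0, 0.0],  # hips
)
```

With the left side of the body on +X, the result points along +Z, matching the convention in the notes that Z is the forward axis.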

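Summary

Core value: the first AI-based work on 3D skeletal motion generation.

Cost analysis: requires a large amount of data for a specific character.

Deployment bottleneck: a large amount of data is needed for one specific character, the generated motion can only be used on that character, and nothing generalizes across characters.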
