ReadPapers
1. Introduction
2. FreeGave: 3D Physics Learning from Dynamic Videos by Gaussian Velocity
3. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations
4. ParticleGS: Particle-Based Dynamics Modeling of 3D Gaussians for Prior-free Motion Extrapolation
5. Animate3D: Animating Any 3D Model with Multi-view Video Diffusion
6. Particle-Grid Neural Dynamics for Learning Deformable Object Models from RGB-D Videos
7. HAIF-GS: Hierarchical and Induced Flow-Guided Gaussian Splatting for Dynamic Scene
8. PIG: Physically-based Multi-Material Interaction with 3D Gaussians
9. EnliveningGS: Active Locomotion of 3DGS
10. SplineGS: Learning Smooth Trajectories in Gaussian Splatting for Dynamic Scene Reconstruction
11. PAMD: Plausibility-Aware Motion Diffusion Model for Long Dance Generation
12. PMG: Progressive Motion Generation via Sparse Anchor Postures Curriculum Learning
13. Length-Aware Motion Synthesis via Latent Diffusion
14. IKMo: Image-Keyframed Motion Generation with Trajectory-Pose Conditioned Motion Diffusion Model
15. UniMoGen: Universal Motion Generation
16. AMD: Anatomical Motion Diffusion with Interpretable Motion Decomposition and Fusion
17. FLAME: Free-form Language-based Motion Synthesis & Editing
18. Human Motion Diffusion as a Generative Prior
19. Text-driven Human Motion Generation with Motion Masked Diffusion Model
20. ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model
21. MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model
22. ReAlign: Bilingual Text-to-Motion Generation via Step-Aware Reward-Guided Alignment
23. Absolute Coordinates Make Motion Generation Easy
24. Seamless Human Motion Composition with Blended Positional Encodings
25. FineMoGen: Fine-Grained Spatio-Temporal Motion Generation and Editing
26. Fg-T2M: Fine-Grained Text-Driven Human Motion Generation via Diffusion Model
27. Make-An-Animation: Large-Scale Text-conditional 3D Human Motion Generation
28. StableMoFusion: Towards Robust and Efficient Diffusion-based Motion Generation Framework
29. EMDM: Efficient Motion Diffusion Model for Fast and High-Quality Motion Generation
30. Motion Mamba: Efficient and Long Sequence Motion Generation
31. M2D2M: Multi-Motion Generation from Text with Discrete Diffusion Models
32. T2LM: Long-Term 3D Human Motion Generation from Multiple Sentences
33. AttT2M: Text-Driven Human Motion Generation with Multi-Perspective Attention Mechanism
34. BAD: Bidirectional Auto-Regressive Diffusion for Text-to-Motion Generation
35. MMM: Generative Masked Motion Model
36. Priority-Centric Human Motion Generation in Discrete Latent Space
37. AvatarGPT: All-in-One Framework for Motion Understanding, Planning, Generation and Beyond
38. MotionGPT: Human Motion as a Foreign Language
39. Action-GPT: Leveraging Large-scale Language Models for Improved and Generalized Action Generation
40. PoseGPT: Quantization-based 3D Human Motion Generation and Forecasting
41. Incorporating Physics Principles for Precise Human Motion Prediction
42. PIMNet: Physics-infused Neural Network for Human Motion Prediction
43. PhysDiff: Physics-Guided Human Motion Diffusion Model
44. NRDF: Neural Riemannian Distance Fields for Learning Articulated Pose Priors
45. Pose-NDF: Modeling Human Pose Manifolds with Neural Distance Fields
46. Geometric Neural Distance Fields for Learning Human Motion Priors
47. Character Controllers Using Motion VAEs
48. Improving Human Motion Plausibility with Body Momentum
49. MoGlow: Probabilistic and Controllable Motion Synthesis Using Normalising Flows
50. MoDi: Unconditional Motion Synthesis from Diverse Data
51. MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model
52. A Deep Learning Framework for Character Motion Synthesis and Editing
53. Multi-Object Sketch Animation with Grouping and Motion Trajectory Priors
54. TRACE: Learning 3D Gaussian Physical Dynamics from Multi-view Videos
55. X-MoGen: Unified Motion Generation across Humans and Animals
56. Gaussian Variation Field Diffusion for High-fidelity Video-to-4D Synthesis
57. MotionShot: Adaptive Motion Transfer across Arbitrary Objects for Text-to-Video Generation
58. DROP: Dynamics Responses from Human Motion Prior and Projective Dynamics
59. POMP: Physics-constrainable Motion Generative Model through Phase Manifolds
60. DreamGaussian4D: Generative 4D Gaussian Splatting
61. Drive Any Mesh: 4D Latent Diffusion for Mesh Deformation from Video
62. AnimateAnyMesh: A Feed-Forward 4D Foundation Model for Text-Driven Universal Mesh Animation
63. ReVision: High-Quality, Low-Cost Video Generation with Explicit 3D Physics Modeling for Complex Motion and Interaction
64. Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
65. Force Prompting: Video Generation Models Can Learn and Generalize Physics-based Control Signals
66. Think Before You Diffuse: LLMs-Guided Physics-Aware Video Generation
67. Generating Time-Consistent Dynamics with Discriminator-Guided Image Diffusion Models
68. GENMO: A Generalist Model for Human Motion
69. HGM3: Hierarchical Generative Masked Motion Modeling with Hard Token Mining
70. Towards Robust and Controllable Text-to-Motion via Masked Autoregressive Diffusion
71. MoCLIP: Motion-Aware Fine-Tuning and Distillation of CLIP for Human Motion Generation
72. FinePhys: Fine-grained Human Action Generation by Explicitly Incorporating Physical Laws for Effective Skeletal Guidance
73. VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence
74. DragAnything: Motion Control for Anything using Entity Representation
75. PhysAnimator: Physics-Guided Generative Cartoon Animation
76. SOAP: Style-Omniscient Animatable Portraits
77. Neural Discrete Representation Learning
78. TSTMotion: Training-free Scene-aware Text-to-motion Generation
79. Deterministic-to-Stochastic Diverse Latent Feature Mapping for Human Motion Synthesis
80. A Lip Sync Expert Is All You Need for Speech to Lip Generation in the Wild
81. MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
82. LatentSync: Audio Conditioned Latent Diffusion Models for Lip Sync
83. T2M-GPT: Generating Human Motion from Textual Descriptions with Discrete Representations
84. MotionGPT: Finetuned LLMs Are General-Purpose Motion Generators
85. Guided Motion Diffusion for Controllable Human Motion Synthesis
86. OmniControl: Control Any Joint at Any Time for Human Motion Generation
87. Learning Long-form Video Prior via Generative Pre-Training
88. Instant Neural Graphics Primitives with a Multiresolution Hash Encoding
89. Magic3D: High-Resolution Text-to-3D Content Creation
90. CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
91. One-Minute Video Generation with Test-Time Training
92. Key-Locked Rank One Editing for Text-to-Image Personalization
93. Marching Cubes: A High Resolution 3D Surface Construction Algorithm
94. Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation
95. Null-text Inversion for Editing Real Images Using Guided Diffusion Models
96. simple diffusion: End-to-end diffusion for high resolution images
97. One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale
98. Scalable Diffusion Models with Transformers
99. All are Worth Words: A ViT Backbone for Score-based Diffusion Models
100. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
101. eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers
102. Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding (Imagen)
103. DreamFusion: Text-to-3D using 2D Diffusion
104. GLIGEN: Open-Set Grounded Text-to-Image Generation
105. Adding Conditional Control to Text-to-Image Diffusion Models
106. T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models
107. Multi-Concept Customization of Text-to-Image Diffusion
108. An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion
109. DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
110. VisorGPT: Learning Visual Prior via Generative Pre-Training
111. NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation
112. AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
113. ModelScope Text-to-Video Technical Report
114. Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation
115. Make-A-Video: Text-to-Video Generation without Text-Video Data
116. Video Diffusion Models
117. Learning Transferable Visual Models From Natural Language Supervision
118. Implicit Warping for Animation with Image Sets
119. Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models
120. Motion-Conditioned Diffusion Model for Controllable Video Synthesis
121. Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets
122. UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation
123. Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models
124. Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics
125. A Recipe for Scaling up Text-to-Video Generation
126. High-Resolution Image Synthesis with Latent Diffusion Models
127. Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling
128. Dataset: HumanVid
129. HumanVid: Demystifying Training Data for Camera-controllable Human Image Animation
130. StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation
131. Dataset: Zoo-300K
132. Motion Avatar: Generate Human and Animal Avatars with Arbitrary Motion
133. LoRA: Low-Rank Adaptation of Large Language Models
134. TCAN: Animating Human Images with Temporally Consistent Pose Guidance using Diffusion Models
135. GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians
136. MagicPony: Learning Articulated 3D Animals in the Wild
137. Splatter a Video: Video Gaussian Representation for Versatile Processing
138. Dataset: Dynamic Furry Animal Dataset
139. Artemis: Articulated Neural Pets with Appearance and Motion Synthesis
140. SMPLer: Taming Transformers for Monocular 3D Human Shape and Pose Estimation
141. CAT3D: Create Anything in 3D with Multi-View Diffusion Models
142. PACER+: On-Demand Pedestrian Animation Controller in Driving Scenarios
143. Humans in 4D: Reconstructing and Tracking Humans with Transformers
144. Learning Human Motion from Monocular Videos via Cross-Modal Manifold Alignment
145. PhysPT: Physics-aware Pretrained Transformer for Estimating Human Dynamics from Monocular Videos
146. Imagic: Text-Based Real Image Editing with Diffusion Models
147. DiffEdit: Diffusion-based Semantic Image Editing with Mask Guidance
148. Dual Diffusion Implicit Bridges for Image-to-Image Translation
149. SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations
150. Prompt-to-Prompt Image Editing with Cross-Attention Control
151. WANDR: Intention-guided Human Motion Generation
152. TRAM: Global Trajectory and Motion of 3D Humans from in-the-wild Videos
153. 3D Gaussian Splatting for Real-Time Radiance Field Rendering
154. Decoupling Human and Camera Motion from Videos in the Wild
155. HMP: Hand Motion Priors for Pose and Shape Estimation from Video
156. HuMoR: 3D Human Motion Model for Robust Pose Estimation
157. Co-Evolution of Pose and Mesh for 3D Human Body Estimation from Video
158. Global-to-Local Modeling for Video-based 3D Human Pose and Shape Estimation
159. WHAM: Reconstructing World-grounded Humans with Accurate 3D Motion
160. Tackling the Generative Learning Trilemma with Denoising Diffusion GANs
161. Elucidating the Design Space of Diffusion-Based Generative Models
162. Score-Based Generative Modeling through Stochastic Differential Equations
163. Consistency Models
164. Classifier-Free Diffusion Guidance
165. Cascaded Diffusion Models for High Fidelity Image Generation
166. Learning Energy-Based Models by Diffusion Recovery Likelihood
167. On Distillation of Guided Diffusion Models
168. Denoising Diffusion Implicit Models
169. Progressive Distillation for Fast Sampling of Diffusion Models
170. Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions
171. ControlVideo: Training-free Controllable Text-to-Video Generation
172. Pix2Video: Video Editing using Image Diffusion
173. Structure and Content-Guided Video Synthesis with Diffusion Models
174. MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model
175. MotionDirector: Motion Customization of Text-to-Video Diffusion Models
176. Dreamix: Video Diffusion Models are General Video Editors
177. Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
178. TokenFlow: Consistent Diffusion Features for Consistent Video Editing
179. DynVideo-E: Harnessing Dynamic NeRF for Large-Scale Motion- and View-Change Human-Centric Video Editing
180. Content Deformation Fields for Temporally Consistent Video Processing
181. PFNN: Phase-Functioned Neural Networks
Dataset: Zoo-300K
This dataset contains roughly 300,000 pairs of text descriptions and corresponding animal motions, spanning 65 distinct animal categories (see the sketch after the breakdown below).
Raw data: the Truebones Zoo [2] dataset.
Synthetic data: motions augmented from the raw data.
Manual annotation: annotated with text labels indicating the animal and motion categories.
Generated annotation:
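Since the record schema is not spelled out here, the following is a minimal Python sketch of what one Zoo-300K text-motion pair could look like. All field names (`text`, `animal_label`, `motion_label`, `motion`, `is_synthetic`) and the pose-array shape are assumptions for illustration, not the dataset's actual format.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class ZooSample:
    """One hypothetical Zoo-300K text-motion pair (all field names are assumptions)."""
    text: str           # free-form description of the motion
    animal_label: str   # animal category tag, one of the ~65 classes
    motion_label: str   # motion category tag, e.g. "walk" or "jump"
    motion: np.ndarray  # pose sequence, assumed shape (num_frames, num_joints, rot_dim)
    is_synthetic: bool  # True if augmented from the raw Truebones Zoo motions


# Example: a manually annotated raw sample with placeholder pose data.
sample = ZooSample(
    text="A dog trots forward and comes to a stop.",
    animal_label="dog",
    motion_label="trot",
    motion=np.zeros((120, 24, 6)),
    is_synthetic=False,
)
```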
Reference
Paper: link