ReadPapers
1. Introduction
2. Animate3D: Animating Any 3D Model with Multi-view Video Diffusion
3. Particle-Grid Neural Dynamics for Learning Deformable Object Models from RGB-D Videos
4. HAIF-GS: Hierarchical and Induced Flow-Guided Gaussian Splatting for Dynamic Scene
5. PIG: Physically-based Multi-Material Interaction with 3D Gaussians
6. EnliveningGS: Active Locomotion of 3DGS
7. SplineGS: Learning Smooth Trajectories in Gaussian Splatting for Dynamic Scene Reconstruction
8. PAMD: Plausibility-Aware Motion Diffusion Model for Long Dance Generation
9. PMG: Progressive Motion Generation via Sparse Anchor Postures Curriculum Learning
10. Length-Aware Motion Synthesis via Latent Diffusion
11. IKMo: Image-Keyframed Motion Generation with Trajectory-Pose Conditioned Motion Diffusion Model
12. UniMoGen: Universal Motion Generation
13. AMD: Anatomical Motion Diffusion with Interpretable Motion Decomposition and Fusion
14. FLAME: Free-form Language-based Motion Synthesis & Editing
15. Human Motion Diffusion as a Generative Prior
16. Text-driven Human Motion Generation with Motion Masked Diffusion Model
17. ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model
18. MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model
19. ReAlign: Bilingual Text-to-Motion Generation via Step-Aware Reward-Guided Alignment
20. Absolute Coordinates Make Motion Generation Easy
21. Seamless Human Motion Composition with Blended Positional Encodings
22. FineMoGen: Fine-Grained Spatio-Temporal Motion Generation and Editing
23. Fg-T2M: Fine-Grained Text-Driven Human Motion Generation via Diffusion Model
24. Make-An-Animation: Large-Scale Text-conditional 3D Human Motion Generation
25. StableMoFusion: Towards Robust and Efficient Diffusion-based Motion Generation Framework
26. EMDM: Efficient Motion Diffusion Model for Fast and High-Quality Motion Generation
27. Motion Mamba: Efficient and Long Sequence Motion Generation
28. M2D2M: Multi-Motion Generation from Text with Discrete Diffusion Models
29. T2LM: Long-Term 3D Human Motion Generation from Multiple Sentences
30. AttT2M: Text-Driven Human Motion Generation with Multi-Perspective Attention Mechanism
31. BAD: Bidirectional Auto-Regressive Diffusion for Text-to-Motion Generation
32. MMM: Generative Masked Motion Model
33. Priority-Centric Human Motion Generation in Discrete Latent Space
34. AvatarGPT: All-in-One Framework for Motion Understanding, Planning, Generation and Beyond
35. MotionGPT: Human Motion as a Foreign Language
36. Action-GPT: Leveraging Large-scale Language Models for Improved and Generalized Action Generation
37. PoseGPT: Quantization-based 3D Human Motion Generation and Forecasting
38. Incorporating Physics Principles for Precise Human Motion Prediction
39. PIMNet: Physics-infused Neural Network for Human Motion Prediction
40. PhysDiff: Physics-Guided Human Motion Diffusion Model
41. NRDF: Neural Riemannian Distance Fields for Learning Articulated Pose Priors
42. Pose-NDF: Modeling Human Pose Manifolds with Neural Distance Fields
43. Geometric Neural Distance Fields for Learning Human Motion Priors
44. Character Controllers Using Motion VAEs
45. Improving Human Motion Plausibility with Body Momentum
46. MoGlow: Probabilistic and Controllable Motion Synthesis Using Normalising Flows
47. MoDi: Unconditional Motion Synthesis from Diverse Data
48. MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model
49. A Deep Learning Framework for Character Motion Synthesis and Editing
50. Multi-Object Sketch Animation with Grouping and Motion Trajectory Priors
51. TRACE: Learning 3D Gaussian Physical Dynamics from Multi-view Videos
52. X-MoGen: Unified Motion Generation across Humans and Animals
53. Gaussian Variation Field Diffusion for High-fidelity Video-to-4D Synthesis
54. MotionShot: Adaptive Motion Transfer across Arbitrary Objects for Text-to-Video Generation
55. DROP: Dynamics Responses from Human Motion Prior and Projective Dynamics
56. POMP: Physics-constrainable Motion Generative Model through Phase Manifolds
57. DreamGaussian4D: Generative 4D Gaussian Splatting
58. Drive Any Mesh: 4D Latent Diffusion for Mesh Deformation from Video
59. AnimateAnyMesh: A Feed-Forward 4D Foundation Model for Text-Driven Universal Mesh Animation
60. ReVision: High-Quality, Low-Cost Video Generation with Explicit 3D Physics Modeling for Complex Motion and Interaction
61. Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
62. Force Prompting: Video Generation Models Can Learn and Generalize Physics-based Control Signals
63. Think Before You Diffuse: LLMs-Guided Physics-Aware Video Generation
64. Generating Time-Consistent Dynamics with Discriminator-Guided Image Diffusion Models
65. GENMO: A GENeralist Model for Human MOtion
66. HGM3: Hierarchical Generative Masked Motion Modeling with Hard Token Mining
67. Towards Robust and Controllable Text-to-Motion via Masked Autoregressive Diffusion
68. MoCLIP: Motion-Aware Fine-Tuning and Distillation of CLIP for Human Motion Generation
69. FinePhys: Fine-grained Human Action Generation by Explicitly Incorporating Physical Laws for Effective Skeletal Guidance
70. VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence
71. DragAnything: Motion Control for Anything using Entity Representation
72. PhysAnimator: Physics-Guided Generative Cartoon Animation
73. SOAP: Style-Omniscient Animatable Portraits
74. Neural Discrete Representation Learning
75. TSTMotion: Training-free Scene-aware Text-to-motion Generation
76. Deterministic-to-Stochastic Diverse Latent Feature Mapping for Human Motion Synthesis
77. A Lip Sync Expert Is All You Need for Speech to Lip Generation in the Wild
78. MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
79. LatentSync: Audio Conditioned Latent Diffusion Models for Lip Sync
80. T2M-GPT: Generating Human Motion from Textual Descriptions with Discrete Representations
81. MotionGPT: Finetuned LLMs Are General-Purpose Motion Generators
82. Guided Motion Diffusion for Controllable Human Motion Synthesis
83. OmniControl: Control Any Joint at Any Time for Human Motion Generation
84. Learning Long-form Video Prior via Generative Pre-Training
85. Instant Neural Graphics Primitives with a Multiresolution Hash Encoding
86. Magic3D: High-Resolution Text-to-3D Content Creation
87. CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
88. One-Minute Video Generation with Test-Time Training
89. Key-Locked Rank One Editing for Text-to-Image Personalization
90. Marching Cubes: A High Resolution 3D Surface Construction Algorithm
91. Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation
92. Null-text Inversion for Editing Real Images Using Guided Diffusion Models
93. simple diffusion: End-to-end Diffusion for High Resolution Images
94. One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale
95. Scalable Diffusion Models with Transformers
96. All are Worth Words: A ViT Backbone for Score-based Diffusion Models
97. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
98. eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers
99. Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding (Imagen)
100. DreamFusion: Text-to-3D using 2D Diffusion
101. GLIGEN: Open-Set Grounded Text-to-Image Generation
102. Adding Conditional Control to Text-to-Image Diffusion Models
103. T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models
104. Multi-Concept Customization of Text-to-Image Diffusion
105. An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion
106. DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
107. VisorGPT: Learning Visual Prior via Generative Pre-Training
108. NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation
109. AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
110. ModelScope Text-to-Video Technical Report
111. Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation
112. Make-A-Video: Text-to-Video Generation without Text-Video Data
113. Video Diffusion Models
114. Learning Transferable Visual Models From Natural Language Supervision
115. Implicit Warping for Animation with Image Sets
116. Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models
117. Motion-Conditioned Diffusion Model for Controllable Video Synthesis
118. Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets
119. UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation
120. Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models
121. Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics
122. A Recipe for Scaling up Text-to-Video Generation
123. High-Resolution Image Synthesis with Latent Diffusion Models
124. Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling
125. Dataset: HumanVid
126. HumanVid: Demystifying Training Data for Camera-controllable Human Image Animation
127. StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation
128. Dataset: Zoo-300K
129. Motion Avatar: Generate Human and Animal Avatars with Arbitrary Motion
130. LoRA: Low-Rank Adaptation of Large Language Models
131. TCAN: Animating Human Images with Temporally Consistent Pose Guidance using Diffusion Models
132. GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians
133. MagicPony: Learning Articulated 3D Animals in the Wild
134. Splatter a Video: Video Gaussian Representation for Versatile Processing
135. Dataset: Dynamic Furry Animal Dataset
136. Artemis: Articulated Neural Pets with Appearance and Motion Synthesis
137. SMPLer: Taming Transformers for Monocular 3D Human Shape and Pose Estimation
138. CAT3D: Create Anything in 3D with Multi-View Diffusion Models
139. PACER+: On-Demand Pedestrian Animation Controller in Driving Scenarios
140. Humans in 4D: Reconstructing and Tracking Humans with Transformers
141. Learning Human Motion from Monocular Videos via Cross-Modal Manifold Alignment
142. PhysPT: Physics-aware Pretrained Transformer for Estimating Human Dynamics from Monocular Videos
143. Imagic: Text-Based Real Image Editing with Diffusion Models
144. DiffEdit: Diffusion-based Semantic Image Editing with Mask Guidance
145. Dual Diffusion Implicit Bridges for Image-to-Image Translation
146. SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations
147. Prompt-to-Prompt Image Editing with Cross-Attention Control
148. WANDR: Intention-guided Human Motion Generation
149. TRAM: Global Trajectory and Motion of 3D Humans from in-the-wild Videos
150. 3D Gaussian Splatting for Real-Time Radiance Field Rendering
151. Decoupling Human and Camera Motion from Videos in the Wild
152. HMP: Hand Motion Priors for Pose and Shape Estimation from Video
153. HuMoR: 3D Human Motion Model for Robust Pose Estimation
154. Co-Evolution of Pose and Mesh for 3D Human Body Estimation from Video
155. Global-to-Local Modeling for Video-based 3D Human Pose and Shape Estimation
156. WHAM: Reconstructing World-grounded Humans with Accurate 3D Motion
157. Tackling the Generative Learning Trilemma with Denoising Diffusion GANs
158. Elucidating the Design Space of Diffusion-Based Generative Models
159. Score-Based Generative Modeling through Stochastic Differential Equations
160. Consistency Models
161. Classifier-Free Diffusion Guidance
162. Cascaded Diffusion Models for High Fidelity Image Generation
163. Learning Energy-Based Models by Diffusion Recovery Likelihood
164. On Distillation of Guided Diffusion Models
165. Denoising Diffusion Implicit Models
166. Progressive Distillation for Fast Sampling of Diffusion Models
167. Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions
168. ControlVideo: Training-free Controllable Text-to-Video Generation
169. Pix2Video: Video Editing using Image Diffusion
170. Structure and Content-Guided Video Synthesis with Diffusion Models
171. MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model
172. MotionDirector: Motion Customization of Text-to-Video Diffusion Models
173. Dreamix: Video Diffusion Models are General Video Editors
174. Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
175. TokenFlow: Consistent Diffusion Features for Consistent Video Editing
176. DynVideo-E: Harnessing Dynamic NeRF for Large-Scale Motion- and View-Change Human-Centric Video Editing
177. Content Deformation Fields for Temporally Consistent Video Processing
178. PFNN: Phase-Functioned Neural Networks