ReadPapers
  1. Introduction
  2. Locomotion: Technical Insights
  3. AMP: Adversarial Motion Priors for Stylized Physics-Based Character Control
  4. ASE: Large-Scale Reusable Adversarial Skill Embeddings for Physically Simulated Characters
  5. Feature-Based Locomotion Controllers
  6. DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills
  7. ControlVAE: Model-Based Learning of Generative Controllers for Physics-Based Characters
  8. Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion
  9. UniPhys: Unified Planner and Controller with Diffusion for Flexible Physics-Based Character Control
  10. Diffuse-CLoC: Guided Diffusion for Physics-based Character Look-ahead Control
  11. PDP: Physics-Based Character Animation via Diffusion Policy
  12. DiffuseLoco: Real-Time Legged Locomotion Control with Diffusion from Offline Datasets
  13. Perpetual Humanoid Control for Real-time Simulated Avatars
  14. CALM: Conditional Adversarial Latent Models for Directable Virtual Characters
  15. Universal humanoid motion representations for physics-based control
  16. DReCon: data-driven responsive control of physics-based characters
  17. PARC: Physics-based Augmentation with Reinforcement Learning for Character Controllers
  18. CLoSD: Closing the Loop Between Simulation and Diffusion for Multi-Task Character Control
  19. MotionPersona: Characteristics-aware Locomotion Control
  20. Diffuse-CLoC: Guided Diffusion for Physics-based Character Look-ahead Control
  21. Gait-Conditioned Reinforcement Learning with Multi-Phase Curriculum for Humanoid Locomotion
  22. UniPhys: Unified Planner and Controller with Diffusion for Flexible Physics-Based Character Control
  23. MaskedMimic: Unified Physics-Based Character Control Through Masked Motion Inpainting
  24. Regional Time Stepping for SPH
  25. FreeGave: 3D Physics Learning from Dynamic Videos by Gaussian Velocity
  26. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations
  27. ParticleGS: Particle-Based Dynamics Modeling of 3D Gaussians for Prior-free Motion Extrapolation
  28. Animate3D: Animating any 3D model with multi-view video diffusion
  29. Particle-Grid Neural Dynamics for Learning Deformable Object Models from RGB-D Videos
  30. HAIF-GS: Hierarchical and Induced Flow-Guided Gaussian Splatting for Dynamic Scene
  31. PIG: Physically-based Multi-Material Interaction with 3D Gaussians
  32. EnliveningGS: Active Locomotion of 3DGS
  33. SplineGS: Learning Smooth Trajectories in Gaussian Splatting for Dynamic Scene Reconstruction
  34. PAMD: Plausibility-Aware Motion Diffusion Model for Long Dance Generation
  35. PMG: Progressive Motion Generation via Sparse Anchor Postures Curriculum Learning
  36. Length-Aware Motion Synthesis via Latent Diffusion
  37. IKMo: Image-Keyframed Motion Generation with Trajectory-Pose Conditioned Motion Diffusion Model
  38. UniMoGen: Universal Motion Generation
  39. AMD: Anatomical Motion Diffusion with Interpretable Motion Decomposition and Fusion
  40. Flame: Free-form language-based motion synthesis & editing
  41. Human Motion Diffusion as a Generative Prior
  42. Text-driven Human Motion Generation with Motion Masked Diffusion Model
  43. ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model
  44. MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model
  45. ReAlign: Bilingual Text-to-Motion Generation via Step-Aware Reward-Guided Alignment
  46. Absolute Coordinates Make Motion Generation Easy
  47. Seamless Human Motion Composition with Blended Positional Encodings
  48. FineMoGen: Fine-Grained Spatio-Temporal Motion Generation and Editing
  49. Fg-T2M: Fine-Grained Text-Driven Human Motion Generation via Diffusion Model
  50. Make-An-Animation: Large-Scale Text-conditional 3D Human Motion Generation
  51. StableMoFusion: Towards Robust and Efficient Diffusion-based Motion Generation Framework
  52. EMDM: Efficient Motion Diffusion Model for Fast and High-Quality Motion Generation
  53. Motion Mamba: Efficient and Long Sequence Motion Generation
  54. M2D2M: Multi-Motion Generation from Text with Discrete Diffusion Models
  55. T2LM: Long-Term 3D Human Motion Generation from Multiple Sentences
  56. AttT2M: Text-Driven Human Motion Generation with Multi-Perspective Attention Mechanism
  57. BAD: Bidirectional Auto-Regressive Diffusion for Text-to-Motion Generation
  58. MMM: Generative Masked Motion Model
  59. Priority-Centric Human Motion Generation in Discrete Latent Space
  60. AvatarGPT: All-in-One Framework for Motion Understanding, Planning, Generation and Beyond
  61. MotionGPT: Human Motion as a Foreign Language
  62. Action-GPT: Leveraging Large-scale Language Models for Improved and Generalized Action Generation
  63. PoseGPT: Quantization-based 3D Human Motion Generation and Forecasting
  64. Incorporating Physics Principles for Precise Human Motion Prediction
  65. PIMNet: Physics-infused Neural Network for Human Motion Prediction
  66. PhysDiff: Physics-Guided Human Motion Diffusion Model
  67. NRDF: Neural Riemannian Distance Fields for Learning Articulated Pose Priors
  68. Riemannian Motion Generation: A Unified Framework for Human Motion Representation and Generation via Riemannian Flow Matching
  69. Pose-NDF: Modeling Human Pose Manifolds with Neural Distance Fields
  70. Geometric Neural Distance Fields for Learning Human Motion Priors
  71. Character Controllers Using Motion VAEs
  72. Improving Human Motion Plausibility with Body Momentum
  73. MoGlow: Probabilistic and controllable motion synthesis using normalising flows
  74. MoDi: Unconditional motion synthesis from diverse data
  75. MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model
  76. A deep learning framework for character motion synthesis and editing
  77. Multi-Object Sketch Animation with Grouping and Motion Trajectory Priors
  78. TRACE: Learning 3D Gaussian Physical Dynamics from Multi-view Videos
  79. X-MoGen: Unified Motion Generation across Humans and Animals
  80. Gaussian Variation Field Diffusion for High-fidelity Video-to-4D Synthesis
  81. MotionShot: Adaptive Motion Transfer across Arbitrary Objects for Text-to-Video Generation
  82. DROP: Dynamics responses from human motion prior and projective dynamics
  83. POMP: Physics-constrainable Motion Generative Model through Phase Manifolds
  84. DreamGaussian4D: Generative 4D Gaussian Splatting
  85. Drive Any Mesh: 4D Latent Diffusion for Mesh Deformation from Video
  86. AnimateAnyMesh: A Feed-Forward 4D Foundation Model for Text-Driven Universal Mesh Animation
  87. ReVision: High-Quality, Low-Cost Video Generation with Explicit 3D Physics Modeling for Complex Motion and Interaction
  88. Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
  89. Force Prompting: Video Generation Models Can Learn and Generalize Physics-based Control Signals
  90. Think Before You Diffuse: LLMs-Guided Physics-Aware Video Generation
  91. Generating time-consistent dynamics with discriminator-guided image diffusion models
  92. GENMO: A Generalist Model for Human Motion
  93. HGM3: Hierarchical Generative Masked Motion Modeling with Hard Token Mining
  94. Towards Robust and Controllable Text-to-Motion via Masked Autoregressive Diffusion
  95. MoCLIP: Motion-Aware Fine-Tuning and Distillation of CLIP for Human Motion Generation
  96. FinePhys: Fine-grained Human Action Generation by Explicitly Incorporating Physical Laws for Effective Skeletal Guidance
  97. VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence
  98. DragAnything: Motion Control for Anything using Entity Representation
  99. PhysAnimator: Physics-Guided Generative Cartoon Animation
  100. SOAP: Style-Omniscient Animatable Portraits
  101. Neural Discrete Representation Learning
  102. TSTMotion: Training-free Scene-aware Text-to-motion Generation
  103. Deterministic-to-Stochastic Diverse Latent Feature Mapping for Human Motion Synthesis
  104. A lip sync expert is all you need for speech to lip generation in the wild
  105. MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
  106. LatentSync: Audio Conditioned Latent Diffusion Models for Lip Sync
  107. T2M-GPT: Generating human motion from textual descriptions with discrete representations
  108. MotionGPT: Finetuned LLMs are general-purpose motion generators
  109. Guided Motion Diffusion for Controllable Human Motion Synthesis
  110. OmniControl: Control Any Joint at Any Time for Human Motion Generation
  111. Learning Long-form Video Prior via Generative Pre-Training
  112. Instant Neural Graphics Primitives with a Multiresolution Hash Encoding
  113. Magic3D: High-Resolution Text-to-3D Content Creation
  114. CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
  115. One-Minute Video Generation with Test-Time Training
  116. Key-Locked Rank One Editing for Text-to-Image Personalization
  117. Marching Cubes: A High Resolution 3D Surface Construction Algorithm
  118. Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation
  119. Null-text Inversion for Editing Real Images Using Guided Diffusion Models
  120. simple diffusion: End-to-end diffusion for high resolution images
  121. One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale
  122. Scalable Diffusion Models with Transformers
  123. All are Worth Words: a ViT Backbone for Score-based Diffusion Models
  124. An image is worth 16x16 words: Transformers for image recognition at scale
  125. eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers
  126. Photorealistic text-to-image diffusion models with deep language understanding (Imagen)
  127. DreamFusion: Text-to-3D using 2D Diffusion
  128. GLIGEN: Open-Set Grounded Text-to-Image Generation
  129. Adding Conditional Control to Text-to-Image Diffusion Models
  130. T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models
  131. Multi-Concept Customization of Text-to-Image Diffusion
  132. An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion
  133. DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
  134. VisorGPT: Learning Visual Prior via Generative Pre-Training
  135. NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation
  136. AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
  137. ModelScope Text-to-Video Technical Report
  138. Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation
  139. Make-A-Video: Text-to-Video Generation without Text-Video Data
  140. Video Diffusion Models
  141. Learning Transferable Visual Models From Natural Language Supervision
  142. Implicit Warping for Animation with Image Sets
  143. Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models
  144. Motion-Conditioned Diffusion Model for Controllable Video Synthesis
  145. Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets
  146. UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation
  147. Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models
  148. Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics
  149. A Recipe for Scaling up Text-to-Video Generation
  150. High-Resolution Image Synthesis with Latent Diffusion Models
  151. Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling
  152. Dataset: HumanVid
  153. HumanVid: Demystifying Training Data for Camera-controllable Human Image Animation
  154. StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation
  155. Dataset: Zoo-300K
  156. Motion Avatar: Generate Human and Animal Avatars with Arbitrary Motion
  157. LoRA: Low-Rank Adaptation of Large Language Models
  158. TCAN: Animating Human Images with Temporally Consistent Pose Guidance using Diffusion Models
  159. GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians
  160. MagicPony: Learning Articulated 3D Animals in the Wild
  161. Splatter a Video: Video Gaussian Representation for Versatile Processing
  162. Dataset: Dynamic Furry Animal Dataset
  163. Artemis: Articulated Neural Pets with Appearance and Motion Synthesis
  164. SMPLer: Taming Transformers for Monocular 3D Human Shape and Pose Estimation
  165. CAT3D: Create Anything in 3D with Multi-View Diffusion Models
  166. PACER+: On-Demand Pedestrian Animation Controller in Driving Scenarios
  167. Humans in 4D: Reconstructing and Tracking Humans with Transformers
  168. Learning Human Motion from Monocular Videos via Cross-Modal Manifold Alignment
  169. PhysPT: Physics-aware Pretrained Transformer for Estimating Human Dynamics from Monocular Videos
  170. Imagic: Text-Based Real Image Editing with Diffusion Models
  171. DiffEdit: Diffusion-based semantic image editing with mask guidance
  172. Dual diffusion implicit bridges for image-to-image translation
  173. SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations
  174. Prompt-to-Prompt Image Editing with Cross-Attention Control
  175. WANDR: Intention-guided Human Motion Generation
  176. TRAM: Global Trajectory and Motion of 3D Humans from in-the-wild Videos
  177. 3D Gaussian Splatting for Real-Time Radiance Field Rendering
  178. Decoupling Human and Camera Motion from Videos in the Wild
  179. HMP: Hand Motion Priors for Pose and Shape Estimation from Video
  180. HuMoR: 3D Human Motion Model for Robust Pose Estimation
  181. Co-Evolution of Pose and Mesh for 3D Human Body Estimation from Video
  182. Global-to-Local Modeling for Video-based 3D Human Pose and Shape Estimation
  183. WHAM: Reconstructing World-grounded Humans with Accurate 3D Motion
  184. Tackling the Generative Learning Trilemma with Denoising Diffusion GANs
  185. Elucidating the Design Space of Diffusion-Based Generative Models
  186. Score-Based Generative Modeling through Stochastic Differential Equations
  187. Consistency Models
  188. Classifier-Free Diffusion Guidance
  189. Cascaded Diffusion Models for High Fidelity Image Generation
  190. Learning Energy-Based Models by Diffusion Recovery Likelihood
  191. On Distillation of Guided Diffusion Models
  192. Denoising Diffusion Implicit Models
  193. Progressive Distillation for Fast Sampling of Diffusion Models
  194. Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions
  195. ControlVideo: Training-free Controllable Text-to-Video Generation
  196. Pix2Video: Video Editing using Image Diffusion
  197. Structure and Content-Guided Video Synthesis with Diffusion Models
  198. MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model
  199. MotionDirector: Motion Customization of Text-to-Video Diffusion Models
  200. Dreamix: Video Diffusion Models are General Video Editors
  201. Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
  202. TokenFlow: Consistent Diffusion Features for Consistent Video Editing
  203. DynVideo-E: Harnessing Dynamic NeRF for Large-Scale Motion- and View-Change Human-Centric Video Editing
  204. Content Deformation Fields for Temporally Consistent Video Processing
  205. PFNN: Phase-Functioned Neural Networks
  206. Recurrent Transition Networks for Character Locomotion
  207. Real-Time Style Modelling of Human Locomotion
  208. Motion In-Betweening with Phase Manifolds
  209. Mode-Adaptive Neural Networks for Quadruped Motion Control
  210. Few-shot Learning of Homogeneous Human Locomotion Styles
  211. Learning predict-and-simulate policies from unorganized human motion data
  212. Local Motion Phases for Learning Multi-Contact Character Movements
  213. Interactive Control of Diverse Complex Characters with Neural Networks
  214. Accelerated Auto-regressive Motion Diffusion Model
  215. DARTControl: A Diffusion-based Autoregressive Motion Model for Real-time Text-driven Motion Control
  216. Interactive Character Control with Auto-Regressive Motion Diffusion Models
  217. Taming Diffusion Probabilistic Models for Character Control
  218. Learned Motion Matching
  219. MOCHA: Real-Time Motion Characterization via Context Matching
  220. DeepLoco: Dynamic Locomotion Skills Using Hierarchical Deep Reinforcement Learning
