| ID | Year | Name | Note | Tags | Link |
|---|---|---|---|---|---|
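|  | 2024 | PKU-DyMVHumans: A Multi-View Video Benchmark for High-Fidelity Dynamic Human Modeling | A versatile human-centric dataset for high-fidelity reconstruction and rendering of dynamic human scenes from dense multi-view video. More than 56 synchronized cameras, 45 different scenes, 32 different people, and 8.2 million frames. Every frame has highly detailed appearance and realistic human motion. |  |  |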
|  | 2023 | BEDLAM: A Synthetic Dataset of Bodies Exhibiting Detailed Lifelike Animated Motion |  |  |  |
|  | 2023 | CIRCLE: Capture In Rich Contextual Environments | A dataset of goal-directed motion. |  |  |
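|  | 2022 | Artemis: Articulated Neural Pets with Appearance and Motion Synthesis | Dynamic Furry Animal (DFA) dataset: modeled by artists; contains nine high-quality CGI animals, including a panda, a lion, and a cat; the animals have fiber/strand-based fur and skeletons; all CGI animal characters are rendered with a commercial rendering engine (e.g., MAYA) into high-quality multi-view 1080 × 1080 RGBA videos under various representative skeletal motions. Specifically, 36 camera views are arranged uniformly in a circle around the captured animal, and the number of representative poses per animal ranges from 700 to 1000. | Quadrupeds | Paper, dataset |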
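|  | 2019 | AMASS: Archive of Motion Capture as Surface Shapes | AMASS is a comprehensive and diverse human motion dataset containing more than 11,000 motions from 300 subjects, totaling over 40 hours. The motion data, together with the SMPL parameters for skeleton and mesh representation, is derived from marker-based MoCap systems using 15 optical markers. |  |  |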
|  | 2019 | iMapper | i3DB [69] contains RGB videos of person-scene interactions involving medium to heavy occlusions. It provides annotated 3D joint positions and a primitive 3D scene reconstruction. |  |  |
|  | 2019 | Resolving 3D Human Pose Ambiguities With 3D Scene Constraints | PROX [34] contains RGB-D videos of people interacting with indoor environments. |  |  |
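|  | 2018 | Recovering Accurate 3D Human Pose in the Wild Using IMUs and a Moving Camera | The 3DPW dataset captures about 51,000 frames of single-view in-the-wild video, supplemented by IMU data. The videos were recorded with a handheld camera, and the IMU data helps associate 2D poses with their 3D counterparts. 3DPW is one of the most robust datasets and has established itself as a benchmark for 3D pose estimation in multi-person in-the-wild scenes. |  |  |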
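|  | 2014 | Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments | A large collection of 3.6 million poses captured from different viewpoints in real-world environments using RGB and ToF cameras. Also includes high-resolution 3D scanner data of body meshes. |  |  |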
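| 225 |  | MPI-INF-3DHP | More than 2K videos with joint annotations for 13 keypoints in outdoor scenes, suitable for both 2D and 3D human pose estimation. The ground truth was obtained with a multi-camera setup and a markerless MoCap system, a departure from traditional marker-based MoCap of real subjects. |  |  |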
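| 226 |  | HumanEva dataset | A multi-view 3D human pose estimation dataset in two versions, HumanEva-I and HumanEva-II. HumanEva-I contains about 40,000 multi-view video frames captured by seven cameras placed at the front, left, and right (RGB) and at the four corners (Mono). HumanEva-II has about 2,460 frames recorded by four cameras, one in each corner. |  |  |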
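| 227, 248 |  | CMU-Panoptic dataset | 65 sequences, about 5.5 hours of footage, with 1.5 million annotated 3D poses. The dataset was recorded with a large multi-view system of 511 calibrated cameras and 10 hardware-synchronized RGB-D sensors, which makes it important for developing weakly supervised methods based on multi-view geometry. These methods address the occlusion problems common in traditional computer vision techniques. |  |  |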
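| 115 |  | Multiperson Composited 3D Human Pose (MuCo-3DHP) dataset | Used as a large-scale multi-person occluded training set for 3D human pose estimation. The frames in MuCo-3DHP are generated from the MPI-INF-3DHP dataset through a compositing and augmentation scheme. |  |  |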
| 228 |  | SURREAL dataset | A large synthetic human body dataset containing 6 million RGB video frames. It provides a range of accurate annotations, including depth, body parts, optical flow, 2D/3D poses, and surfaces. Images vary in texture, view, and pose, and the body models are based on SMPL parameters, a widely recognized mesh representation standard. |  |  |
| 150 |  | 3DOH50K dataset | A collection of 51,600 images captured from six distinct viewpoints in real-world settings, predominantly featuring object occlusions. Each image is annotated with ground-truth 2D and 3D poses, SMPL parameters, and a segmentation mask. It is used to train human pose estimation and reconstruction models that must perform well under occlusion. |  |  |
| 229 |  | 3DCP dataset | A 3D human mesh dataset derived from AMASS [230]. It includes 190 self-contact meshes spanning six human subjects (three males and three females), each modeled with an SMPL-X parameterized template. |  |  |
| 231 |  | DensePose dataset | Features 50,000 manually annotated real images from the COCO [249] dataset, comprising 5 million image-to-surface correspondence pairs. It is used for training dense human pose estimation, as well as detection and segmentation tasks. |  |  |
| 232 |  | UP-3D dataset | A dedicated 3D human pose and shape estimation dataset featuring extensive annotations in sports scenarios. UP-3D comprises approximately 8,000 images from the LSP and MPII datasets. Additionally, each image is accompanied by a metadata file indicating the quality (medium or high) of the 3D fit. |  |  |
| 233 |  | THuman dataset | A real-world 3D human mesh dataset. It includes 7,000 RGBD images, each with a textured surface mesh obtained using a Kinect camera. The detailed textured surface meshes and the aligned SMPL models are expected to support future research in human mesh reconstruction. |  |  |
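
The Artemis entry above describes 36 camera views arranged uniformly in a circle around the animal. A minimal sketch of that kind of layout is shown below; the function name `ring_camera_positions`, the radius, the camera height, and the y-up axis convention are all illustrative assumptions, not values taken from the dataset.

```python
import math


def ring_camera_positions(num_views=36, radius=3.0, height=1.0):
    """Return (x, y, z) camera centers evenly spaced on a horizontal circle.

    radius and height are hypothetical placeholders; the table above only
    states the number of views and that they form a uniform circle.
    """
    positions = []
    for i in range(num_views):
        angle = 2.0 * math.pi * i / num_views  # equal angular spacing
        x = radius * math.cos(angle)
        z = radius * math.sin(angle)
        positions.append((x, height, z))  # y is "up" in this sketch
    return positions


cameras = ring_camera_positions()
step_degrees = 360.0 / len(cameras)  # angular gap between adjacent views
```

With 36 views, adjacent cameras are 10 degrees apart, so every side of the subject is covered by several neighboring views.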