[Enhance] Pose3d inferencer supports image inputs (open-mmlab#2460)
Ben-Louis authored Jun 16, 2023
1 parent c5e9378 commit 96a35c9
Showing 5 changed files with 187 additions and 81 deletions.
77 changes: 63 additions & 14 deletions docs/en/user_guides/inference.md
@@ -20,7 +20,7 @@ from mmpose.apis import MMPoseInferencer

img_path = 'tests/data/coco/000000000785.jpg' # replace this with your own image path

# create the inferencer using the model alias
# instantiate the inferencer using the model alias
inferencer = MMPoseInferencer('human')

# The MMPoseInferencer API employs a lazy inference approach,
@@ -32,7 +32,46 @@ result = next(result_generator)
If everything works fine, you will see the following image in a new window:
![inferencer_result_coco](https://user-images.githubusercontent.com/26127467/220008302-4a57fd44-0978-408e-8351-600e5513316a.jpg)
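
The lazy inference behaviour mentioned in the snippet above can be illustrated with a plain Python generator. This is only a sketch of the general pattern, not MMPose's implementation: `lazy_inference`, the `calls` log, and the file names are hypothetical stand-ins, and no model is run.

```python
from typing import Iterable, Iterator


def lazy_inference(inputs: Iterable[str], log: list) -> Iterator[dict]:
    """Yield one result per input; no work happens until a result is requested."""
    for path in inputs:
        log.append(path)  # stand-in for the actual (expensive) inference step
        # hypothetical placeholder result with the two documented keys
        yield {'visualization': [], 'predictions': [[]]}


calls = []
result_generator = lazy_inference(['a.jpg', 'b.jpg'], calls)

# Creating the generator has not processed any input yet.
processed_before = list(calls)

# Requesting one result processes exactly one input.
first = next(result_generator)
```

Because each result is computed only when `next` is called, a long list of inputs or a video can be consumed one item at a time.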

The variable `result` is a dictionary that contains two keys, `'visualization'` and `'predictions'`. The `'visualization'` key is meant to store visualization results, but since the `return_vis` argument wasn't specified, this list remains empty. The `'predictions'` key, however, holds a list of estimated keypoints for each detected instance.
The `result` variable is a dictionary comprising two keys, `'visualization'` and `'predictions'`.

- `'visualization'` holds a list which:

  - contains visualization results, such as the input image, markers of the estimated poses, and optional predicted heatmaps.
  - remains empty if the `return_vis` argument is not specified.

- `'predictions'` stores:

  - a list of estimated keypoints for each identified instance.

The structure of the `result` dictionary is as follows:

```python
result = {
    'visualization': [
        # number of elements: batch_size (defaults to 1)
        vis_image_1,
        ...
    ],
    'predictions': [
        # pose estimation result of each image
        # number of elements: batch_size (defaults to 1)
        [
            # pose information of each detected instance
            # number of elements: number of detected instances
            {'keypoints': ...,  # instance 1
             'keypoint_scores': ...,
             ...
            },
            {'keypoints': ...,  # instance 2
             'keypoint_scores': ...,
             ...
            },
        ]
        ...
    ]
}

```
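
To make the nesting concrete, here is a minimal sketch that walks a mock `result` dictionary with the structure shown above. The keypoint coordinates and scores are made-up values, `confident_keypoints` and `score_thr` are hypothetical names, and the assumption that each keypoint is an `[x, y]` pair with one score is for illustration only.

```python
# Mock result mirroring the documented structure; all values are made up.
result = {
    'visualization': [],  # empty, as when return_vis is not specified
    'predictions': [
        [  # one image (batch_size = 1)
            {'keypoints': [[10.0, 20.0], [30.0, 40.0]],
             'keypoint_scores': [0.9, 0.2]},   # instance 1
            {'keypoints': [[50.0, 60.0], [70.0, 80.0]],
             'keypoint_scores': [0.8, 0.7]},   # instance 2
        ],
    ],
}


def confident_keypoints(result: dict, score_thr: float = 0.3) -> list:
    """Return, per image and per instance, the keypoints scoring above score_thr."""
    kept = []
    for image_preds in result['predictions']:      # outer list: one entry per image
        image_kept = []
        for instance in image_preds:               # inner list: one dict per instance
            pairs = zip(instance['keypoints'], instance['keypoint_scores'])
            image_kept.append([kpt for kpt, score in pairs if score > score_thr])
        kept.append(image_kept)
    return kept


filtered = confident_keypoints(result)
```

The outer index selects the image and the inner index selects the detected instance, so `result['predictions'][0][1]` is the second instance found in the first image.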

A **command-line interface (CLI)** tool for the inferencer is also available: `demo/inferencer_demo.py`. This tool allows users to perform inference using the same model and inputs with the following command:

@@ -175,24 +214,34 @@ The `MMPoseInferencer` offers a variety of arguments for customizing pose estima
| ---------------- | ---------------------------------------------------------------------------------------------------------------- |
| `pose2d` | Specifies the model alias, configuration file name, or configuration file path for the 2D pose estimation model. |
| `pose2d_weights` | Specifies the URL or local path to the 2D pose estimation model's checkpoint file. |
| `pose3d` | Specifies the model alias, configuration file name, or configuration file path for the 3D pose estimation model. |
| `pose3d_weights` | Specifies the URL or local path to the 3D pose estimation model's checkpoint file. |
| `det_model` | Specifies the model alias, configuration file name, or configuration file path for the object detection model. |
| `det_weights` | Specifies the URL or local path to the object detection model's checkpoint file. |
| `det_cat_ids` | Specifies the list of category IDs corresponding to the object classes to be detected. |
| `device` | The device to perform the inference. If left `None`, the Inferencer will select the most suitable one. |
| `scope` | The namespace where the model modules are defined. |

The inferencer is designed to handle both visualization and saving of predictions. Here is a list of arguments available when performing inference with the `MMPoseInferencer`:

| Argument | Description |
| ------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------- |
| `show` | Determines whether the image or video should be displayed in a pop-up window. |
| `radius` | Sets the keypoint radius for visualization. |
| `thickness` | Sets the link thickness for visualization. |
| `return_vis` | Determines whether visualization images should be included in the results. |
| `vis_out_dir` | Specifies the folder path for saving the visualization images. If not set, the visualization images will not be saved. |
| `return_datasample` | Determines whether to return the prediction in the format of `PoseDataSample`. |
| `pred_out_dir` | Specifies the folder path for saving the predictions. If not set, the predictions will not be saved. |
| `out_dir` | If `vis_out_dir` or `pred_out_dir` is not set, the values will be set to `f'{out_dir}/visualization'` or `f'{out_dir}/predictions'`, respectively. |
The inferencer is designed for both visualization and saving predictions. The table below presents the list of arguments available when using the `MMPoseInferencer` for inference, along with their compatibility with 2D and 3D inferencing:

| Argument | Description | 2D | 3D |
| ------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- | --- | --- |
| `show` | Controls the display of the image or video in a pop-up window. | ✔️ | ✔️ |
| `radius` | Sets the visualization keypoint radius. | ✔️ | ✔️ |
| `thickness` | Determines the link thickness for visualization. | ✔️ | ✔️ |
| `kpt_thr` | Sets the keypoint score threshold. Keypoints with scores exceeding this threshold will be displayed. | ✔️ | ✔️ |
| `draw_bbox` | Decides whether to display the bounding boxes of instances. | ✔️ | ✔️ |
| `draw_heatmap` | Decides if the predicted heatmaps should be drawn. | ✔️ | |
| `black_background` | Decides whether the estimated poses should be displayed on a black background. | ✔️ | |
| `skeleton_style` | Sets the skeleton style. Options include 'mmpose' (default) and 'openpose'. | ✔️ | |
| `use_oks_tracking` | Decides whether to use OKS as a similarity measure in tracking. | | ✔️ |
| `tracking_thr` | Sets the similarity threshold for tracking. | | ✔️ |
| `norm_pose_2d` | Decides whether to scale the bounding box to the dataset's average bounding box scale and relocate the bounding box to the dataset's average bounding box center. | | ✔️ |
| `return_vis` | Decides whether to include visualization images in the results. | ✔️ | ✔️ |
| `vis_out_dir` | Defines the folder path to save the visualization images. If unset, the visualization images will not be saved. | ✔️ | ✔️ |
| `return_datasample` | Determines if the prediction should be returned in the `PoseDataSample` format. | ✔️ | ✔️ |
| `pred_out_dir` | Specifies the folder path to save the predictions. If unset, the predictions will not be saved. | ✔️ | ✔️ |
| `out_dir` | If `vis_out_dir` or `pred_out_dir` is unset, these will be set to `f'{out_dir}/visualization'` or `f'{out_dir}/predictions'`, respectively. | ✔️ | ✔️ |
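
The fallback rule described for `out_dir` can be sketched as a small function. This is an illustration of the documented behaviour only, not the inferencer's actual code; `resolve_out_dirs` is a hypothetical helper name.

```python
from typing import Optional, Tuple


def resolve_out_dirs(out_dir: Optional[str] = None,
                     vis_out_dir: Optional[str] = None,
                     pred_out_dir: Optional[str] = None
                     ) -> Tuple[Optional[str], Optional[str]]:
    """Fill in unset output folders from out_dir, as the table describes."""
    if out_dir is not None:
        if vis_out_dir is None:
            vis_out_dir = f'{out_dir}/visualization'
        if pred_out_dir is None:
            pred_out_dir = f'{out_dir}/predictions'
    # With no out_dir, unset folders stay None, meaning nothing is saved.
    return vis_out_dir, pred_out_dir


both_defaulted = resolve_out_dirs('output')
explicit_vis = resolve_out_dirs('output', vis_out_dir='my_vis')
```

An explicitly set `vis_out_dir` or `pred_out_dir` takes precedence; `out_dir` only supplies the value that is missing.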

### Model Alias

93 changes: 69 additions & 24 deletions docs/zh_cn/user_guides/inference.md
@@ -31,9 +31,44 @@ result = next(result_generator)

![inferencer_result_coco](https://user-images.githubusercontent.com/26127467/220008302-4a57fd44-0978-408e-8351-600e5513316a.jpg)

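In the example above, the variable `result` is a dictionary containing two keys, `visualization` and `predictions`. `visualization` is meant to store visualization results, but because the `return_vis` argument was not set, this list is empty. `predictions`, however, holds the list of estimated keypoints for each detected instance.
The `result` variable is a dictionary containing two keys, `'visualization'` and `'predictions'`.

A **command-line interface (CLI) tool** for inference is also available: `demo/inferencer_demo.py`. This tool lets users run inference with the same model and inputs using the following command:
- The value of the `'visualization'` key is a list which:
  - contains visualization results, such as the input image, markers of the estimated poses, and optional predicted heatmaps.
  - remains empty if the `return_vis` argument is not specified.
- The value of the `'predictions'` key is:
  - a list of the estimated keypoints for each detected instance.

The structure of the `result` dictionary is as follows: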

```python
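result = {
    'visualization': [
        # number of elements: batch_size (defaults to 1)
        vis_image_1,
        ...
    ],
    'predictions': [
        # pose estimation result of each image
        # number of elements: batch_size (defaults to 1)
        [
            # pose information of each detected instance
            # number of elements: number of detected instances
            {'keypoints': ...,  # instance 1
             'keypoint_scores': ...,
             ...
            },
            {'keypoints': ...,  # instance 2
             'keypoint_scores': ...,
             ...
            },
        ]
        ...
    ]
}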
```

A **command-line interface (CLI) tool** for inference is also available: `demo/inferencer_demo.py`. This tool lets users run inference with the same model and inputs using the following command:

```python
python demo/inferencer_demo.py 'tests/data/coco/000000000785.jpg' \
@@ -163,28 +198,38 @@ result = next(result_generator)

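`MMPoseInferencer` offers a variety of arguments for customizing pose estimation, visualization, and the saving of predictions. Below is the list of arguments available when <mark>initializing</mark> the inferencer, with a description of each: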

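| Argument | Description |
| ---------------- | ---------------------------------------------------------- |
| `pose2d` | Specifies the model alias, configuration file name, or configuration file path for the 2D pose estimation model. |
| `pose2d_weights` | Specifies the URL or local path to the weights file of the 2D pose estimation model. |
| `det_model` | Specifies the model alias, configuration file name, or configuration file path for the object detection model. |
| `det_weights` | Specifies the URL or local path to the weights file of the object detection model. |
| `det_cat_ids` | Specifies the list of category IDs corresponding to the object classes to be detected. |
| `device` | The device used for inference. If `None`, the inferencer selects the most suitable one. |
| `scope` | The namespace in which the model modules are defined. |

The inferencer is designed to handle both visualization and saving of predictions. Below is the list of arguments available when <mark>performing inference</mark> with `MMPoseInferencer`:

| Argument | Description |
| ------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `show` | Determines whether the prediction results for the image or video should be displayed in a pop-up window. |
| `radius` | Sets the keypoint radius. |
| `thickness` | Sets the skeleton (line) thickness. |
| `return_vis` | Determines whether the returned `result` should include the list of visualization results, `visualization`. |
| `vis_out_dir` | Specifies the folder path for saving visualization images. If not set, the visualization images will not be saved. |
| `return_datasample` | Determines whether to return predictions in the form of `PoseDataSample`. |
| `pred_out_dir` | Specifies the folder path for saving the prediction results `predictions`. If not set, the predictions will not be saved. |
| `out_dir` | If the output path argument `out_dir` is specified but `vis_out_dir` or `pred_out_dir` is not set, `vis_out_dir` and `pred_out_dir` are set to `f'{out_dir}/visualization'` and ` f'{out_dir}/ forecasts'`, respectively. |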
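| Argument | Description |
| ---------------- | ------------------------------------------------------------ |
| `pose2d` | Specifies the model alias, configuration file name, or configuration file path for the 2D pose estimation model. |
| `pose2d_weights` | Specifies the URL or local path to the weights file of the 2D pose estimation model. |
| `pose3d` | Specifies the model alias, configuration file name, or configuration file path for the 3D pose estimation model. |
| `pose3d_weights` | Specifies the URL or local path to the weights file of the 3D pose estimation model. |
| `det_model` | Specifies the model alias, configuration file name, or configuration file path for the object detection model. |
| `det_weights` | Specifies the URL or local path to the weights file of the object detection model. |
| `det_cat_ids` | Specifies the list of category IDs corresponding to the object classes to be detected. |
| `device` | The device used for inference. If `None`, the inferencer selects the most suitable one. |
| `scope` | The namespace in which the model modules are defined. |

The inferencer is designed for visualizing and saving predictions. The table below lists the arguments available when <mark>performing inference</mark> with `MMPoseInferencer`, along with their compatibility with the 2D and 3D inferencers:

| Argument | Description | 2D | 3D |
| ------------------- | -------------------------------------------------------------------------------------------------------------------------- | --- | --- |
| `show` | Controls whether the image or video is displayed in a pop-up window. | ✔️ | ✔️ |
| `radius` | Sets the radius of keypoints in the visualization. | ✔️ | ✔️ |
| `thickness` | Determines the thickness of links in the visualization. | ✔️ | ✔️ |
| `kpt_thr` | Sets the keypoint score threshold. Keypoints with scores above this threshold are displayed. | ✔️ | ✔️ |
| `draw_bbox` | Decides whether to display the bounding boxes of instances. | ✔️ | ✔️ |
| `draw_heatmap` | Decides whether to draw the predicted heatmaps. | ✔️ | |
| `black_background` | Decides whether to display the estimated poses on a black background. | ✔️ | |
| `skeleton_style` | Sets the skeleton style. Options include 'mmpose' (default) and 'openpose'. | ✔️ | |
| `use_oks_tracking` | Decides whether to use OKS as the similarity measure in tracking. | | ✔️ |
| `tracking_thr` | Sets the similarity threshold for tracking. | | ✔️ |
| `norm_pose_2d` | Decides whether to scale the bounding box to the dataset's average bounding box scale and move it to the dataset's average bounding box center. | | ✔️ |
| `return_vis` | Decides whether to include visualization images in the results. | ✔️ | ✔️ |
| `vis_out_dir` | Defines the folder path for saving visualization images. If unset, the visualization images are not saved. | ✔️ | ✔️ |
| `return_datasample` | Decides whether to return predictions in the `PoseDataSample` format. | ✔️ | ✔️ |
| `pred_out_dir` | Specifies the folder path for saving predictions. If unset, the predictions are not saved. | ✔️ | ✔️ |
| `out_dir` | If `vis_out_dir` or `pred_out_dir` is unset, they are set to `f'{out_dir}/visualization'` and `f'{out_dir}/predictions'`, respectively. | ✔️ | ✔️ |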

### Model Alias

2 changes: 2 additions & 0 deletions mmpose/apis/inferencers/base_mmpose_inferencer.py
@@ -125,6 +125,8 @@ def _inputs_to_list(self, inputs: InputsType) -> Iterable:
fps=video.fps,
name=os.path.basename(inputs),
writer=None,
width=video.width,
height=video.height,
predictions=[])
inputs = video
elif input_type == 'image':