Multi-task neural network with physical constraint for real-time multi-person 3D pose estimation from monocular camera

Dingli Luo, Songlin Du, Takeshi Ikenaga

Research output: Article, peer-reviewed

Abstract

3D human pose estimation has many important applications in human-computer interaction and human action recognition. Simultaneously achieving real-time speed, support for a varying number of people, and high accuracy from a single RGB image is a challenging problem. To this end, this paper proposes a multi-task, multi-level neural network structure with a physical constraint. The network estimates 3D human poses from a single RGB image in an end-to-end manner and achieves both high accuracy and high speed. Experimental results show that the proposed system runs at 21 fps on an RTX 2080 GPU with only a 33 mm accuracy loss compared with conventional works. The mechanism of the network is also analyzed through network visualization. This work shows that 3D human pose can be estimated from a single monocular RGB camera at real-time speed.

Original language: English
Pages (from-to): 27223-27244
Number of pages: 22
Journal: Multimedia Tools and Applications
Volume: 80
Issue number: 18
DOI
Publication status: Published - Jul 2021

ASJC Scopus subject areas

  • Software
  • Media Technology
  • Hardware and Architecture
  • Computer Networks and Communications
