Multi-task neural network with physical constraint for real-time multi-person 3D pose estimation from monocular camera

Dingli Luo, Songlin Du, Takeshi Ikenaga

Research output: Contribution to journal › Article › peer-review

Abstract

3D human pose estimation has many important applications in human-computer interaction and human action recognition. Simultaneously achieving real-time speed, support for a varying number of people, and high accuracy from a single RGB image is a challenging problem. To this end, this paper proposes a multi-task, multi-level neural network structure with a physical constraint. The network estimates 3D human poses from a single RGB image in an end-to-end way and achieves both high accuracy and high speed. Experimental results show that the proposed system achieves 21 fps on an RTX 2080 GPU with only 33 mm accuracy loss compared with conventional works. The mechanism of the network is also analyzed through network visualization. This work shows that 3D human poses can be estimated from a single monocular RGB camera at real-time speed.

Original language: English
Journal: Multimedia Tools and Applications
Publication status: Accepted/In press - 2021

Keywords

  • 3D human pose estimation
  • Convolutional neural network
  • Multi-task learning
  • Real-time processing

ASJC Scopus subject areas

  • Software
  • Media Technology
  • Hardware and Architecture
  • Computer Networks and Communications
