Efficient monocular pose estimation for complex 3D models

A. Rubio, M. Villamizar, L. Ferraz, A. Penate-Sanchez, A. Ramisa, Edgar Simo Serra, A. Sanfeliu, F. Moreno-Noguer

Research output: Contribution to journal (Article)

14 Citations (Scopus)

Abstract

We propose a robust and efficient method to estimate the pose of a camera with respect to complex 3D textured models of the environment that can potentially contain more than 100,000 points. To tackle this problem we follow a top-down approach where we combine high-level deep network classifiers with low-level geometric approaches to come up with a solution that is fast, robust and accurate. Given an input image, we initially use a pre-trained deep network to compute a rough estimation of the camera pose. This initial estimate constrains the number of 3D model points that can be seen from the camera viewpoint. We then establish 3D-to-2D correspondences between these potentially visible points of the model and the detected 2D image features. Accurate pose estimation is finally obtained from the 2D-to-3D correspondences using a novel PnP algorithm that rejects outliers without the need to use a RANSAC strategy, and which is between 10 and 100 times faster than other methods that use it. Two real experiments dealing with very large and complex 3D models demonstrate the effectiveness of the approach.
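The visibility step described in the abstract (a rough pose limits which model points can be seen) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simple pinhole camera with the principal point at the image centre, and it treats a point as potentially visible when it lies in front of the camera and projects inside the image, optionally enlarged by a pixel margin to absorb the error of the rough pose. The names `project`, `potentially_visible`, `f`, and `margin` are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Vec3 = Sequence[float]
Mat3 = Sequence[Sequence[float]]


def project(point: Vec3, R: Mat3, t: Vec3, f: float,
            cx: float, cy: float) -> Optional[Tuple[float, float]]:
    """Pinhole projection of a world point; None if it is behind the camera."""
    # Camera-frame coordinates: X_c = R * X_w + t
    xc = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 0.0:
        return None
    return (f * xc[0] / xc[2] + cx, f * xc[1] / xc[2] + cy)


def potentially_visible(points: Sequence[Vec3], R: Mat3, t: Vec3, f: float,
                        width: int, height: int,
                        margin: float = 0.0) -> List[int]:
    """Indices of points that project inside the image grown by `margin` pixels."""
    # Assumption: principal point at the image centre.
    cx, cy = width / 2.0, height / 2.0
    visible = []
    for idx, p in enumerate(points):
        uv = project(p, R, t, f, cx, cy)
        if uv is None:
            continue  # behind the camera under the rough pose
        u, v = uv
        if -margin <= u <= width + margin and -margin <= v <= height + margin:
            visible.append(idx)
    return visible


# Toy model: four points seen from an identity rough pose.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
MODEL = [(0.0, 0.0, 5.0), (0.0, 0.0, -5.0), (10.0, 0.0, 5.0), (1.0, 0.0, 5.0)]
kept = potentially_visible(MODEL, IDENTITY, (0.0, 0.0, 0.0), 500.0, 640, 480)
```

Only the indices in `kept` would then be matched against detected image features, which is what keeps the correspondence search small for large models. A real system would also need occlusion handling, which this sketch ignores.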

Original language: English
Article number: 7139372
Pages (from-to): 1397-1402
Number of pages: 6
Journal: Unknown Journal
Volume: 2015-June
Issue number: June
DOIs: https://doi.org/10.1109/ICRA.2015.7139372
Publication status: Published - 2015 Jun 29
Externally published: Yes

Fingerprint

Cameras
Estimates
Classifiers
Experiments

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
  • Control and Systems Engineering
  • Electrical and Electronic Engineering

Cite this

Rubio, A., Villamizar, M., Ferraz, L., Penate-Sanchez, A., Ramisa, A., Simo Serra, E., ... Moreno-Noguer, F. (2015). Efficient monocular pose estimation for complex 3D models. Unknown Journal, 2015-June(June), 1397-1402. [7139372]. https://doi.org/10.1109/ICRA.2015.7139372

Efficient monocular pose estimation for complex 3D models. / Rubio, A.; Villamizar, M.; Ferraz, L.; Penate-Sanchez, A.; Ramisa, A.; Simo Serra, Edgar; Sanfeliu, A.; Moreno-Noguer, F.

In: Unknown Journal, Vol. 2015-June, No. June, 7139372, 29.06.2015, p. 1397-1402.


Rubio, A, Villamizar, M, Ferraz, L, Penate-Sanchez, A, Ramisa, A, Simo Serra, E, Sanfeliu, A & Moreno-Noguer, F 2015, 'Efficient monocular pose estimation for complex 3D models', Unknown Journal, vol. 2015-June, no. June, 7139372, pp. 1397-1402. https://doi.org/10.1109/ICRA.2015.7139372
Rubio A, Villamizar M, Ferraz L, Penate-Sanchez A, Ramisa A, Simo Serra E et al. Efficient monocular pose estimation for complex 3D models. Unknown Journal. 2015 Jun 29;2015-June(June):1397-1402. 7139372. https://doi.org/10.1109/ICRA.2015.7139372
Rubio, A. ; Villamizar, M. ; Ferraz, L. ; Penate-Sanchez, A. ; Ramisa, A. ; Simo Serra, Edgar ; Sanfeliu, A. ; Moreno-Noguer, F. / Efficient monocular pose estimation for complex 3D models. In: Unknown Journal. 2015 ; Vol. 2015-June, No. June. pp. 1397-1402.
@article{d99c1195e50c463085285a71213ede64,
title = "Efficient monocular pose estimation for complex 3D models",
abstract = "We propose a robust and efficient method to estimate the pose of a camera with respect to complex 3D textured models of the environment that can potentially contain more than 100,000 points. To tackle this problem we follow a top-down approach where we combine high-level deep network classifiers with low-level geometric approaches to come up with a solution that is fast, robust and accurate. Given an input image, we initially use a pre-trained deep network to compute a rough estimation of the camera pose. This initial estimate constrains the number of 3D model points that can be seen from the camera viewpoint. We then establish 3D-to-2D correspondences between these potentially visible points of the model and the detected 2D image features. Accurate pose estimation is finally obtained from the 2D-to-3D correspondences using a novel PnP algorithm that rejects outliers without the need to use a RANSAC strategy, and which is between 10 and 100 times faster than other methods that use it. Two real experiments dealing with very large and complex 3D models demonstrate the effectiveness of the approach.",
author = "A. Rubio and M. Villamizar and L. Ferraz and A. Penate-Sanchez and A. Ramisa and {Simo Serra}, Edgar and A. Sanfeliu and F. Moreno-Noguer",
year = "2015",
month = "6",
day = "29",
doi = "10.1109/ICRA.2015.7139372",
language = "English",
volume = "2015-June",
pages = "1397--1402",
journal = "Proceedings - IEEE International Conference on Robotics and Automation",
issn = "1050-4729",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "June",

}

TY - JOUR

T1 - Efficient monocular pose estimation for complex 3D models

AU - Rubio, A.

AU - Villamizar, M.

AU - Ferraz, L.

AU - Penate-Sanchez, A.

AU - Ramisa, A.

AU - Simo Serra, Edgar

AU - Sanfeliu, A.

AU - Moreno-Noguer, F.

PY - 2015/6/29

Y1 - 2015/6/29

N2 - We propose a robust and efficient method to estimate the pose of a camera with respect to complex 3D textured models of the environment that can potentially contain more than 100,000 points. To tackle this problem we follow a top-down approach where we combine high-level deep network classifiers with low-level geometric approaches to come up with a solution that is fast, robust and accurate. Given an input image, we initially use a pre-trained deep network to compute a rough estimation of the camera pose. This initial estimate constrains the number of 3D model points that can be seen from the camera viewpoint. We then establish 3D-to-2D correspondences between these potentially visible points of the model and the detected 2D image features. Accurate pose estimation is finally obtained from the 2D-to-3D correspondences using a novel PnP algorithm that rejects outliers without the need to use a RANSAC strategy, and which is between 10 and 100 times faster than other methods that use it. Two real experiments dealing with very large and complex 3D models demonstrate the effectiveness of the approach.

AB - We propose a robust and efficient method to estimate the pose of a camera with respect to complex 3D textured models of the environment that can potentially contain more than 100,000 points. To tackle this problem we follow a top-down approach where we combine high-level deep network classifiers with low-level geometric approaches to come up with a solution that is fast, robust and accurate. Given an input image, we initially use a pre-trained deep network to compute a rough estimation of the camera pose. This initial estimate constrains the number of 3D model points that can be seen from the camera viewpoint. We then establish 3D-to-2D correspondences between these potentially visible points of the model and the detected 2D image features. Accurate pose estimation is finally obtained from the 2D-to-3D correspondences using a novel PnP algorithm that rejects outliers without the need to use a RANSAC strategy, and which is between 10 and 100 times faster than other methods that use it. Two real experiments dealing with very large and complex 3D models demonstrate the effectiveness of the approach.

UR - http://www.scopus.com/inward/record.url?scp=84938262734&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84938262734&partnerID=8YFLogxK

U2 - 10.1109/ICRA.2015.7139372

DO - 10.1109/ICRA.2015.7139372

M3 - Article

AN - SCOPUS:84938262734

VL - 2015-June

SP - 1397

EP - 1402

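JO - Proceedings - IEEE International Conference on Robotics and Automation

JF - Proceedings - IEEE International Conference on Robotics and Automation

SN - 1050-4729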

IS - June

M1 - 7139372

ER -