Region-of-interest based H.264 encoder for videophone with a hardware macroblock level face detector

Tianruo Zhang, Chen Liu, Minghui Wang, Satoshi Goto

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

9 Citations (Scopus)

Abstract

Region-of-interest (ROI) coding can be applied in an H.264 video encoder to enhance subjective quality and reduce computational complexity. Targeting a low-cost, real-time hardware encoder for videophones with faces as the ROI, this paper proposes a face detection algorithm that classifies each macroblock (MB) as part of a face or not. The algorithm has a unique estimation-and-verification process and can be combined with an H.264 encoder through an MB-level pipeline architecture. It detects 97.91% of the MBs in faces. A VLSI architecture of the proposed face detection algorithm is designed, achieving an area of 4.3k gates and a power consumption of only 1.45 mW at 100 MHz. An ROI-based H.264 encoder with dynamic parameters is proposed to enhance subjective quality and reduce rate-distortion-optimization (RDO) complexity. The PSNR in the ROI increases by 4.8 dB at a similar bit rate, and encoding time is reduced to 54.4% in videophone-like sequences.
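As a rough illustration of how an MB-level face mask could drive per-macroblock encoder parameters, the following Python sketch maps a boolean face mask to quantization parameter (QP) values. It is not the paper's algorithm: the abstract does not say which parameters are made dynamic, so the use of QP offsets, the function names, and the values of `base_qp`, `roi_delta` and `bg_delta` are all assumptions chosen for illustration. The only facts relied on are that an H.264 macroblock covers 16x16 luma samples, that QP ranges from 0 to 51 for 8-bit video, and the standard relation between a PSNR difference and a mean-squared-error ratio.

```python
MB_SIZE = 16  # an H.264 macroblock covers 16x16 luma samples


def mb_grid(width, height):
    """Return (columns, rows) of macroblocks needed to cover a frame."""
    cols = (width + MB_SIZE - 1) // MB_SIZE
    rows = (height + MB_SIZE - 1) // MB_SIZE
    return cols, rows


def assign_qp(face_mask, base_qp=30, roi_delta=-4, bg_delta=4):
    """Map a per-MB face mask (rows of booleans) to per-MB QP values.

    base_qp, roi_delta and bg_delta are hypothetical values, not from the paper.
    A lower QP means finer quantization, so face MBs get more bits.
    """
    qp_map = []
    for row in face_mask:
        qp_row = []
        for is_face in row:
            qp = base_qp + (roi_delta if is_face else bg_delta)
            qp_row.append(max(0, min(51, qp)))  # clamp to the 8-bit H.264 QP range
        qp_map.append(qp_row)
    return qp_map


def psnr_gain_to_mse_ratio(gain_db):
    """Convert a PSNR gain in dB to the factor by which MSE shrinks."""
    return 10 ** (gain_db / 10)
```

For a QCIF frame (176x144), `mb_grid` gives an 11x9 grid of macroblocks. Under the PSNR definition, the reported 4.8 dB gain in the ROI corresponds to roughly a threefold reduction in mean squared error there.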

Original language: English
Title of host publication: 2009 IEEE International Workshop on Multimedia Signal Processing, MMSP '09
DOI: https://doi.org/10.1109/MMSP.2009.5293338
Publication status: Published - 2009
Event: 2009 IEEE International Workshop on Multimedia Signal Processing, MMSP '09 - Rio De Janeiro
Duration: 2009 Oct 5 to 2009 Oct 7

Other

Other: 2009 IEEE International Workshop on Multimedia Signal Processing, MMSP '09
City: Rio De Janeiro
Period: 09/10/5 to 09/10/7

Fingerprint

Face recognition
Detectors
Hardware
Electric power utilization
Pipelines
Costs

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Signal Processing

Cite this

Zhang, T., Liu, C., Wang, M., & Goto, S. (2009). Region-of-interest based H.264 encoder for videophone with a hardware macroblock level face detector. In 2009 IEEE International Workshop on Multimedia Signal Processing, MMSP '09 [5293338] https://doi.org/10.1109/MMSP.2009.5293338

Region-of-interest based H.264 encoder for videophone with a hardware macroblock level face detector. / Zhang, Tianruo; Liu, Chen; Wang, Minghui; Goto, Satoshi.

2009 IEEE International Workshop on Multimedia Signal Processing, MMSP '09. 2009. 5293338.

Zhang, T, Liu, C, Wang, M & Goto, S 2009, Region-of-interest based H.264 encoder for videophone with a hardware macroblock level face detector. in 2009 IEEE International Workshop on Multimedia Signal Processing, MMSP '09., 5293338, 2009 IEEE International Workshop on Multimedia Signal Processing, MMSP '09, Rio De Janeiro, 09/10/5. https://doi.org/10.1109/MMSP.2009.5293338
Zhang T, Liu C, Wang M, Goto S. Region-of-interest based H.264 encoder for videophone with a hardware macroblock level face detector. In 2009 IEEE International Workshop on Multimedia Signal Processing, MMSP '09. 2009. 5293338 https://doi.org/10.1109/MMSP.2009.5293338
Zhang, Tianruo ; Liu, Chen ; Wang, Minghui ; Goto, Satoshi. / Region-of-interest based H.264 encoder for videophone with a hardware macroblock level face detector. 2009 IEEE International Workshop on Multimedia Signal Processing, MMSP '09. 2009.
@inproceedings{51040e7fcd9743a7bfbb48419d1e530e,
title = "Region-of-interest based H.264 encoder for videophone with a hardware macroblock level face detector",
abstract = "Region-of-interest (ROI) coding can be applied in an H.264 video encoder to enhance subjective quality and reduce computational complexity. Targeting a low-cost, real-time hardware encoder for videophones with faces as the ROI, this paper proposes a face detection algorithm that classifies each macroblock (MB) as part of a face or not. The algorithm has a unique estimation-and-verification process and can be combined with an H.264 encoder through an MB-level pipeline architecture. It detects 97.91{\%} of the MBs in faces. A VLSI architecture of the proposed face detection algorithm is designed, achieving an area of 4.3k gates and a power consumption of only 1.45 mW at 100 MHz. An ROI-based H.264 encoder with dynamic parameters is proposed to enhance subjective quality and reduce rate-distortion-optimization (RDO) complexity. The PSNR in the ROI increases by 4.8 dB at a similar bit rate, and encoding time is reduced to 54.4{\%} in videophone-like sequences.",
author = "Tianruo Zhang and Chen Liu and Minghui Wang and Satoshi Goto",
year = "2009",
doi = "10.1109/MMSP.2009.5293338",
language = "English",
isbn = "9781424444649",
booktitle = "2009 IEEE International Workshop on Multimedia Signal Processing, MMSP '09",

}

TY - GEN

T1 - Region-of-interest based H.264 encoder for videophone with a hardware macroblock level face detector

AU - Zhang, Tianruo

AU - Liu, Chen

AU - Wang, Minghui

AU - Goto, Satoshi

PY - 2009

Y1 - 2009

AB - Region-of-interest (ROI) coding can be applied in an H.264 video encoder to enhance subjective quality and reduce computational complexity. Targeting a low-cost, real-time hardware encoder for videophones with faces as the ROI, this paper proposes a face detection algorithm that classifies each macroblock (MB) as part of a face or not. The algorithm has a unique estimation-and-verification process and can be combined with an H.264 encoder through an MB-level pipeline architecture. It detects 97.91% of the MBs in faces. A VLSI architecture of the proposed face detection algorithm is designed, achieving an area of 4.3k gates and a power consumption of only 1.45 mW at 100 MHz. An ROI-based H.264 encoder with dynamic parameters is proposed to enhance subjective quality and reduce rate-distortion-optimization (RDO) complexity. The PSNR in the ROI increases by 4.8 dB at a similar bit rate, and encoding time is reduced to 54.4% in videophone-like sequences.

UR - http://www.scopus.com/inward/record.url?scp=74349083981&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=74349083981&partnerID=8YFLogxK

U2 - 10.1109/MMSP.2009.5293338

DO - 10.1109/MMSP.2009.5293338

M3 - Conference contribution

AN - SCOPUS:74349083981

SN - 9781424444649

BT - 2009 IEEE International Workshop on Multimedia Signal Processing, MMSP '09

ER -