Architecture and circuit optimization of hardwired integer motion estimation engine for H.264/AVC

Zhenyu Liu, Dongsheng Wang, Takeshi Ikenaga

Research output: Contribution to journalArticle

Abstract

Variable block size motion estimation developed by the latest video coding standard H.264/AVC is the efficient approach to reduce the temporal redundancies. The intensive computational complexity coming from the variable block size technique makes the hardwired accelerator essential, for real-time applications. Propagate partial sums of absolute differences (Propagate Partial SAD) and SAD Tree hardwired engines outperform other counterparts, especially considering the impact of supporting variable block size technique. In this paper, the authors apply the architecture-level and the circuit-level approaches to improve the maximum operating frequency and reduce the hardware overhead of Propagate Partial SAD and SAD Tree, while other metrics, in terms of latency, memory bandwidth and hardware utilization, of the original architectures are maintained. Experiments demonstrate that by using the proposed approaches, at 110.8MHz operating frequency, compared with the original architectures, 14.7% and 18.0% gate count can be saved for Propagate Partial SAD and SAD Tree, respectively. With TSMC 0.18 μm 1P6M CMOS technology, the proposed Propagate Partial SAD architecture achieves 231.6MHz operating frequency at a cost of 84.1 k gates. Correspondingly, the maximum work frequency of the optimized SAD Tree architecture is improved to 204.8MHz, which is almost two times of the original one, while its hardware overhead is merely 88.5 k-gate.

Original languageEnglish
Pages (from-to)2065-2073
Number of pages9
JournalIEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
VolumeE93-A
Issue number11
DOIs
Publication statusPublished - 2010 Nov

Fingerprint

Motion Estimation
Motion estimation
Partial Sums
Engine
Engines
Hardware
Integer
Optimization
Networks (circuits)
Image coding
Computer hardware
Particle accelerators
Redundancy
Computational complexity
Bandwidth
Data storage equipment
Video Coding
Costs
Accelerator
Experiments

Keywords

  • H.264/AVC
  • Hardwired engine
  • Variable block size motion estimation
  • Very large scale integration (VLSI)

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Computer Graphics and Computer-Aided Design
  • Applied Mathematics
  • Signal Processing

Cite this

@article{4bb2ef8d63ae447d9545b43b407921cd,
title = "Architecture and circuit optimization of hardwired integer motion estimation engine for H.264/AVC",
abstract = "Variable block size motion estimation developed by the latest video coding standard H.264/AVC is the efficient approach to reduce the temporal redundancies. The intensive computational complexity coming from the variable block size technique makes the hardwired accelerator essential, for real-time applications. Propagate partial sums of absolute differences (Propagate Partial SAD) and SAD Tree hardwired engines outperform other counterparts, especially considering the impact of supporting variable block size technique. In this paper, the authors apply the architecture-level and the circuit-level approaches to improve the maximum operating frequency and reduce the hardware overhead of Propagate Partial SAD and SAD Tree, while other metrics, in terms of latency, memory bandwidth and hardware utilization, of the original architectures are maintained. Experiments demonstrate that by using the proposed approaches, at 110.8MHz operating frequency, compared with the original architectures, 14.7{\%} and 18.0{\%} gate count can be saved for Propagate Partial SAD and SAD Tree, respectively. With TSMC 0.18 μm 1P6M CMOS technology, the proposed Propagate Partial SAD architecture achieves 231.6MHz operating frequency at a cost of 84.1 k gates. Correspondingly, the maximum work frequency of the optimized SAD Tree architecture is improved to 204.8MHz, which is almost two times of the original one, while its hardware overhead is merely 88.5 k-gate.",
keywords = "H.264/AVC, Hardwired engine, Variable block size motion estimation, Very large scale integration (VLSI)",
author = "Zhenyu Liu and Dongsheng Wang and Takeshi Ikenaga",
year = "2010",
month = "11",
doi = "10.1587/transfun.E93.A.2065",
language = "English",
volume = "E93-A",
pages = "2065--2073",
journal = "IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences",
issn = "0916-8508",
publisher = "Maruzen Co., Ltd/Maruzen Kabushikikaisha",
number = "11",

}

TY - JOUR

T1 - Architecture and circuit optimization of hardwired integer motion estimation engine for H.264/AVC

AU - Liu, Zhenyu

AU - Wang, Dongsheng

AU - Ikenaga, Takeshi

PY - 2010/11

Y1 - 2010/11

N2 - Variable block size motion estimation developed by the latest video coding standard H.264/AVC is the efficient approach to reduce the temporal redundancies. The intensive computational complexity coming from the variable block size technique makes the hardwired accelerator essential, for real-time applications. Propagate partial sums of absolute differences (Propagate Partial SAD) and SAD Tree hardwired engines outperform other counterparts, especially considering the impact of supporting variable block size technique. In this paper, the authors apply the architecture-level and the circuit-level approaches to improve the maximum operating frequency and reduce the hardware overhead of Propagate Partial SAD and SAD Tree, while other metrics, in terms of latency, memory bandwidth and hardware utilization, of the original architectures are maintained. Experiments demonstrate that by using the proposed approaches, at 110.8MHz operating frequency, compared with the original architectures, 14.7% and 18.0% gate count can be saved for Propagate Partial SAD and SAD Tree, respectively. With TSMC 0.18 μm 1P6M CMOS technology, the proposed Propagate Partial SAD architecture achieves 231.6MHz operating frequency at a cost of 84.1 k gates. Correspondingly, the maximum work frequency of the optimized SAD Tree architecture is improved to 204.8MHz, which is almost two times of the original one, while its hardware overhead is merely 88.5 k-gate.

AB - Variable block size motion estimation developed by the latest video coding standard H.264/AVC is the efficient approach to reduce the temporal redundancies. The intensive computational complexity coming from the variable block size technique makes the hardwired accelerator essential, for real-time applications. Propagate partial sums of absolute differences (Propagate Partial SAD) and SAD Tree hardwired engines outperform other counterparts, especially considering the impact of supporting variable block size technique. In this paper, the authors apply the architecture-level and the circuit-level approaches to improve the maximum operating frequency and reduce the hardware overhead of Propagate Partial SAD and SAD Tree, while other metrics, in terms of latency, memory bandwidth and hardware utilization, of the original architectures are maintained. Experiments demonstrate that by using the proposed approaches, at 110.8MHz operating frequency, compared with the original architectures, 14.7% and 18.0% gate count can be saved for Propagate Partial SAD and SAD Tree, respectively. With TSMC 0.18 μm 1P6M CMOS technology, the proposed Propagate Partial SAD architecture achieves 231.6MHz operating frequency at a cost of 84.1 k gates. Correspondingly, the maximum work frequency of the optimized SAD Tree architecture is improved to 204.8MHz, which is almost two times of the original one, while its hardware overhead is merely 88.5 k-gate.

KW - H.264/AVC

KW - Hardwired engine

KW - Variable block size motion estimation

KW - Very large scale integration (VLSI)

UR - http://www.scopus.com/inward/record.url?scp=78049508947&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=78049508947&partnerID=8YFLogxK

U2 - 10.1587/transfun.E93.A.2065

DO - 10.1587/transfun.E93.A.2065

M3 - Article

AN - SCOPUS:78049508947

VL - E93-A

SP - 2065

EP - 2073

JO - IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences

JF - IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences

SN - 0916-8508

IS - 11

ER -