A 48 cycles/MB H.264/AVC deblocking filter architecture for ultra high definition applications

Dajiang Zhou, Jinjia Zhou, Jiayi Zhu, Satoshi Goto

Research output: Contribution to journal › Article

13 Citations (Scopus)

Abstract

This paper proposes a highly parallel deblocking filter architecture for H.264/AVC that processes one macroblock in 48 clock cycles and supports QFHD@60 fps sequences in real time at less than 100 MHz. To raise throughput, the architecture uses four edge filters organized in two groups, which process vertical and horizontal edges simultaneously. As parallelism increases, pipeline hazards arise from the latency of the edge filters and the data dependencies of the deblocking algorithm. A zig-zag processing schedule is proposed to eliminate the resulting pipeline bubbles. The data path is then derived from this schedule and optimized through data flow merging to minimize the cost of logic and internal buffers. In addition, the data input rate is designed to equal the throughput, and the transmission order of the input data can match the zig-zag processing schedule. Consequently, no intercommunication buffer is needed between the deblocking filter and the preceding component for speed matching or data reordering, and the design requires only one 24x64 two-port SRAM as an internal buffer. Synthesized in a SMIC 130 nm process, the architecture has a gate count of 30.2 k, which is competitive given its high performance.
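The headline figures can be sanity-checked with a little arithmetic. The sketch below is not taken from the paper. It assumes that QFHD means 3840x2160 luma samples, that the video is 4:2:0 with edges on a 4-sample grid, and that each edge filter handles one line of samples across an edge per cycle. The function names are hypothetical and exist only for this illustration.

```python
# Back-of-the-envelope check of the abstract's throughput claim.
# Assumptions (not stated in the abstract): QFHD = 3840x2160, 4:2:0 chroma,
# deblocking edges on a 4-sample grid, one sample line per filter per cycle.

MB_SIZE = 16            # an H.264 macroblock covers 16x16 luma samples
CYCLES_PER_MB = 48      # from the abstract
NUM_EDGE_FILTERS = 4    # from the abstract


def required_clock_hz(width, height, fps, cycles_per_mb=CYCLES_PER_MB):
    """Minimum clock rate needed to deblock every macroblock in real time."""
    mbs_per_row = -(-width // MB_SIZE)   # ceiling division
    mb_rows = -(-height // MB_SIZE)
    return mbs_per_row * mb_rows * fps * cycles_per_mb


def edge_line_filterings_per_mb():
    """Count sample lines crossing a filtered edge in one 4:2:0 macroblock."""
    # Luma: 2 directions x 4 edges x 16 sample lines per edge.
    luma = 2 * 4 * 16
    # Chroma: 2 components x 2 directions x 2 edges x 8 sample lines per edge.
    chroma = 2 * (2 * 2 * 8)
    return luma + chroma


clock = required_clock_hz(3840, 2160, 60)
print(f"required clock: {clock / 1e6:.3f} MHz")   # 93.312 MHz, below 100 MHz
print(edge_line_filterings_per_mb(), NUM_EDGE_FILTERS * CYCLES_PER_MB)  # 192 192
```

Under these assumptions, 32,400 macroblocks per frame at 60 fps and 48 cycles each need about 93.3 MHz, which agrees with the "less than 100 MHz" claim. The 192 line filterings per macroblock equal the 4 x 48 filter slots available, so the cycle budget leaves no room for idle slots. This is consistent with a schedule designed to eliminate pipeline bubbles.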

Original language: English
Pages (from-to): 3203-3210
Number of pages: 8
Journal: IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Volume: E92-A
Issue number: 12
DOI: 10.1587/transfun.E92.A.3203
Publication status: Published - 2009 Dec
Externally published: Yes

Fingerprint

Filter
Cycle
Processing
Schedule
Zigzag
Pipelines
Throughput
Buffer
Static random access storage
Internal
Data Dependency
Merging
Reordering
Costs
Clocks
Hazards
Data Flow
Hazard
Bubble
Parallelism

Keywords

  • Deblocking
  • H.264/AVC
  • Parallel
  • QFHD
  • Ultra high resolution

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Computer Graphics and Computer-Aided Design
  • Applied Mathematics
  • Signal Processing

Cite this

A 48 cycles/MB H.264/AVC deblocking filter architecture for ultra high definition applications. / Zhou, Dajiang; Zhou, Jinjia; Zhu, Jiayi; Goto, Satoshi.

In: IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, Vol. E92-A, No. 12, 12.2009, p. 3203-3210.

@article{6f04855fc0fe41299c9effdec204bb21,
title = "A 48 cycles/MB H.264/AVC deblocking filter architecture for ultra high definition applications",
keywords = "Deblocking, H.264/AVC, Parallel, QFHD, Ultra high resolution",
author = "Dajiang Zhou and Jinjia Zhou and Jiayi Zhu and Satoshi Goto",
year = "2009",
month = "12",
doi = "10.1587/transfun.E92.A.3203",
language = "English",
volume = "E92-A",
pages = "3203--3210",
journal = "IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences",
issn = "0916-8508",
publisher = "Maruzen Co., Ltd/Maruzen Kabushikikaisha",
number = "12",

}

TY - JOUR

T1 - A 48 cycles/MB H.264/AVC deblocking filter architecture for ultra high definition applications

AU - Zhou, Dajiang

AU - Zhou, Jinjia

AU - Zhu, Jiayi

AU - Goto, Satoshi

PY - 2009/12

Y1 - 2009/12

KW - Deblocking

KW - H.264/AVC

KW - Parallel

KW - QFHD

KW - Ultra high resolution

UR - http://www.scopus.com/inward/record.url?scp=84857572417&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84857572417&partnerID=8YFLogxK

U2 - 10.1587/transfun.E92.A.3203

DO - 10.1587/transfun.E92.A.3203

M3 - Article

VL - E92-A

SP - 3203

EP - 3210

JO - IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences

JF - IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences

SN - 0916-8508

IS - 12

ER -