A 136 cycles/MB, luma-chroma parallelized H.264/AVC deblocking filter for QFHD applications

Jinjia Zhou, Dajiang Zhou, Hang Zhang, Yu Hong, Peilin Liu, Satoshi Goto

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

In this paper, we present a high-throughput deblocking filter architecture for H.264/AVC in QFHD applications. To increase filtering parallelism without notably increasing area, we propose processing luminance and chrominance samples in parallel, instead of simultaneously filtering two edges of the same component. Although the edge-filter and transpose costs of the proposed architecture are slightly larger than those of a single-filter solution, control logic is saved by applying an identical processing schedule to both the luminance and chrominance samples. Meanwhile, the total SRAM size in bits is unchanged when the architecture is parallelized. As a result, throughput is improved by 50% (equivalently, processing time is reduced by 33%) to 136 cycles/MB, while the area cost (17.9k logic gates and 8 kbits of SRAM) remains comparable to state-of-the-art designs.
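The figures in the abstract can be cross-checked with a little arithmetic. The sketch below is illustrative only: the 3840x2160 frame size for QFHD, the 4:2:0 chroma format, the frame rates, and the "one filtered sample line per filter per cycle" model are assumptions that the abstract does not state, and all function names are hypothetical.

```python
from fractions import Fraction

CYCLES_PER_MB = 136        # stated in the abstract
SPEEDUP = Fraction(3, 2)   # "throughput improved by 50%"


def baseline_cycles_per_mb(cycles=CYCLES_PER_MB, speedup=SPEEDUP):
    """Cycles/MB of the implied non-parallel baseline (derived, not quoted)."""
    return cycles * speedup


def time_reduction(speedup=SPEEDUP):
    """Fraction of processing time saved for a given throughput gain."""
    return 1 - 1 / speedup


def macroblocks_per_frame(width=3840, height=2160, mb=16):
    """Number of 16x16 macroblocks; QFHD = 3840x2160 is an assumption."""
    mbs_wide = -(-width // mb)   # ceiling division
    mbs_high = -(-height // mb)
    return mbs_wide * mbs_high


def required_clock_mhz(fps, cycles=CYCLES_PER_MB):
    """Clock needed to deblock every macroblock of a frame in real time."""
    return macroblocks_per_frame() * cycles * fps / 1e6


def line_filter_counts():
    """Upper bound on filtered sample lines per macroblock (assumes 4:2:0).

    Luma: 2 directions x 4 edges x 16 lines per edge.
    Chroma: 2 planes x 2 directions x 2 edges x 8 lines per edge.
    """
    luma = 2 * 4 * 16
    chroma = 2 * 2 * 2 * 8
    return luma, chroma


if __name__ == "__main__":
    print("implied baseline cycles/MB:", baseline_cycles_per_mb())
    print("time reduction:", time_reduction())
    print("macroblocks per frame:", macroblocks_per_frame())
    for fps in (30, 60):   # frame rates are assumptions
        print(f"clock at {fps} fps: {required_clock_mhz(fps):.1f} MHz")
    luma, chroma = line_filter_counts()
    print("serial lines:", luma + chroma, "parallel lines:", max(luma, chroma))
```

Under these assumptions, a 50% throughput gain corresponds to an implied baseline of 204 cycles/MB and a one-third reduction in processing time, which matches the rounded 33% in the abstract. The line count is only a plausibility check: filtering luma and chroma one after another needs up to 192 line operations per macroblock, while running them side by side needs 128, a ratio of 1.5 that is consistent with the reported gain. The abstract itself does not give this breakdown.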

Original language: English
Title of host publication: Proceedings - 2009 IEEE International Conference on Multimedia and Expo, ICME 2009
Pages: 1134-1137
Number of pages: 4
DOI: https://doi.org/10.1109/ICME.2009.5202699
Publication status: Published - 2009
Event: 2009 IEEE International Conference on Multimedia and Expo, ICME 2009 - New York, NY
Duration: 2009 Jun 28 to 2009 Jul 3

Fingerprint

Static random access storage
Luminance
Processing
Throughput
Logic gates
Costs

Keywords

  • Deblocking filter
  • H.264/AVC
  • Parallelism
  • QFHD

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Computer Networks and Communications
  • Hardware and Architecture
  • Software

Cite this

Zhou, J., Zhou, D., Zhang, H., Hong, Y., Liu, P., & Goto, S. (2009). A 136 cycles/MB, luma-chroma parallelized H.264/AVC deblocking filter for QFHD applications. In Proceedings - 2009 IEEE International Conference on Multimedia and Expo, ICME 2009 (pp. 1134-1137). [5202699] https://doi.org/10.1109/ICME.2009.5202699

@inproceedings{0f4a147e44e3425085e5899a59b4fd5e,
title = "A 136 cycles/MB, luma-chroma parallelized H.264/AVC deblocking filter for QFHD applications",
keywords = "Deblocking filter, H.264/AVC, Parallelism, QFHD",
author = "Jinjia Zhou and Dajiang Zhou and Hang Zhang and Yu Hong and Peilin Liu and Satoshi Goto",
year = "2009",
doi = "10.1109/ICME.2009.5202699",
language = "English",
isbn = "9781424442911",
pages = "1134--1137",
booktitle = "Proceedings - 2009 IEEE International Conference on Multimedia and Expo, ICME 2009",
}

TY - GEN

T1 - A 136 cycles/MB, luma-chroma parallelized H.264/AVC deblocking filter for QFHD applications

AU - Zhou, Jinjia

AU - Zhou, Dajiang

AU - Zhang, Hang

AU - Hong, Yu

AU - Liu, Peilin

AU - Goto, Satoshi

PY - 2009

Y1 - 2009

KW - Deblocking filter

KW - H.264/AVC

KW - Parallelism

KW - QFHD

UR - http://www.scopus.com/inward/record.url?scp=70449598241&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70449598241&partnerID=8YFLogxK

U2 - 10.1109/ICME.2009.5202699

DO - 10.1109/ICME.2009.5202699

M3 - Conference contribution

AN - SCOPUS:70449598241

SN - 9781424442911

SP - 1134

EP - 1137

BT - Proceedings - 2009 IEEE International Conference on Multimedia and Expo, ICME 2009

ER -