A Dual-Clock VLSI Design of H.265 Sample Adaptive Offset Estimation for 8k Ultra-HD TV Encoding

Jianbin Zhou, Dajiang Zhou, Shihao Wang, Shuping Zhang, Takeshi Yoshimura, Satoshi Goto

Research output: Contribution to journalArticle

7 Citations (Scopus)

Abstract

Sample adaptive offset (SAO) is a newly introduced in-loop filtering component in H.265/High Efficiency Video Coding (HEVC). While SAO contributes to a notable coding efficiency improvement, the estimation of SAO parameters dominates the complexity of in-loop filtering in HEVC encoding. This paper presents an efficient VLSI design for SAO estimation. Our design features a dual-clock architecture that processes statistics collection (SC) and parameter decision (PD), the two main functional blocks of SAO estimation, at high- and low-speed clocks, respectively. Such a strategy reduces the overall area by 56% by addressing the heterogeneous data flows of SC and PD. To further improve the area and power efficiency, algorithm-architecture co-optimizations are applied, including a coarse range selection (CRS) and an accumulator bit width reduction (ABR). CRS shrinks the range of fine processed bands for the band offset estimation. ABR further reduces the area by narrowing the accumulators of SC. They together achieve another 25% area reduction. The proposed VLSI design is capable of processing 8k at 120-frames/s encoding. It occupies 51k logic gates, only one-third of the circuit area of the state-of-the-art implementations.

Original languageEnglish
JournalIEEE Transactions on Very Large Scale Integration (VLSI) Systems
DOIs
Publication statusAccepted/In press - 2016 Aug 9

Fingerprint

Clocks
Statistics
Image coding
Logic gates
Networks (circuits)
Processing

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Electrical and Electronic Engineering

Cite this

A Dual-Clock VLSI Design of H.265 Sample Adaptive Offset Estimation for 8k Ultra-HD TV Encoding. / Zhou, Jianbin; Zhou, Dajiang; Wang, Shihao; Zhang, Shuping; Yoshimura, Takeshi; Goto, Satoshi.

In: IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 09.08.2016.

Research output: Contribution to journalArticle

Zhou, Jianbin ; Zhou, Dajiang ; Wang, Shihao ; Zhang, Shuping ; Yoshimura, Takeshi ; Goto, Satoshi. / A Dual-Clock VLSI Design of H.265 Sample Adaptive Offset Estimation for 8k Ultra-HD TV Encoding. In: IEEE Transactions on Very Large Scale Integration (VLSI) Systems. 2016.
@article{689af41b30eb4cd488b59f7b061d3713,
title = "A Dual-Clock VLSI Design of H.265 Sample Adaptive Offset Estimation for 8k Ultra-HD TV Encoding",
abstract = "Sample adaptive offset (SAO) is a newly introduced in-loop filtering component in H.265/High Efficiency Video Coding (HEVC). While SAO contributes to a notable coding efficiency improvement, the estimation of SAO parameters dominates the complexity of in-loop filtering in HEVC encoding. This paper presents an efficient VLSI design for SAO estimation. Our design features a dual-clock architecture that processes statistics collection (SC) and parameter decision (PD), the two main functional blocks of SAO estimation, at high- and low-speed clocks, respectively. Such a strategy reduces the overall area by 56{\%} by addressing the heterogeneous data flows of SC and PD. To further improve the area and power efficiency, algorithm-architecture co-optimizations are applied, including a coarse range selection (CRS) and an accumulator bit width reduction (ABR). CRS shrinks the range of fine processed bands for the band offset estimation. ABR further reduces the area by narrowing the accumulators of SC. They together achieve another 25{\%} area reduction. The proposed VLSI design is capable of processing 8k at 120-frames/s encoding. It occupies 51k logic gates, only one-third of the circuit area of the state-of-the-art implementations.",
author = "Jianbin Zhou and Dajiang Zhou and Shihao Wang and Shuping Zhang and Takeshi Yoshimura and Satoshi Goto",
year = "2016",
month = "8",
day = "9",
doi = "10.1109/TVLSI.2016.2593581",
language = "English",
journal = "IEEE Transactions on Very Large Scale Integration (VLSI) Systems",
issn = "1063-8210",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - A Dual-Clock VLSI Design of H.265 Sample Adaptive Offset Estimation for 8k Ultra-HD TV Encoding

AU - Zhou, Jianbin

AU - Zhou, Dajiang

AU - Wang, Shihao

AU - Zhang, Shuping

AU - Yoshimura, Takeshi

AU - Goto, Satoshi

PY - 2016/8/9

Y1 - 2016/8/9

N2 - Sample adaptive offset (SAO) is a newly introduced in-loop filtering component in H.265/High Efficiency Video Coding (HEVC). While SAO contributes to a notable coding efficiency improvement, the estimation of SAO parameters dominates the complexity of in-loop filtering in HEVC encoding. This paper presents an efficient VLSI design for SAO estimation. Our design features a dual-clock architecture that processes statistics collection (SC) and parameter decision (PD), the two main functional blocks of SAO estimation, at high- and low-speed clocks, respectively. Such a strategy reduces the overall area by 56% by addressing the heterogeneous data flows of SC and PD. To further improve the area and power efficiency, algorithm-architecture co-optimizations are applied, including a coarse range selection (CRS) and an accumulator bit width reduction (ABR). CRS shrinks the range of fine processed bands for the band offset estimation. ABR further reduces the area by narrowing the accumulators of SC. They together achieve another 25% area reduction. The proposed VLSI design is capable of processing 8k at 120-frames/s encoding. It occupies 51k logic gates, only one-third of the circuit area of the state-of-the-art implementations.

AB - Sample adaptive offset (SAO) is a newly introduced in-loop filtering component in H.265/High Efficiency Video Coding (HEVC). While SAO contributes to a notable coding efficiency improvement, the estimation of SAO parameters dominates the complexity of in-loop filtering in HEVC encoding. This paper presents an efficient VLSI design for SAO estimation. Our design features a dual-clock architecture that processes statistics collection (SC) and parameter decision (PD), the two main functional blocks of SAO estimation, at high- and low-speed clocks, respectively. Such a strategy reduces the overall area by 56% by addressing the heterogeneous data flows of SC and PD. To further improve the area and power efficiency, algorithm-architecture co-optimizations are applied, including a coarse range selection (CRS) and an accumulator bit width reduction (ABR). CRS shrinks the range of fine processed bands for the band offset estimation. ABR further reduces the area by narrowing the accumulators of SC. They together achieve another 25% area reduction. The proposed VLSI design is capable of processing 8k at 120-frames/s encoding. It occupies 51k logic gates, only one-third of the circuit area of the state-of-the-art implementations.

UR - http://www.scopus.com/inward/record.url?scp=84981717734&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84981717734&partnerID=8YFLogxK

U2 - 10.1109/TVLSI.2016.2593581

DO - 10.1109/TVLSI.2016.2593581

M3 - Article

AN - SCOPUS:84981717734

JO - IEEE Transactions on Very Large Scale Integration (VLSI) Systems

JF - IEEE Transactions on Very Large Scale Integration (VLSI) Systems

SN - 1063-8210

ER -