High performance VLSI Architecture of H.265/HEVC intra prediction for 8K UHDTV video decoder

Jianbin Zhou, Dajiang Zhou, Shihao Wang, Takeshi Yoshimura, Satoshi Goto

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

8K Ultra High Definition Television (UHDTV) requires extremely high throughput for H.265-based video decoding. In H.265, intra coding significantly improves compression efficiency at the expense of higher computational complexity than in H.264. For real-time 8K UHDTV H.265 decoding, this combination of complexity and throughput requirements makes intra prediction difficult to implement. Following a divide-and-conquer strategy, we therefore propose a new VLSI architecture built on two techniques to achieve 8K UHDTV H.265 intra prediction decoding. The first is the LUT-based Reference Sample Fetching Scheme (LUT-RSFS), which reduces the worst-case number of reference samples from 99 to 13, thereby reducing circuit area and improving performance. The second is Hybrid Block Reordering and Data Forwarding (HBRDF), which minimizes idle time and eliminates the dependency between transform units (TUs) by creating three data forwarding paths; it achieves a hardware utilization of 94%. The design is synthesized with Synopsys Design Compiler in a 40 nm process technology and achieves an operating frequency of 260 MHz, with a gate count of 217.8K for the 8-bit design and 251.1K for the 10-bit design. The proposed architecture supports 4320p@120 fps H.265 intra decoding (8-bit or 10-bit), with all 35 intra prediction modes and prediction unit sizes ranging from 4 × 4 to 64 × 64.
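To give a rough sense of the throughput behind the 4320p@120 fps figure, the following sketch computes how many samples must be produced per clock cycle at the reported 260 MHz. The 7680 × 4320 luma resolution is the standard 8K UHDTV frame size; the 4:2:0 chroma format is an assumption that the abstract does not state, and the function names are illustrative only.

```python
# Back-of-the-envelope throughput estimate for 4320p@120 fps decoding.
WIDTH, HEIGHT = 7680, 4320   # 8K UHDTV (4320p) luma resolution
FPS = 120                    # frame rate quoted in the abstract
CLOCK_HZ = 260_000_000       # operating frequency quoted in the abstract

# Assumption (not stated in the abstract): 4:2:0 chroma subsampling,
# so the two chroma planes together add half as many samples as luma.
CHROMA_FACTOR = 1.5


def samples_per_second(width, height, fps, chroma_factor=1.0):
    """Samples that must be reconstructed every second."""
    return width * height * fps * chroma_factor


def samples_per_cycle(width, height, fps, clock_hz, chroma_factor=1.0):
    """Average samples that must be produced in each clock cycle."""
    return samples_per_second(width, height, fps, chroma_factor) / clock_hz


luma_rate = samples_per_second(WIDTH, HEIGHT, FPS)
luma_per_cycle = samples_per_cycle(WIDTH, HEIGHT, FPS, CLOCK_HZ)
total_per_cycle = samples_per_cycle(WIDTH, HEIGHT, FPS, CLOCK_HZ, CHROMA_FACTOR)

print(f"luma samples per second: {luma_rate:,.0f}")
print(f"luma samples per cycle:  {luma_per_cycle:.2f}")
print(f"all samples per cycle (4:2:0 assumed): {total_per_cycle:.2f}")
```

Under these assumptions the decoder has to sustain about 15 luma samples per cycle on average, or about 23 when 4:2:0 chroma is included, which is why idle cycles and inter-TU dependencies matter so much at this resolution.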

Original language: English
Pages (from-to): 2519-2527
Number of pages: 9
Journal: IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Volume: E98A
Issue number: 12
DOI: 10.1587/transfun.E98.A.2519
Publication status: Published - 2015 Dec 1

Keywords

  • 8K UHDTV
  • HEVC/H.265 decoder
  • Intra prediction
  • VLSI architecture

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Computer Graphics and Computer-Aided Design
  • Applied Mathematics
  • Signal Processing

Cite this

High performance VLSI Architecture of H.265/HEVC intra prediction for 8K UHDTV video decoder. / Zhou, Jianbin; Zhou, Dajiang; Wang, Shihao; Yoshimura, Takeshi; Goto, Satoshi.

In: IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, Vol. E98A, No. 12, 01.12.2015, p. 2519-2527.

@article{a53c13a2dc0c4a9399070834afe2f71d,
title = "High performance VLSI Architecture of H.265/HEVC intra prediction for 8K UHDTV video decoder",
keywords = "8K UHDTV, HEVC/H.265 decoder, Intra prediction, VLSI architecture",
author = "Jianbin Zhou and Dajiang Zhou and Shihao Wang and Takeshi Yoshimura and Satoshi Goto",
year = "2015",
month = "12",
day = "1",
doi = "10.1587/transfun.E98.A.2519",
language = "English",
volume = "E98A",
pages = "2519--2527",
journal = "IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences",
issn = "0916-8508",
publisher = "Maruzen Co., Ltd/Maruzen Kabushikikaisha",
number = "12",

}

TY - JOUR

T1 - High performance VLSI Architecture of H.265/HEVC intra prediction for 8K UHDTV video decoder

AU - Zhou, Jianbin

AU - Zhou, Dajiang

AU - Wang, Shihao

AU - Yoshimura, Takeshi

AU - Goto, Satoshi

PY - 2015/12/1

Y1 - 2015/12/1

KW - 8K UHDTV

KW - HEVC/H.265 decoder

KW - Intra prediction

KW - VLSI architecture

UR - http://www.scopus.com/inward/record.url?scp=84948671227&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84948671227&partnerID=8YFLogxK

U2 - 10.1587/transfun.E98.A.2519

DO - 10.1587/transfun.E98.A.2519

M3 - Article

VL - E98A

SP - 2519

EP - 2527

JO - IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences

JF - IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences

SN - 0916-8508

IS - 12

ER -