A bandwidth optimized, 64 cycles/MB joint parameter decoder architecture for ultra high definition H.264/AVC applications

Jinjia Zhou, Dajiang Zhou, Xun He, Satoshi Goto

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

This paper proposes a VLSI architecture for a joint parameter decoder that calculates the motion vector (MV), intra prediction mode (IPM), and boundary strength (BS) for ultra-high-definition H.264/AVC applications. The architecture uses a 64-cycle-per-MB pipeline with simplified control modes to increase system throughput and reduce hardware cost. To save memory bandwidth, the data, which includes the motion information for the co-located picture and the last decoded line, is pre-processed before being stored to DRAM. A partition-based storage format condenses the MB-level data, while a compression method based on variable-length coding reduces the data size in each partition. Experimental results show that the design is capable of real-time 3840×2160@60 fps decoding at less than 133 MHz with 37.2k logic gates. The proposed scheme achieves 85-98% bandwidth saving compared with storing the original information for every 4×4 block to DRAM.
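The throughput figures in the abstract can be cross-checked with simple arithmetic. The sketch below is not taken from the paper: it assumes the standard 16×16 H.264/AVC macroblock size and a hypothetical helper name, `required_clock_hz`, and it estimates the minimum clock needed to sustain a fixed cycles-per-MB budget at a given resolution and frame rate.

```python
MB_SIZE = 16  # assumption: standard H.264/AVC macroblock, 16x16 luma samples


def required_clock_hz(width, height, fps, cycles_per_mb=64):
    """Minimum clock (Hz) to process every macroblock in real time.

    Hypothetical helper for illustration; it ignores pipeline stalls
    and any per-frame or per-slice overhead.
    """
    # Round up so partial macroblocks at the picture edges are counted.
    mbs_wide = -(-width // MB_SIZE)
    mbs_high = -(-height // MB_SIZE)
    return mbs_wide * mbs_high * fps * cycles_per_mb


clock = required_clock_hz(3840, 2160, 60)
print(f"{clock / 1e6:.1f} MHz")  # prints "124.4 MHz"
```

Under these assumptions, 3840×2160 at 60 fps is 32,400 macroblocks per frame and about 1.94 million macroblocks per second, so a 64-cycle budget needs roughly 124.4 MHz, which is consistent with the reported "less than 133 MHz".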

Original language: English
Pages (from-to): 1425-1433
Number of pages: 9
Journal: IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Volume: E93-A
Issue number: 8
DOI: 10.1587/transfun.E93.A.1425
Publication status: Published - 2010

Fingerprint

Dynamic random access storage
Bandwidth
Cycle
Logic gates
Partition
Intra Prediction
VLSI Architecture
Computer hardware
Decoding
Motion Vector
Pipelines
Throughput
Data storage equipment
Compression
Coding
Hardware
Logic
Real-time
Costs
Motion

Keywords

  • DRAM bandwidth
  • H.264/AVC
  • Motion vector derivation
  • Ultra high resolution
  • Video decoder

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Computer Graphics and Computer-Aided Design
  • Applied Mathematics
  • Signal Processing

Cite this

A bandwidth optimized, 64 cycles/MB joint parameter decoder architecture for ultra high definition H.264/AVC applications. / Zhou, Jinjia; Zhou, Dajiang; He, Xun; Goto, Satoshi.

In: IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, Vol. E93-A, No. 8, 2010, p. 1425-1433.

@article{620f7a9302674a989164174d9ee6eae0,
title = "A bandwidth optimized, 64 cycles/MB joint parameter decoder architecture for ultra high definition H.264/AVC applications",
abstract = "In this paper, VLSI architecture of a joint parameter decoder is proposed to realize the calculation of motion vector (MV), intra prediction mode (IPM) and boundary strength (BS) for ultra high definition H.264/AVC applications. For this architecture, a 64-cycle-per-MB pipeline with simplified control modes is designed to increase system throughput and reduce hardware cost. Moreover, in order to save memory bandwidth, the data which includes the motion information for the co-located picture and the last decoded line, is pre-processed before being stored to DRAM. A partition based storage format is applied to condense the MB level data, while variable length coding based compression method is utilized to reduce the data size in each partition. Experimental results show our design is capable of real-time 3840×2160@60 fps decoding at less than 133 MHz, with 37.2 k logic gates. Meanwhile, by applying the proposed scheme, 85-98{\%} bandwidth saving is achieved, compared with storing the original information for every 4 × 4 block to DRAM.",
keywords = "DRAM bandwidth, H.264/AVC, Motion vector derivation, Ultra high resolution, Video decoder",
author = "Jinjia Zhou and Dajiang Zhou and Xun He and Satoshi Goto",
year = "2010",
doi = "10.1587/transfun.E93.A.1425",
language = "English",
volume = "E93-A",
pages = "1425--1433",
journal = "IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences",
issn = "0916-8508",
publisher = "Maruzen Co., Ltd/Maruzen Kabushikikaisha",
number = "8",

}

TY - JOUR

T1 - A bandwidth optimized, 64 cycles/MB joint parameter decoder architecture for ultra high definition H.264/AVC applications

AU - Zhou, Jinjia

AU - Zhou, Dajiang

AU - He, Xun

AU - Goto, Satoshi

PY - 2010

Y1 - 2010

N2 - In this paper, VLSI architecture of a joint parameter decoder is proposed to realize the calculation of motion vector (MV), intra prediction mode (IPM) and boundary strength (BS) for ultra high definition H.264/AVC applications. For this architecture, a 64-cycle-per-MB pipeline with simplified control modes is designed to increase system throughput and reduce hardware cost. Moreover, in order to save memory bandwidth, the data which includes the motion information for the co-located picture and the last decoded line, is pre-processed before being stored to DRAM. A partition based storage format is applied to condense the MB level data, while variable length coding based compression method is utilized to reduce the data size in each partition. Experimental results show our design is capable of real-time 3840×2160@60 fps decoding at less than 133 MHz, with 37.2 k logic gates. Meanwhile, by applying the proposed scheme, 85-98% bandwidth saving is achieved, compared with storing the original information for every 4 × 4 block to DRAM.

AB - In this paper, VLSI architecture of a joint parameter decoder is proposed to realize the calculation of motion vector (MV), intra prediction mode (IPM) and boundary strength (BS) for ultra high definition H.264/AVC applications. For this architecture, a 64-cycle-per-MB pipeline with simplified control modes is designed to increase system throughput and reduce hardware cost. Moreover, in order to save memory bandwidth, the data which includes the motion information for the co-located picture and the last decoded line, is pre-processed before being stored to DRAM. A partition based storage format is applied to condense the MB level data, while variable length coding based compression method is utilized to reduce the data size in each partition. Experimental results show our design is capable of real-time 3840×2160@60 fps decoding at less than 133 MHz, with 37.2 k logic gates. Meanwhile, by applying the proposed scheme, 85-98% bandwidth saving is achieved, compared with storing the original information for every 4 × 4 block to DRAM.

KW - DRAM bandwidth

KW - H.264/AVC

KW - Motion vector derivation

KW - Ultra high resolution

KW - Video decoder

UR - http://www.scopus.com/inward/record.url?scp=77955386547&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77955386547&partnerID=8YFLogxK

U2 - 10.1587/transfun.E93.A.1425

DO - 10.1587/transfun.E93.A.1425

M3 - Article

AN - SCOPUS:77955386547

VL - E93-A

SP - 1425

EP - 1433

JO - IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences

JF - IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences

SN - 0916-8508

IS - 8

ER -