32-parallel SAD tree hardwired engine for variable block size motion estimation in HDTV1080p real-time encoding application

Zhenyu Liu, Yang Song, Ming Shao, Shen Li, Ngfeng Li, Satoshi Goto, Takeshi Ikenaga

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

16 Citations (Scopus)

Abstract

The H.264/AVC coding standard incorporates variable block size (VBS) motion estimation (ME) to improve compression efficiency. For HDTV1080p applications, the massive computation and the huge memory bandwidth caused by the large frame size and the wide search range are two critical impediments to the design of a real-time hardwired VBSME engine. In this paper, we present six techniques to circumvent these difficulties. First, the inter modes below 8 × 8 are eliminated to reduce the hardware cost. Second, a low-pass-filter-based 4:1 down-sampling algorithm reduces the arithmetic computation at each search position by about 75%. Third, a coarse-to-fine search scheme reduces the number of search candidates by 25% to 50%. Fourth, the C+ memory organization is adopted to reduce the external I/O bandwidth. Fifth, a horizontal zigzag scan mode optimizes the search window memories. Finally, at the circuit level, a 4:2-compressor-based CSA tree, multi-cycle path delay, and a two-stage pipelined SAD tree are used to improve the speed and reduce the hardware cost of each SAD tree. A hardwired integer motion estimation (IME) engine with a 192 × 128 search range for HDTV1080p@30Hz is demonstrated. Implemented in TSMC 0.18 μm 1P6M CMOS technology, it uses 485.7k gates of standard cells and 327.68 kbit of on-chip memory. The power dissipation is 729 mW at a 200 MHz clock.
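The roughly 75% saving attributed to 4:1 down-sampling can be illustrated with a small sketch. The abstract does not specify the low-pass filter, so the 2 × 2 rounded mean used here is an assumption, and the function names (`downsample_4to1`, `sad`, `abs_diff_count`) are hypothetical and not taken from the paper. The sketch only shows that matching a 16 × 16 macroblock after 4:1 down-sampling needs 64 absolute differences per search position instead of 256.

```python
def downsample_4to1(block):
    """Replace each 2x2 neighbourhood by its rounded mean.

    The 2x2 mean is an assumed stand-in for the paper's low-pass filter,
    which the abstract does not describe.
    """
    height, width = len(block), len(block[0])
    return [
        [
            (block[y][x] + block[y][x + 1]
             + block[y + 1][x] + block[y + 1][x + 1] + 2) // 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def sad(cur, ref):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(c - r)
        for cur_row, ref_row in zip(cur, ref)
        for c, r in zip(cur_row, ref_row)
    )


def abs_diff_count(block):
    """Number of absolute-difference operations for one search position."""
    return len(block) * len(block[0])


# Synthetic 16x16 current block and a reference block that is uniformly brighter by 2.
cur = [[3 * x + 5 * y for x in range(16)] for y in range(16)]
ref = [[value + 2 for value in row] for row in cur]

full_ops = abs_diff_count(cur)                    # 16 * 16 operations
sub_ops = abs_diff_count(downsample_4to1(cur))    # 8 * 8 operations
reduction = 1 - sub_ops / full_ops                # fraction of operations saved

if __name__ == "__main__":
    print("full-resolution SAD:", sad(cur, ref))
    print("down-sampled SAD:", sad(downsample_4to1(cur), downsample_4to1(ref)))
    print("operations saved per search position:", reduction)
```

The filtering itself costs some additions, but under this assumption each down-sampled block is computed once and then reused at every search position where it is compared, which is why the saving is counted per search position.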

Original language: English
Title of host publication: IEEE Workshop on Signal Processing Systems, SiPS: Design and Implementation
Pages: 675-680
Number of pages: 6
DOIs: https://doi.org/10.1109/SIPS.2007.4387630
Publication status: Published - 2007
Event: 2007 IEEE Workshop on Signal Processing Systems, SiPS 2007 - Shanghai
Duration: 2007 Oct 17 to 2007 Oct 19

Other

Other: 2007 IEEE Workshop on Signal Processing Systems, SiPS 2007
City: Shanghai
Period: 07/10/17 to 07/10/19

Fingerprint

Motion estimation
Engines
Data storage equipment
Hardware
Bandwidth
High definition television
Low pass filters
Compressors
Clocks
Energy dissipation
Pipelines
Sampling
Networks (circuits)
Costs

Keywords

  • H.264/AVC
  • HDTV1080p
  • Integer motion estimation
  • Variable block size
  • VLSI

ASJC Scopus subject areas

  • Media Technology
  • Signal Processing

Cite this

Liu, Z., Song, Y., Shao, M., Li, S., Li, N., Goto, S., & Ikenaga, T. (2007). 32-parallel SAD tree hardwired engine for variable block size motion estimation in HDTV1080p real-time encoding application. In IEEE Workshop on Signal Processing Systems, SiPS: Design and Implementation (pp. 675-680). [4387630] https://doi.org/10.1109/SIPS.2007.4387630

@inproceedings{d5fd165a2d5c4e148c7fca25efb1265e,
title = "32-parallel SAD tree hardwired engine for variable block size motion estimation in HDTV1080p real-time encoding application",
abstract = "The H.264/AVC coding standard incorporates variable block size (VBS) motion estimation (ME) to improve compression efficiency. For HDTV1080p applications, the massive computation and the huge memory bandwidth caused by the large frame size and the wide search range are two critical impediments to the design of a real-time hardwired VBSME engine. In this paper, we present six techniques to circumvent these difficulties. First, the inter modes below 8 × 8 are eliminated to reduce the hardware cost. Second, a low-pass-filter-based 4:1 down-sampling algorithm reduces the arithmetic computation at each search position by about 75{\%}. Third, a coarse-to-fine search scheme reduces the number of search candidates by 25{\%} to 50{\%}. Fourth, the C+ memory organization is adopted to reduce the external I/O bandwidth. Fifth, a horizontal zigzag scan mode optimizes the search window memories. Finally, at the circuit level, a 4:2-compressor-based CSA tree, multi-cycle path delay, and a two-stage pipelined SAD tree are used to improve the speed and reduce the hardware cost of each SAD tree. A hardwired integer motion estimation (IME) engine with a 192 × 128 search range for HDTV1080p@30Hz is demonstrated. Implemented in TSMC 0.18 μm 1P6M CMOS technology, it uses 485.7k gates of standard cells and 327.68 kbit of on-chip memory. The power dissipation is 729 mW at a 200 MHz clock.",
keywords = "H.264/AVC, HDTV1080p, Integer motion estimation, Variable block size, VLSI",
author = "Zhenyu Liu and Yang Song and Ming Shao and Shen Li and Ngfeng Li and Satoshi Goto and Takeshi Ikenaga",
year = "2007",
doi = "10.1109/SIPS.2007.4387630",
language = "English",
isbn = "1424412226",
pages = "675--680",
booktitle = "IEEE Workshop on Signal Processing Systems, SiPS: Design and Implementation",

}

TY - GEN

T1 - 32-parallel SAD tree hardwired engine for variable block size motion estimation in HDTV1080p real-time encoding application

AU - Liu, Zhenyu

AU - Song, Yang

AU - Shao, Ming

AU - Li, Shen

AU - Li, Ngfeng

AU - Goto, Satoshi

AU - Ikenaga, Takeshi

PY - 2007

Y1 - 2007

AB - The H.264/AVC coding standard incorporates variable block size (VBS) motion estimation (ME) to improve compression efficiency. For HDTV1080p applications, the massive computation and the huge memory bandwidth caused by the large frame size and the wide search range are two critical impediments to the design of a real-time hardwired VBSME engine. In this paper, we present six techniques to circumvent these difficulties. First, the inter modes below 8 × 8 are eliminated to reduce the hardware cost. Second, a low-pass-filter-based 4:1 down-sampling algorithm reduces the arithmetic computation at each search position by about 75%. Third, a coarse-to-fine search scheme reduces the number of search candidates by 25% to 50%. Fourth, the C+ memory organization is adopted to reduce the external I/O bandwidth. Fifth, a horizontal zigzag scan mode optimizes the search window memories. Finally, at the circuit level, a 4:2-compressor-based CSA tree, multi-cycle path delay, and a two-stage pipelined SAD tree are used to improve the speed and reduce the hardware cost of each SAD tree. A hardwired integer motion estimation (IME) engine with a 192 × 128 search range for HDTV1080p@30Hz is demonstrated. Implemented in TSMC 0.18 μm 1P6M CMOS technology, it uses 485.7k gates of standard cells and 327.68 kbit of on-chip memory. The power dissipation is 729 mW at a 200 MHz clock.

KW - H.264/AVC

KW - HDTV1080p

KW - Integer motion estimation

KW - Variable block size

KW - VLSI

UR - http://www.scopus.com/inward/record.url?scp=47949087862&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=47949087862&partnerID=8YFLogxK

U2 - 10.1109/SIPS.2007.4387630

DO - 10.1109/SIPS.2007.4387630

M3 - Conference contribution

SN - 1424412226

SN - 9781424412228

SP - 675

EP - 680

BT - IEEE Workshop on Signal Processing Systems, SiPS: Design and Implementation

ER -