Variable block size motion estimation developed by the latest video coding standard H.264/AVC is the efficient approach to reduce the temporal redundancies. The intensive computational complexity coming from the variable block size technique makes the hardwired accelerator essential, for real-time applications. Propagate partial sums of absolute differences (Propagate Partial SAD) and SAD Tree hardwired engines outperform other counterparts, especially considering the impact of supporting variable block size technique. In this paper, the authors apply the architecture-level and the circuit-level approaches to improve the maximum operating frequency and reduce the hardware overhead of Propagate Partial SAD and SAD Tree, while other metrics, in terms of latency, memory bandwidth and hardware utilization, of the original architectures are maintained. Experiments demonstrate that by using the proposed approaches, at 110.8MHz operating frequency, compared with the original architectures, 14.7% and 18.0% gate count can be saved for Propagate Partial SAD and SAD Tree, respectively. With TSMC 0.18 μm 1P6M CMOS technology, the proposed Propagate Partial SAD architecture achieves 231.6MHz operating frequency at a cost of 84.1 k gates. Correspondingly, the maximum work frequency of the optimized SAD Tree architecture is improved to 204.8MHz, which is almost two times of the original one, while its hardware overhead is merely 88.5 k-gate.
|ジャーナル||IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences|
|出版物ステータス||Published - 2010 11|
ASJC Scopus subject areas
- Signal Processing
- Computer Graphics and Computer-Aided Design
- Electrical and Electronic Engineering
- Applied Mathematics