In this paper, we present a cache based motion compensation (MC) architecture for Quad-HD H.264/AVC video decoder. With the significantly increased throughput requirement, VLSI design for MC is greatly challenged by the huge area cost and power consumption. Moreover, the long memory system latency leads to performance drop of the MC pipeline. To solve these problems, three optimization schemes are proposed in this work. Firstly, a high-performance interpolator based on Horizontal-Vertical Expansion and Luma-Chroma Parallelism (HVE-LCP) is proposed to efficiently increase the processing throughput to at least over 4 times as the previous designs. Secondly, an efficient cache memory organization scheme (4S × 4) is adopted to improve the on-chip memory utilization, which contributes to memory area saving of 25% and memory power saving of 39 ∼ 49%. Finally, by employing a Split Task Queue (STQ) architecture, the cache system is capable of tolerating much longer latency of the memory system. Consequently, the cache idle time is saved by 90%, which contributes to reducing the overall processing time by 24 ∼ 40%. When implemented with SMIC 90 nm process, this design costs a logic gate count and on-chip memory of 108.8k and 3.1kB respectively. The proposed MC architecture can support real-time processing of 3840 × 2160@60 fps with less than 166 MHz.
ASJC Scopus subject areas