As an important subtask of video restoration, video super-resolution has attracted considerable attention in the community because it can benefit a wide range of technologies, e.g., video transmission systems. A recent video super-resolution model [1] achieves state-of-the-art performance by efficiently using a recurrent neural network architecture to gradually aggregate details from previous frames. Nevertheless, this method has a serious drawback: it is sensitive to occlusion, blur, and large motion changes, since it takes only the previously generated output as the recurrent input to the super-resolution model. This leads to rapid information loss during recurrent generation and, in turn, to a dramatic drop in performance. Our work focuses on addressing this rapid information loss in video super-resolution models with a recurrent architecture. By producing attention maps through a selective fusion module, the recurrent model can adaptively aggregate necessary details across all previously generated high-resolution (HR) frames according to their informativeness. The proposed method preserves high-frequency details collected progressively from each frame while removing noisy artifacts, which significantly improves the average quality of the super-resolved video.
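
The idea of attention-based fusion over all previously generated HR frames can be illustrated with a minimal sketch. It is not the proposed module itself: the function name `selective_fusion`, the per-pixel softmax weighting, and the `scores` input (standing in for informativeness scores that a learned network would presumably produce) are all assumptions made for illustration.

```python
import math
from typing import List, Sequence

# A tiny single-channel image stored as rows of pixel values.
Frame = List[List[float]]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def selective_fusion(history: Sequence[Frame], scores: Sequence[Frame]) -> Frame:
    """Fuse previously generated HR frames with per-pixel attention weights.

    history[t][y][x] is pixel (y, x) of the t-th previous HR estimate.
    scores[t][y][x] is a hypothetical informativeness score for that pixel;
    here it is supplied by the caller rather than predicted by a network.
    """
    if len(history) == 0 or len(history) != len(scores):
        raise ValueError("history and scores must be non-empty and equally long")
    height, width = len(history[0]), len(history[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Attention map: one weight per past frame at this pixel, summing to 1.
            weights = softmax([s[y][x] for s in scores])
            fused[y][x] = sum(w * f[y][x] for w, f in zip(weights, history))
    return fused
```

With equal scores the result is a plain average of the past frames; as one frame's score grows, the fused pixel approaches that frame's value. This is the sense in which such a recurrent input could favor informative frames over occluded or blurred ones, instead of relying only on the most recent output.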