This position paper reports an ongoing collaborative project between SUNY Binghamton, USA, and Waseda University, Japan, on multimodal information retrieval that exploits the cognitive synergy across different modalities of information to make retrieval more effective. Specifically, we focus on image retrieval in applications where imagery data appear along with collateral text; such applications are ubiquitous. We have proposed the Synergistic Indexing Scheme (SIS) to explicitly exploit the synergy between the imagery and text modalities. Because this synergy is subjective and depends on the specific cognitive context, we call it cognitive synergy. We have reported part of the empirical evaluation and are in the process of fully implementing the SIS prototype for an extensive evaluation.