We consider the problem of reused answer retrieval for community question answering (CQA): given a question q, retrieve answers that were posted in response to other questions qi (≠ q) but that also serve as answers to q. Previous work evaluated this task by manually annotating the relationship between q and each such answer, an approach that does not scale to large CQA sites. We therefore explore an automatic evaluation method for reused answer retrieval, which computes nDCG by defining the gain value of each retrieved answer as a ROUGE score that treats the original answers to q as gold summaries. Our answer retrieval experiment suggests that effective reused answer retrieval systems may not be the same as effective gold answer retrieval systems. We provide case studies to discuss the benefits and limitations of our approach.
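To make the evaluation idea concrete, the following is a minimal sketch of nDCG with ROUGE-based gains, not the implementation used in this work. Several details are assumptions made for illustration only: the ROUGE variant (a simple whitespace-tokenized ROUGE-1 F1), the choice of taking the maximum score over the gold answers, the log2 discount, the cutoff `k`, and the construction of the ideal ranking from a candidate pool. All function names are hypothetical.

```python
import math
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between two texts (simplified ROUGE-1, assumed variant)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def gain(answer: str, gold_answers: list[str]) -> float:
    """Gain of a retrieved answer: best ROUGE score against the original
    answers to q, treated as gold summaries (max aggregation is an assumption)."""
    return max((rouge1_f1(answer, g) for g in gold_answers), default=0.0)


def dcg(gains: list[float]) -> float:
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(ranked: list[str], pool: list[str], gold_answers: list[str], k: int = 10) -> float:
    """nDCG@k of a ranked list; the ideal list sorts the candidate pool by gain."""
    run_gains = [gain(a, gold_answers) for a in ranked[:k]]
    ideal_gains = sorted((gain(a, gold_answers) for a in pool), reverse=True)[:k]
    ideal = dcg(ideal_gains)
    return dcg(run_gains) / ideal if ideal > 0 else 0.0


# Toy usage: gold answers to q, and a pool of answers posted to other questions.
gold = ["restart the router and check the cable"]
pool = ["restart the router", "buy a new laptop", "check the cable and restart"]
good_run = ["check the cable and restart", "restart the router", "buy a new laptop"]
bad_run = ["buy a new laptop", "restart the router", "check the cable and restart"]
print(ndcg(good_run, pool, gold), ndcg(bad_run, pool, gold))
```

Because the gains are continuous ROUGE scores rather than graded manual labels, no human annotation of the question-answer relationship is needed; the trade-off is that lexical overlap is only a proxy for whether an answer truly addresses q.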