In this paper, a joint team from Waseda University and Meisei University (team name: Waseda-Meisei) reports its efforts on the ad-hoc video search (AVS) task of the TRECVID benchmark, which is conducted annually by the National Institute of Standards and Technology (NIST). In the AVS task, a system must perform a fine-grained search for target videos in a large-scale video database, given a query phrase containing multiple keywords that describe objects, persons, scenes, and actions. The system we submitted has two main characteristics. First, to improve the coverage of concepts corresponding to keywords in query phrases, we prepared a large number of classifiers that can detect objects, persons, scenes, and actions, trained on various image and video datasets. Second, when choosing a concept classifier for a keyword, we introduced a mechanism that selects additional related concept classifiers by incorporating natural language processing techniques. We submitted multiple systems with these characteristics to the TRECVID 2017 AVS task, and one of them ranked highest among the submissions from all 22 teams.
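To illustrate the second characteristic, the sketch below shows one common way such a mechanism can be realized: matching a query keyword against classifier concept names by embedding similarity, so that semantically related classifiers are selected even when no classifier name matches the keyword exactly. The toy embeddings, concept inventory, and threshold here are illustrative assumptions, not the authors' actual data or method details.

```python
import math

# Toy word vectors standing in for real embeddings (e.g., word2vec or GloVe).
# All names and values below are illustrative assumptions.
EMBEDDINGS = {
    "dog":   [0.90, 0.10, 0.00],
    "puppy": [0.85, 0.15, 0.05],
    "car":   [0.00, 0.90, 0.10],
    "beach": [0.10, 0.00, 0.90],
}

# Hypothetical inventory of trained concept classifiers, keyed by concept name.
CLASSIFIER_CONCEPTS = ["puppy", "car", "beach"]

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def select_classifiers(keyword, threshold=0.9):
    """Return concept classifiers whose name embedding is similar to the keyword.

    A query keyword with no exactly matching classifier (e.g., "dog") can
    still be served by a semantically close classifier (e.g., "puppy").
    """
    query_vec = EMBEDDINGS[keyword]
    scored = [(c, cosine(query_vec, EMBEDDINGS[c])) for c in CLASSIFIER_CONCEPTS]
    return [c for c, score in scored if score >= threshold]
```

With these toy vectors, `select_classifiers("dog")` selects the `"puppy"` classifier, since their embeddings are nearly parallel, while `"car"` and `"beach"` fall below the similarity threshold.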