With the proliferation of surveillance cameras and the widening range of their uses, person re-identification technology has been attracting attention. However, issues remain, such as differences in a person's appearance caused by worn items, clothing, and behavior. In this paper, we therefore propose a two-stream feature-fusion architecture to improve re-identification accuracy. It fuses spatio-temporal features of partial-body images, which we expect to represent a person's individuality robustly against such differences, with those of the corresponding whole-body images; the features are extracted by applying convolutional LSTM and 3D CNN. An evaluation on the MARS dataset shows that, among four horizontally split partial-body images, the feet features are the most effective. Moreover, the CMS (Cumulative Match Score) obtained by applying convolutional LSTM to the feet features in the proposed architecture is higher than that of an existing method that applies a CNN and temporal pooling only to the whole-body images. These results show that additionally using spatio-temporal features of the feet is effective on the MARS dataset.
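
To make two notions from the abstract concrete, the following minimal sketch shows one way to cut a clip into four horizontal strips and to compute a Cumulative Match Score from a query-gallery distance matrix. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the function names, the equal-height split, the reading of the lowest strip as the feet region, and the simplified CMS (no camera-based filtering of gallery entries) are all hypothetical choices made here.

```python
import numpy as np


def split_horizontal_strips(frames, n_parts=4):
    """Split a clip of shape (T, H, W, C) into n_parts strips along the height axis.

    Strips are returned top to bottom. Assumption: for upright pedestrian
    crops, the last strip roughly covers the feet.
    """
    return np.array_split(np.asarray(frames), n_parts, axis=1)


def cumulative_match_score(dist, query_ids, gallery_ids, max_rank=5):
    """Return CMS at ranks 1..max_rank.

    dist[i, j] is the distance between query i and gallery item j.
    CMS at rank k is the fraction of queries whose first correct gallery
    match appears within the k nearest gallery items.
    Simplification (assumption): no gallery entries are filtered out.
    """
    dist = np.asarray(dist, dtype=float)
    query_ids = np.asarray(query_ids)
    gallery_ids = np.asarray(gallery_ids)

    order = np.argsort(dist, axis=1)            # nearest gallery items first
    ranked_ids = gallery_ids[order]             # shape (Q, G)
    matches = ranked_ids == query_ids[:, None]  # True where identity matches
    has_match = matches.any(axis=1)
    first_hit = matches.argmax(axis=1)          # 0-based rank of first match

    return np.array([
        np.mean(has_match & (first_hit < k)) for k in range(1, max_rank + 1)
    ])


if __name__ == "__main__":
    clip = np.zeros((8, 128, 64, 3))            # dummy clip: 8 frames of 128x64 RGB
    strips = split_horizontal_strips(clip)
    feet = strips[-1]                           # lowest strip, assumed feet region
    print(len(strips), feet.shape)

    dist = [[0.1, 0.5, 0.9, 0.7],
            [0.2, 0.6, 0.9, 0.8],
            [0.3, 0.2, 0.9, 0.1]]
    print(cumulative_match_score(dist, [0, 1, 2], [0, 1, 2, 3], max_rank=4))
```

In this toy example the three queries find their correct match at ranks 1, 2, and 4 respectively, so the score rises from one third at rank 1 to 1.0 at rank 4.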