To enable automatic recognition of surgical processes in brain tumor removal performed under a microscopic camera, we propose a method for detecting and tracking surgical tools by video analysis. The proposed method consists of a detection part and a tracking part. In the detection part, object detection is performed on each frame of the surgery video, and the category and bounding box of each tool are acquired frame by frame. The robustness of the convolutional layers is strengthened through data augmentation (central cropping and random erasing). The tracking part uses SORT, which predicts and updates each acquired bounding box with a Kalman filter; an object ID is then assigned to each corrected bounding box using the Hungarian algorithm. In experiments, the proposed method achieves a mean average precision of 90.58% for spatial detection and a mean accuracy of 96.58% for frame label detection. These results are promising for surgical phase recognition.
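The ID assignment step of the tracking part can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`iou`, `assign_ids`), the `(x1, y1, x2, y2)` box format, and the IoU threshold of 0.3 are assumptions made for illustration. The Kalman filter prediction is omitted, and an exhaustive search over permutations stands in for the Hungarian algorithm; it finds the same optimal one-to-one assignment but is only practical for the handful of tools visible in a frame.

```python
from itertools import permutations


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign_ids(tracks, detections, iou_threshold=0.3):
    """Match detections to existing tracks by maximizing total IoU.

    tracks: dict mapping track ID -> predicted box (hypothetical format).
    detections: list of boxes from the detector for the current frame.
    Returns a dict mapping detection index -> track ID for matched pairs.
    """
    track_ids = list(tracks)
    n_det, n_trk = len(detections), len(track_ids)

    # Enumerate every one-to-one pairing between the smaller set and
    # a subset of the larger set (brute-force stand-in for Hungarian).
    if n_det <= n_trk:
        candidates = (list(zip(range(n_det), p))
                      for p in permutations(range(n_trk), n_det))
    else:
        candidates = (list(zip(p, range(n_trk)))
                      for p in permutations(range(n_det), n_trk))

    best_total, best_pairs = -1.0, []
    for pairs in candidates:
        total = sum(iou(detections[d], tracks[track_ids[t]]) for d, t in pairs)
        if total > best_total:
            best_total, best_pairs = total, pairs

    # Discard pairs whose overlap is too small to be the same object.
    return {
        d: track_ids[t]
        for d, t in best_pairs
        if iou(detections[d], tracks[track_ids[t]]) >= iou_threshold
    }


if __name__ == "__main__":
    tracks = {1: (10, 10, 50, 50), 2: (100, 100, 140, 140)}
    detections = [(102, 101, 141, 142), (12, 11, 52, 49), (300, 300, 320, 320)]
    print(assign_ids(tracks, detections))
```

In this usage example, the third detection overlaps no existing track and is left unmatched; in a SORT-style tracker such a detection would typically start a new track with a fresh ID.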