3D human pose estimation plays an important role in various human-machine interaction applications, but efficiently exploiting the global and local structural features of human joints in deep-learning-based methods remains a challenge. In this paper, we propose a parallel network that fuses global and local structural joint features, motivated by an observed pattern in human poses. Specifically, we observe that human poses share similar global features and similar local features across actions. We therefore design separate global and local capture modules to extract these features and then fuse them. The proposed parallel global and local joint feature fusion network, named JointFusionNet, significantly improves on state-of-the-art models on both the intra-scenario H36M and cross-scenario 3DPW datasets, and yields appreciable improvements on poses with more similar local features. Notably, it achieves an overall improvement of 3.4 mm in MPJPE (a relative improvement of 6.8%) over the previous best feature-fusion-based method on the H36M dataset for 3D human pose estimation.
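For readers unfamiliar with the reported metric, MPJPE (mean per-joint position error) is commonly computed as the average Euclidean distance between predicted and ground-truth 3D joint positions. The following is a minimal sketch of that standard definition, not the evaluation code used for JointFusionNet. The function name `mpjpe`, the 17-joint layout, and the toy data are assumptions for illustration, and any alignment step applied before evaluation (such as root-joint centering) is omitted because the abstract does not specify the protocol.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error, in the units of the inputs (e.g. mm).

    pred, gt: arrays of shape (..., num_joints, 3) holding 3D joint positions.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    if pred.shape != gt.shape:
        raise ValueError("pred and gt must have the same shape")
    # Euclidean distance per joint, then the mean over all joints and frames.
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


# Toy data (hypothetical): 2 frames, 17 joints, every joint offset by (3, 4, 0) mm.
gt = np.zeros((2, 17, 3))
pred = gt.copy()
pred[..., 0] += 3.0
pred[..., 1] += 4.0

error = mpjpe(pred, gt)
print(f"MPJPE: {error:.1f} mm")
```

Because every joint in the toy example is displaced by the same 3-4-5 offset, the per-joint distance is 5 mm everywhere and the mean is also 5 mm. Lower values indicate better pose estimates.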