The COVID-19 pandemic has driven a rapid increase in video streaming traffic. Live streaming in particular accounts for a large share of this traffic because many events have moved online. As a result, the demand for a high-quality streaming experience is growing. Because network conditions fluctuate dynamically, video streaming must be controlled adaptively to provide a high quality of experience (QoE) for users. This paper proposes a method for controlling live video streaming using actor-critic reinforcement learning (RL). In this method, historical streaming logs such as throughput, buffer size, rebuffering time, and latency are taken as the RL state, and a model is trained to map the state to an action such as a bitrate decision. The method is evaluated in a live streaming simulation, since the model requires training and a simulator generates data much faster than real-world experiments. The results show that the total QoE is highest in the Bus and Car scenarios and lowest in the Tram scenario, where the available bandwidth is low.
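To make the state-to-action mapping concrete, the following is a minimal sketch of a one-step actor-critic update, not the model used in this paper. It assumes linear function approximators in place of whatever network the method actually uses, and the bitrate ladder, history length, log field names, learning rates, and reward are all hypothetical choices made for illustration.

```python
import math
import random

# Hypothetical bitrate ladder and history length; neither is taken from the paper.
BITRATES_KBPS = [500, 1200, 2400, 4800]
HISTORY = 4
FIELDS = ("throughput_mbps", "buffer_s", "rebuffer_s", "latency_s")  # assumed log fields


def build_state(log):
    """Flatten the last HISTORY log entries into one feature vector (zero-padded)."""
    state = []
    for entry in log[-HISTORY:]:
        state.extend(float(entry[name]) for name in FIELDS)
    padding = [0.0] * (HISTORY * len(FIELDS) - len(state))
    return padding + state


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class LinearActorCritic:
    """Softmax policy (actor) and state-value estimate (critic), both linear in the state."""

    def __init__(self, n_features, n_actions, lr_actor=0.01, lr_critic=0.05, gamma=0.99):
        self.theta = [[0.0] * n_features for _ in range(n_actions)]  # actor weights
        self.w = [0.0] * n_features                                   # critic weights
        self.lr_actor, self.lr_critic, self.gamma = lr_actor, lr_critic, gamma

    def policy(self, state):
        """Return the probability of choosing each bitrate index."""
        return softmax([sum(t * s for t, s in zip(row, state)) for row in self.theta])

    def value(self, state):
        """Return the critic's estimate of the expected return from this state."""
        return sum(w * s for w, s in zip(self.w, state))

    def act(self, state):
        """Sample a bitrate index from the current policy."""
        probs = self.policy(state)
        return random.choices(range(len(probs)), weights=probs)[0]

    def update(self, state, action, reward, next_state, done=False):
        """Apply one TD(0) actor-critic step and return the TD error."""
        bootstrap = 0.0 if done else self.gamma * self.value(next_state)
        td_error = reward + bootstrap - self.value(state)
        probs = self.policy(state)  # evaluated before any weights change
        # Critic: move the value estimate toward the TD target.
        for i, s in enumerate(state):
            self.w[i] += self.lr_critic * td_error * s
        # Actor: gradient of log pi(a|s) for row k is (1[k == a] - pi_k) * state.
        for k, row in enumerate(self.theta):
            coeff = (1.0 if k == action else 0.0) - probs[k]
            for i, s in enumerate(state):
                row[i] += self.lr_actor * td_error * coeff * s
        return td_error
```

In a training loop under these assumptions, a simulator would supply the log after each downloaded chunk, `act` would pick an index into `BITRATES_KBPS`, and `update` would be called with a per-chunk QoE value as the reward.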