To avoid manually collecting the large amount of labeled image data needed to train autonomous driving models, this paper proposes a novel automatic method for collecting annotated image data for autonomous driving through a translation network that transforms simulation CG images into real-world images. The translation network has an end-to-end structure consisting of two encoder-decoder networks. The front part of the translation network represents the structure of the original simulation CG image as a semantic segmentation. The rear part of the network then translates the segmentation into a real-world image by applying a cGAN. After training, the translation network has learned a mapping from simulation CG pixels to real-world image pixels. To confirm the validity of the proposed system, we conducted three experiments under different learning policies, evaluating the MSE of the steering angle and vehicle speed. The first experiment demonstrates that L1+cGAN performs best among the loss functions tested in the translation network. The second experiment, conducted under different learning policies, shows that the ResNet architecture works best. The third experiment demonstrates that a model trained with the real-world images generated by the translation network still performs well in the real world. Together, the experimental results demonstrate the validity of the proposed method.
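The L1+cGAN objective and the MSE evaluation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption: the non-saturating adversarial term, the weighting parameter `l1_weight` (set to 100.0, a value commonly used in image-to-image translation work), and all function names are hypothetical and are not taken from the paper.

```python
import math


def l1_loss(generated, target):
    """Mean absolute error between two equally sized flat lists of pixel values."""
    if len(generated) != len(target):
        raise ValueError("inputs must have the same length")
    return sum(abs(g - t) for g, t in zip(generated, target)) / len(target)


def adversarial_loss(disc_probs_on_fake, eps=1e-12):
    """Assumed non-saturating generator term: -mean(log D(x, G(x))).

    disc_probs_on_fake holds the discriminator's probabilities that the
    generated images are real; eps guards against log(0).
    """
    total = sum(math.log(max(p, eps)) for p in disc_probs_on_fake)
    return -total / len(disc_probs_on_fake)


def generator_loss(generated, target, disc_probs_on_fake, l1_weight=100.0):
    """Combined L1+cGAN generator loss; l1_weight is an assumed hyperparameter."""
    return adversarial_loss(disc_probs_on_fake) + l1_weight * l1_loss(generated, target)


def mse(predicted, actual):
    """Mean squared error, e.g. between predicted and recorded steering angles."""
    if len(predicted) != len(actual):
        raise ValueError("inputs must have the same length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


if __name__ == "__main__":
    fake = [0.0, 1.0]   # toy generated pixels
    real = [0.0, 0.5]   # toy target pixels
    print(generator_loss(fake, real, disc_probs_on_fake=[0.9]))
    print(mse([1.0, 2.0], [1.0, 4.0]))
```

In this sketch the L1 term pulls the generated image toward the target pixel by pixel, while the adversarial term rewards outputs the discriminator judges to be real; the same `mse` helper would apply to both steering angle and vehicle speed.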
|Journal||IS and T International Symposium on Electronic Imaging Science and Technology|
|Publication status||Published - 2021|
|Event||2021 3D Imaging and Applications, 3DIA 2021 - Virtual, Online, United States|
Duration: 2021-01-11 → 2021-01-28
ASJC Scopus subject areas
- Computer Graphics and Computer-Aided Design
- Computer Science Applications