Audio Translation with Conditional Generative Adversarial Networks

Ahmad Moussa, Hiroshi Watanabe

Research output: Conference contribution

Abstract

This paper explores the applicability of conditional generative adversarial networks in audio-to-audio translation problems and proposes a neural network architecture capable of doing so. Recent advances have shown that causal convolutions can be effective for modeling raw audio when their kernel is dilated by many factors, in contrast to previous techniques that utilized recurrent approaches. Embedding such convolutions within a conditional GAN architecture allows the targeted generation of raw audio given a certain input. This architecture can then be used to learn and simulate certain translative operations applied to an input signal. This creates the defined problem of converting one audio signal into another, which has different characteristics. We also propose a novel discriminator structure for the evaluation of generated audio.
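
The dilated causal convolutions mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture: the kernel size, the dilation schedule, and the function names `causal_dilated_conv1d` and `receptive_field` are assumptions chosen for illustration. The sketch shows only why stacking layers with growing dilation lets each output sample depend on a long window of past input while never looking at future samples.

```python
def causal_dilated_conv1d(x, kernel, dilation):
    """y[t] = sum_k kernel[k] * x[t - k * dilation]; x is treated as zero before t = 0."""
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation  # only current and past samples are read
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y


def receptive_field(kernel_size, dilations):
    """Number of input samples that can influence one output sample of the stack."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Hypothetical settings for illustration only: kernel size 2, dilation doubling per layer.
dilations = [1, 2, 4, 8]
kernel = [0.5, 0.5]

# Push a unit impulse through the stack to see how far one sample's influence reaches.
signal = [1.0] + [0.0] * 19
for d in dilations:
    signal = causal_dilated_conv1d(signal, kernel, d)

nonzero = [t for t, v in enumerate(signal) if v != 0.0]
print(receptive_field(len(kernel), dilations))
print(nonzero)
```

With these assumed settings, four layers already cover 16 consecutive samples, and the covered window doubles with each added layer, which is what makes this kind of convolution practical for raw audio at high sample rates.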

Original language: English
Title of host publication: 2020 International Conference on Artificial Intelligence in Information and Communication, ICAIIC 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 438-442
Number of pages: 5
ISBN (Electronic): 9781728149851
DOI
Publication status: Published - Feb 2020
Event: 2nd International Conference on Artificial Intelligence in Information and Communication, ICAIIC 2020 - Fukuoka, Japan
Duration: 19 Feb 2020 to 21 Feb 2020

Publication series

Name: 2020 International Conference on Artificial Intelligence in Information and Communication, ICAIIC 2020

Conference

Conference: 2nd International Conference on Artificial Intelligence in Information and Communication, ICAIIC 2020
Country/Territory: Japan
City: Fukuoka
Period: 19 Feb 2020 to 21 Feb 2020

ASJC Scopus subject areas

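  • Information Systems and Management
  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Information Systems
  • Signal Processing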
