Audio Translation with Conditional Generative Adversarial Networks

Ahmad Moussa, Hiroshi Watanabe

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper explores the applicability of conditional generative adversarial networks (GANs) to audio-to-audio translation problems and proposes a neural network architecture for this task. Recent advances have shown that causal convolutions can be effective for modeling raw audio when their kernels are dilated by multiple factors, in contrast to previous techniques that relied on recurrent approaches. Embedding such convolutions within a conditional GAN architecture allows the targeted generation of raw audio given a certain input. This architecture can then be used to learn and simulate translative operations applied to an input signal. The problem is thus defined as converting one audio signal into another that has different characteristics. We also propose a novel discriminator structure for the evaluation of generated audio.
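
As an illustration of the causal dilated convolutions mentioned in the abstract, the following is a minimal NumPy sketch. It is not the authors' architecture: the function names `causal_dilated_conv1d` and `receptive_field`, the kernel values, and the dilation schedule are all assumptions chosen for illustration. It shows only that each output sample depends on present and past input samples, and how stacking dilations widens the receptive field.

```python
import numpy as np


def causal_dilated_conv1d(x, kernel, dilation):
    """Hypothetical helper: y[t] = sum_k kernel[k] * x[t - k * dilation].

    Samples before t = 0 are treated as zero, so no output sample
    ever depends on a future input sample (causality).
    """
    x = np.asarray(x, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    pad = (len(kernel) - 1) * dilation
    # Left-pad with zeros so the output has the same length as the input.
    padded = np.concatenate([np.zeros(pad), x])
    y = np.zeros_like(x)
    for k, w in enumerate(kernel):
        # Tap k looks k * dilation samples into the past.
        start = pad - k * dilation
        y += w * padded[start:start + len(x)]
    return y


def receptive_field(kernel_size, dilations):
    """Receptive field, in samples, of a stack of dilated causal layers."""
    return 1 + (kernel_size - 1) * sum(dilations)


# An impulse passed through a 2-tap kernel with dilation 2 reappears
# two samples later, and never earlier than it occurred.
impulse = np.array([1.0, 0, 0, 0, 0, 0, 0, 0])
response = causal_dilated_conv1d(impulse, kernel=[1.0, 1.0], dilation=2)

# Assumed schedule: doubling the dilation per layer grows the
# receptive field exponentially with depth.
field = receptive_field(kernel_size=2, dilations=[1, 2, 4, 8])
```

In a conditional GAN setting, layers of this kind would sit inside the generator, which receives the input audio as its conditioning signal; that wiring is not shown here because the record does not describe it.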

Original language: English
Title of host publication: 2020 International Conference on Artificial Intelligence in Information and Communication, ICAIIC 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 438-442
Number of pages: 5
ISBN (Electronic): 9781728149851
DOIs
Publication status: Published - 2020 Feb
Event: 2nd International Conference on Artificial Intelligence in Information and Communication, ICAIIC 2020 - Fukuoka, Japan
Duration: 2020 Feb 19 to 2020 Feb 21

Publication series

Name: 2020 International Conference on Artificial Intelligence in Information and Communication, ICAIIC 2020

Conference

Conference: 2nd International Conference on Artificial Intelligence in Information and Communication, ICAIIC 2020
Country/Territory: Japan
City: Fukuoka
Period: 20/2/19 to 20/2/21

Keywords

  • audio effects
  • causal dilated convolutions
  • conditional generative adversarial networks
  • signal processing

ASJC Scopus subject areas

  • Information Systems and Management
  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Information Systems
  • Signal Processing
