Generating video from single image and sound

Yukitaka Tsuchiya, Takahiro Itazuri, Ryota Natsume, Shintaro Yamamoto, Takuya Kato, Shigeo Morishima

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

In this paper, we propose a method for generating a video linked to sound from a single image and a few seconds of audio, while maintaining the appearance of the image. Conventional sound-driven video generation methods require extracting key points related to the sound for each object, such as the mouth in speech or the arms in musical instrument performance, and they cannot be applied to objects whose shape changes significantly, such as fireworks. The proposed method generates a video without extracting specific key points from images. We experimented not only with the mouth shapes and human body poses handled by conventional methods, but also with fireworks and sea waves, for which key points are difficult to design.

Original language: English
Title of host publication: Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2019
Publisher: IEEE Computer Society
Pages: 17-20
Number of pages: 4
ISBN (Electronic): 9781728125060
Publication status: Published - 2019 Jun
Event: 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2019 - Long Beach, United States
Duration: 2019 Jun 16 - 2019 Jun 20

Publication series

Name: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
Volume: 2019-June
ISSN (Print): 2160-7508
ISSN (Electronic): 2160-7516

Conference

Conference: 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2019
Country/Territory: United States
City: Long Beach
Period: 19/6/16 - 19/6/20

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering
