Citation:
Yu Haitao, Yang Xiaoshan, Xu Changsheng. Antagonistic Video Generation Method Based on Multimodal Input[J]. Journal of Computer Research and Development, 2020, 57(7): 1522-1530. DOI: 10.7544/issn1000-1239.2020.20190479
1(School of Computer and Information, Hefei University of Technology, Hefei 230031)
2(National Laboratory of Pattern Recognition (Institute of Automation, Chinese Academy of Sciences), Beijing 100190)
Funds: This work was supported by the National Key Research and Development Program of China (2018AAA0100604), the National Natural Science Foundation of China (61702511, 61720106006, 61728210, 61751211, U1836220, U1705262, 61872424), and the Research Program of National Laboratory of Pattern Recognition (Z-2018007).
Video generation is an important and challenging task in the fields of computer vision and multimedia. Existing video generation methods based on generative adversarial networks (GANs) usually lack an effective scheme to control the coherence of the generated video. Realizing artificial intelligence algorithms that can automatically generate realistic video is an important indicator of a more complete understanding of visual appearance and motion information. A new multi-modal conditional video generation model is proposed in this paper. The model takes pictures and text as input, obtains the motion information of the video through a text feature encoding network and a motion feature decoding network, and generates video with coherent motion by combining this motion information with the input image. In addition, the method predicts video frames by applying affine transformations to the input image, which makes the generation model more controllable and the generated results more robust. Experimental results on the SBMG (single-digit bouncing MNIST gifs), TBMG (two-digit bouncing MNIST gifs), and KTH (Kungliga Tekniska Högskolan human actions) datasets show that the proposed method performs better than existing methods in both target clarity and video coherence. In addition, qualitative evaluation and quantitative evaluation on the SSIM (structural similarity index) and PSNR (peak signal-to-noise ratio) metrics demonstrate that the proposed multi-modal video frame generation network plays a key role in the generation process.
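To make the frame-prediction idea concrete, the following is a minimal PyTorch sketch of how text-conditioned affine transformations of an input image could produce a sequence of frames: a text encoder yields a motion code, a motion decoder emits one 2x3 affine matrix per frame, and each frame is obtained by warping the input image with that matrix. All module names, layer choices (embedding plus GRU encoder/decoder), and dimensions are illustrative assumptions for exposition, not the architecture published in the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TextToAffineVideo(nn.Module):
    # Hypothetical sketch: text -> motion code -> per-frame affine warp of the input image.
    def __init__(self, vocab_size=1000, embed_dim=64, hidden_dim=128, num_frames=16):
        super().__init__()
        self.num_frames = num_frames
        # Text feature encoding network (assumed: embedding + GRU).
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.text_encoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        # Motion feature decoding network (assumed: GRU emitting one 2x3 affine matrix per frame).
        self.motion_decoder = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        self.to_affine = nn.Linear(hidden_dim, 6)

    def forward(self, image, tokens):
        # image: (B, C, H, W) input picture; tokens: (B, T) text description as token ids.
        _, h = self.text_encoder(self.embed(tokens))           # h: (1, B, hidden)
        motion_in = h.transpose(0, 1).repeat(1, self.num_frames, 1)
        motion, _ = self.motion_decoder(motion_in)             # (B, F, hidden)
        theta = self.to_affine(motion).view(-1, 2, 3)          # (B*F, 2, 3) affine parameters

        B, C, H, W = image.shape
        frames = image.unsqueeze(1).expand(B, self.num_frames, C, H, W)
        frames = frames.reshape(-1, C, H, W)                   # (B*F, C, H, W)
        # Predict each frame as an affine transformation of the input image.
        grid = F.affine_grid(theta, frames.shape, align_corners=False)
        warped = F.grid_sample(frames, grid, align_corners=False)
        return warped.view(B, self.num_frames, C, H, W)

Because every output frame is a warp of the real input image, the appearance of the target is preserved by construction and only the motion has to be learned, which is one way to read the abstract's claim that the affine formulation makes generation more controllable and more robust.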