Proceedings of the 2021 International conference on Smart Technologies and Systems for Internet of Things (STS-IOT 2021)

Text Generation Image about Gan

Authors
Sijia Ye*
Chongqing Normal University, Chongqing, China
*Corresponding author. Email: Ysjyesijia@163.com
Available Online 2 June 2022.
DOI
10.2991/ahis.k.220601.023
Keywords
GAN; Text Generation Research; Image Research; GAN's Text-Generated Image
Abstract

In recent years, text-to-image generation has been an important research topic at the intersection of computer vision and natural language processing. The task takes a descriptive text as input and outputs an image whose content is consistent with that text. Training a generative adversarial network (GAN) for this task is unstable, with problems such as vanishing gradients and mode collapse, which can lead to generated results that are semantically inconsistent with the text or that lack diversity. Building on previous research, this paper proposes a GAN-based text-to-image generation algorithm that improves both the stability of network training and the clarity of the generated images, making them more realistic. The paper studies GAN-based text-to-image generation, discussing and analyzing the GAN network structure for this task and the text-to-image generation process. Using the objective function of GAN-INT-CLS for text-to-image generation, the IS and VS scores of GAN-INT-CLS, GAWWN, StackGAN, StackGAN++, and HDGAN are compared. The results show that the IS scores on the two datasets are greatly improved over GAN-INT-CLS, GAWWN, StackGAN, and StackGAN++. Compared with HDGAN, the IS score on the Oxford-102 dataset increases from 3.45 to 3.52, the IS score on the CUB dataset increases from 4.18 to 4.33, and the VS score on the CUB dataset reaches 0.355, indicating that the HiR-GAN model is effective and generates higher-quality images. HiR-GAN produces better images than previous network models and matches the text descriptions more closely. In the quantitative evaluation, HiR-GAN achieves the best IS and VS scores, and the visualization experiments also show that the images generated by HiR-GAN are the best and most consistent with the semantic information of the text.
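The conditioning idea the abstract refers to (as in GAN-INT-CLS) is to compress a sentence embedding and feed it to the generator alongside the noise vector. The following is a minimal, hypothetical PyTorch sketch of that mechanism only; the embedding size, layer widths, and 64x64 output resolution are illustrative assumptions, not the architecture evaluated in the paper.

```python
# Minimal sketch of text-conditioned image generation in the GAN-INT-CLS style.
# All dimensions below are assumptions for illustration, not the paper's settings.
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    """Maps noise concatenated with a compressed text embedding to a 64x64 RGB image."""
    def __init__(self, noise_dim=100, text_dim=1024, proj_dim=128):
        super().__init__()
        # Project the sentence embedding to a low-dimensional conditioning vector.
        self.text_proj = nn.Sequential(nn.Linear(text_dim, proj_dim), nn.LeakyReLU(0.2))
        self.net = nn.Sequential(
            nn.ConvTranspose2d(noise_dim + proj_dim, 512, 4, 1, 0), nn.BatchNorm2d(512), nn.ReLU(True),
            nn.ConvTranspose2d(512, 256, 4, 2, 1), nn.BatchNorm2d(256), nn.ReLU(True),
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(True),
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(True),
            nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh(),  # 64x64 RGB output in [-1, 1]
        )

    def forward(self, noise, text_embedding):
        cond = self.text_proj(text_embedding)                             # (B, proj_dim)
        z = torch.cat([noise, cond], dim=1).unsqueeze(-1).unsqueeze(-1)   # (B, noise+proj, 1, 1)
        return self.net(z)

# Usage: one batch of fake images from random noise and placeholder text embeddings.
g = ConditionalGenerator()
fake = g(torch.randn(4, 100), torch.randn(4, 1024))
print(fake.shape)  # torch.Size([4, 3, 64, 64])
```

In practice the text embedding comes from a pretrained sentence encoder rather than random noise, and the discriminator receives the same compressed embedding so that it can penalize image-text mismatch as well as unrealistic images.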

Copyright
© 2022 The Authors. Published by Atlantis Press International B.V.
Open Access
This is an open access article distributed under the CC BY-NC 4.0 license.


Volume Title
Proceedings of the 2021 International conference on Smart Technologies and Systems for Internet of Things (STS-IOT 2021)
Series
Atlantis Highlights in Intelligent Systems
Publication Date
2 June 2022
ISSN
2589-4919
DOI
10.2991/ahis.k.220601.023

Cite this article

TY  - CONF
AU  - Sijia Ye
PY  - 2022
DA  - 2022/06/02
TI  - Text Generation Image about Gan
BT  - Proceedings of the 2021 International conference on Smart Technologies and Systems for Internet of Things (STS-IOT 2021)
PB  - Atlantis Press
SP  - 119
EP  - 123
SN  - 2589-4919
UR  - https://doi.org/10.2991/ahis.k.220601.023
DO  - 10.2991/ahis.k.220601.023
ID  - Ye2022
ER  -