Text Generation Image about Gan
- DOI
- 10.2991/ahis.k.220601.023
- Keywords
- GAN; Text Generation Research; Image Research; GAN’s Text Generation Image
- Abstract
In recent years, text-to-image generation has become an important research topic at the intersection of computer vision and natural language processing. The task takes a descriptive text as input and outputs an image whose content is consistent with that text. Training a generative adversarial network (GAN) is unstable, with problems such as vanishing gradients and mode collapse, and this instability can leave the generated result inconsistent with the text semantics or lacking in diversity. Building on previous research, this paper proposes a GAN-based text-to-image generation algorithm that improves both the stability of network training and the clarity of the image, making the generated image more realistic. The paper studies GAN-based text-to-image generation and discusses and analyzes the network structure and the generation process. Drawing on the GAN-INT-CLS text-to-image algorithm, it tests the Inception Score (IS) and visual-semantic similarity (VS) scores of GAN-INT-CLS, GAWWN, StackGAN, StackGAN++ and HDGAN. The results show that the IS scores on both datasets are greatly improved compared with the four methods GAN-INT-CLS, GAWWN, StackGAN and StackGAN++. Compared with HDGAN, the IS score increases from 3.45 to 3.52 on the Oxford-102 dataset and from 4.18 to 4.33 on the CUB dataset, and the VS score on the CUB dataset reaches 0.355, indicating that the HIR-GAN model performs well and generates higher-quality images. The HIR-GAN model produces better images than previous network models and matches the text description more closely. In the quantitative evaluation, HIR-GAN achieves the best IS and VS scores.
The visualization experiment also shows that the images generated by HIR-GAN are the best and the most consistent with the semantic information in the text.
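The abstract reports IS values without saying how they are computed. As a rough illustration only, the sketch below follows the standard definition of the Inception Score, IS = exp(E_x[KL(p(y|x) || p(y))]). It is not taken from the paper: the function name `inception_score` is hypothetical, and the class probabilities, which in practice would come from a pretrained Inception classifier applied to generated images, are assumed to be passed in as plain lists.

```python
import math


def inception_score(probs):
    """Standard Inception Score from per-image class probabilities.

    probs: list of lists; probs[i][j] is p(y=j | x_i) for generated image x_i,
    and each inner list is assumed to sum to 1. (Hypothetical helper, not the
    paper's code.)
    """
    n = len(probs)
    k = len(probs[0])
    # Marginal class distribution p(y), averaged over all generated images.
    marginal = [sum(p[j] for p in probs) / n for j in range(k)]
    # Mean KL divergence KL(p(y|x) || p(y)) over the images.
    # Terms with p_j == 0 contribute 0; if p_j > 0 then marginal[j] > 0 too.
    mean_kl = 0.0
    for p in probs:
        kl = sum(pj * math.log(pj / mj) for pj, mj in zip(p, marginal) if pj > 0)
        mean_kl += kl / n
    return math.exp(mean_kl)
```

Under this definition, identical predictions for every image give a score of 1, while confident predictions spread evenly over k classes approach k, so a higher IS reflects images that are both recognizable and diverse.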
- Copyright
- © 2022 The Authors. Published by Atlantis Press International B.V.
- Open Access
- This is an open access article distributed under the CC BY-NC 4.0 license.
Cite this article
TY - CONF
AU - Sijia Ye
PY - 2022
DA - 2022/06/02
TI - Text Generation Image about Gan
BT - Proceedings of the 2021 International conference on Smart Technologies and Systems for Internet of Things (STS-IOT 2021)
PB - Atlantis Press
SP - 119
EP - 123
SN - 2589-4919
UR - https://doi.org/10.2991/ahis.k.220601.023
DO - 10.2991/ahis.k.220601.023
ID - Ye2022
ER -