The Investigation of DeiT model Based on PaddlePaddle Framework on CIFAR-10 Dataset Image Classification

Yuda Li

doi:10.2991/978-94-6463-300-9_106

Authors

Yuda Li¹,*

¹Computer Science and Technology, Beijing Jiaotong University (Weihai Campus), Shandong, 264401, China

*Corresponding author. Email: 20722089@bjtu.edu.cn

Corresponding Author

Yuda Li

Available Online 27 November 2023.

DOI: 10.2991/978-94-6463-300-9_106
Keywords: DeiT; CIFAR-10; PaddlePaddle
Abstract: Image classification is one of the central tasks in computer vision, and deep learning models have brought major breakthroughs to it. The Transformer, a powerful sequence modeling tool, has achieved great success in natural language processing, and its recent application to image classification has also produced significant results. The Data-efficient Image Transformer (DeiT) is one representative model; it targets efficient image classification on small datasets by combining the self-attention mechanism with the Transformer architecture. The core idea of DeiT is to use self-attention to establish the global context of the input image. Traditional convolutional neural networks capture image features through local receptive fields and a hierarchical structure. DeiT instead splits the image into small patches and feeds them into the Transformer as a sequence, so that self-attention can relate every patch to every other patch and capture more global image features. In the experiment, this study used the DeiT model provided in the PaddlePaddle 2.0 framework to perform image classification on the CIFAR-10 dataset. CIFAR-10 contains 60,000 32×32 color images in 10 categories, 50,000 for training and 10,000 for testing. The model was trained with a stochastic gradient descent (SGD) optimizer and a cross-entropy loss function. To improve the model's generalization ability, this study also used data augmentation techniques such as random cropping, flipping, and rotation.
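The patch-splitting step described in the abstract can be illustrated with a minimal sketch. It is not the paper's code: the function name `split_into_patches` and the patch size of 4 are assumptions chosen for illustration, and the image is a dummy nested list standing in for one 32×32 CIFAR-10 image. A real pipeline would normally do this with tensor reshapes inside the framework.

```python
def split_into_patches(image, patch_size):
    """Split an H x W x C image (nested lists) into flattened, non-overlapping patches.

    Returns the patches in row-major order; each patch is a flat list of
    patch_size * patch_size * C values, i.e. one token of the input sequence.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten the pixels of one patch, row by row, channel by channel.
            patch = [
                value
                for row in image[top:top + patch_size]
                for pixel in row[left:left + patch_size]
                for value in pixel
            ]
            patches.append(patch)
    return patches


# Dummy CIFAR-10-sized image: 32 x 32 pixels, 3 color channels.
image = [[[0.0, 0.0, 0.0] for _ in range(32)] for _ in range(32)]
patches = split_into_patches(image, patch_size=4)  # patch_size=4 is an assumed value
print(len(patches), len(patches[0]))  # 64 48
```

With these assumed settings, one image becomes a sequence of 64 patches of 48 values each; self-attention then operates over that sequence, which is how each patch can attend to every other patch.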
Copyright: © 2023 The Author(s)
Open Access: Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title: Proceedings of the 2023 International Conference on Image, Algorithms and Artificial Intelligence (ICIAAI 2023)
Series: Advances in Computer Science Research
Publication Date: 27 November 2023
ISBN: 978-94-6463-300-9
ISSN: 2352-538X

Cite this article

TY  - CONF
AU  - Yuda Li
PY  - 2023
DA  - 2023/11/27
TI  - The Investigation of DeiT model Based on PaddlePaddle Framework on CIFAR-10 Dataset Image Classification
BT  - Proceedings of the 2023 International Conference on Image, Algorithms and Artificial Intelligence (ICIAAI 2023)
PB  - Atlantis Press
SP  - 1062
EP  - 1067
SN  - 2352-538X
UR  - https://doi.org/10.2991/978-94-6463-300-9_106
DO  - 10.2991/978-94-6463-300-9_106
ID  - Li2023
ER  -