The Investigation of DeiT model Based on PaddlePaddle Framework on CIFAR-10 Dataset Image Classification
- DOI
- 10.2991/978-94-6463-300-9_106
- Keywords
- DeiT; CIFAR-10; PaddlePaddle
- Abstract
Image classification is one of the central tasks in computer vision, and the development of deep learning models has brought major breakthroughs to it. The Transformer, a powerful sequence modeling tool, has achieved great success in natural language processing, and its recent application to image classification has also produced significant results. The Data-efficient image Transformer (DeiT) is one representative model; it aims at efficient image classification on small datasets by combining the self-attention mechanism with the Transformer architecture. The core idea of DeiT is to use self-attention to establish the global context of the input image. Traditional convolutional neural networks capture image features through local receptive fields and a hierarchical structure. DeiT, by contrast, splits the image into small patches and feeds them to the Transformer as a sequence. Self-attention then relates every patch to every other patch, which lets the model capture more global image features. In the experiment, this study used the DeiT model provided in the PaddlePaddle 2.0 framework to perform image classification on the CIFAR-10 dataset. The CIFAR-10 dataset contains 60,000 32x32 color images in 10 categories, 50,000 for training and 10,000 for testing. The model was trained with a stochastic gradient descent (SGD) optimizer and a cross-entropy loss function. To improve the generalization ability of the model, this study also used data augmentation techniques such as random cropping, flipping, and rotation.
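The patch-sequence idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the helper name `patchify` and the patch size of 4 are assumptions chosen for illustration (the abstract does not state the patch size), and plain nested lists stand in for framework tensors.

```python
def patchify(image, patch_size):
    """Split an H x W x C image (nested lists) into a row-major
    sequence of flattened patches, one list of values per patch."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch_size x patch_size x C block into a vector.
            patch = [
                value
                for row in image[top:top + patch_size]
                for pixel in row[left:left + patch_size]
                for value in pixel
            ]
            patches.append(patch)
    return patches


# A dummy 32x32 RGB image, matching the CIFAR-10 image size.
image = [[[0.0, 0.0, 0.0] for _ in range(32)] for _ in range(32)]

# patch_size=4 is an assumed value, not taken from the paper.
patches = patchify(image, patch_size=4)
num_patches = len(patches)    # (32 / 4) ** 2 patches in the sequence
patch_dim = len(patches[0])   # 4 * 4 * 3 values per patch
```

Under these assumptions each image becomes a sequence of 64 patch vectors of length 48. In a real model each vector would then be linearly projected to an embedding before self-attention is applied across the sequence.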
- Copyright
- © 2023 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY - CONF
AU - Yuda Li
PY - 2023
DA - 2023/11/27
TI - The Investigation of DeiT model Based on PaddlePaddle Framework on CIFAR-10 Dataset Image Classification
BT - Proceedings of the 2023 International Conference on Image, Algorithms and Artificial Intelligence (ICIAAI 2023)
PB - Atlantis Press
SP - 1062
EP - 1067
SN - 2352-538X
UR - https://doi.org/10.2991/978-94-6463-300-9_106
DO - 10.2991/978-94-6463-300-9_106
ID - Li2023
ER -