Proceedings of the 2023 International Conference on Image, Algorithms and Artificial Intelligence (ICIAAI 2023)

An Investigation into Hyperparameter Adjustment and Learning Rate Optimization Algorithm Utilizing Normal Distribution and Greedy Heuristics in Parallel Training

Authors
Weixuan Qiao1, *
1Software Engineering, Harbin University of Science and Technology, Weihai, Shandong, 264300, China
*Corresponding author. Email: 2030090319@stu.hrbust.edu.cn
Corresponding Author
Weixuan Qiao
Available Online 27 November 2023.
DOI
10.2991/978-94-6463-300-9_16
Keywords
Data Parallel; Deep Learning; LRSA
Abstract

In the era of large models, traditional training methods can no longer meet the massive demands for computing power and data. Distributed training can alleviate this problem to some extent; however, it also increases the difficulty and complexity of hyperparameter tuning. Dedicated distributed hyperparameter tuning algorithms and strategies are therefore needed to find the best hyperparameter combination more quickly. This paper proposes the Learning Rate Search Algorithm (LRSA) to quickly determine the initial value of the learning rate, which makes hyperparameter tuning more efficient. This work analyzes why multi-GPU data parallelism slows down at small batch sizes: most of the resources are spent on inter-GPU communication and related overhead, so the effective utilization of the GPUs drops and the training time increases. Furthermore, this paper explores the reasons for the loss of accuracy at large batch sizes and uses LRSA to mitigate it effectively. This article also proposes an empirical rule for determining the lower bound of the batch size. Experimental results on several deep learning models indicate that LRSA can improve training efficiency and accuracy. For example, LRSA was applied to VGG16 to find a suitable learning rate so that the model reaches the same accuracy as a smaller batch size while training faster.
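
The abstract names the two ingredients of LRSA, sampling from a normal distribution and a greedy heuristic, but gives no further algorithmic detail here. The Python sketch below is only an illustration of how such a learning rate search could look; the log-space sampling, the evaluate_lr proxy, and all constants are assumptions made for the example, not the authors' method.

# Illustrative sketch only: the sampling scheme, evaluation proxy, and
# hyperparameters below are assumptions, not the LRSA of the paper.
import numpy as np

def evaluate_lr(lr: float) -> float:
    """Placeholder for a short training run returning a validation loss.
    In practice this would train the model for a few steps at learning rate lr."""
    # Hypothetical smooth proxy whose minimum lies near lr = 0.01.
    return (np.log10(lr) + 2.0) ** 2

def lrsa_sketch(lr_init: float = 0.1,
                rounds: int = 5,
                samples_per_round: int = 8,
                sigma: float = 0.5,
                seed: int = 0) -> float:
    """Greedy search over learning rates sampled from a normal distribution
    (in log10 space) around the current best candidate."""
    rng = np.random.default_rng(seed)
    best_lr, best_loss = lr_init, evaluate_lr(lr_init)
    for _ in range(rounds):
        # Sample candidates around the current best; log space keeps them positive.
        candidates = 10 ** rng.normal(np.log10(best_lr), sigma, samples_per_round)
        for lr in candidates:
            loss = evaluate_lr(lr)
            if loss < best_loss:   # greedy step: keep only improvements
                best_lr, best_loss = lr, loss
        sigma *= 0.5               # narrow the search each round
    return best_lr

if __name__ == "__main__":
    print(f"Selected initial learning rate: {lrsa_sketch():.4g}")

With the proxy loss above, the search converges toward a learning rate near 0.01; in a real setting evaluate_lr would be replaced by short training runs at the target batch size.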

Copyright
© 2023 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceedings of the 2023 International Conference on Image, Algorithms and Artificial Intelligence (ICIAAI 2023)
Series
Advances in Computer Science Research
Publication Date
27 November 2023
ISBN
978-94-6463-300-9
ISSN
2352-538X
DOI
10.2991/978-94-6463-300-9_16

Cite this article

TY  - CONF
AU  - Weixuan Qiao
PY  - 2023
DA  - 2023/11/27
TI  - An Investigation into Hyperparameter Adjustment and Learning Rate Optimization Algorithm Utilizing Normal Distribution and Greedy Heuristics in Parallel Training
BT  - Proceedings of the 2023 International Conference on Image, Algorithms and Artificial Intelligence (ICIAAI 2023)
PB  - Atlantis Press
SP  - 153
EP  - 163
SN  - 2352-538X
UR  - https://doi.org/10.2991/978-94-6463-300-9_16
DO  - 10.2991/978-94-6463-300-9_16
ID  - Qiao2023
ER  -