Proceedings of the 2024 2nd International Conference on Image, Algorithms and Artificial Intelligence (ICIAAI 2024)

Session: Computer Vision, NLP, and AI Innovations

49 articles
Proceedings Article

Comparison of Adversarial Robustness of Convolutional Neural Networks for Handwritten Digit Recognition

Zhen Ren
Machine learning has found widespread application in contemporary society, yet it remains vulnerable to the corrosive effects of adversarial samples. These refer to input data that has been deliberately modified in a certain way to mislead machine learning models. While these modifications may be undetectable...
Proceedings Article

Research on speech recognition and its application in language disorders

Kangbo Wei
Since ancient times, language has been a fundamental medium for human communication and the expression of thoughts. The advancement of speech recognition technology has significantly enhanced the efficiency of generating, transmitting, storing, and acquiring speech information, thereby facilitating more...
Proceedings Article

Enhancing Image Segmentation for ICH through Transfer Learning from Stroke MRI to ICH CT

Lianghan Dong
A serious brain condition with a high death rate, intracranial hemorrhage (ICH) requires a precise and timely diagnosis. While Computed Tomography (CT) is commonly used for its speed and accessibility, its diagnostic accuracy is limited compared to Magnetic Resonance Imaging (MRI). However, the latter...
Proceedings Article

The Investigation and Discussion Related to Recommendation Systems in Video Social Platforms

Rongxuan Zhang
With the increasing popularity of video social platforms, recommendation systems play a crucial role on these platforms. They can recommend content of interest to users based on their interests and preferences, greatly improving their content browsing experience. This article provides an in-depth analysis...
Proceedings Article

Enhancing IoT Security Through Trusted Execution Environments

Zheng Zhang
As the Internet of Things (IoT) proliferates, securing these interconnected devices has become a critical concern. Trusted Execution Environments (TEEs) offer a crucial mechanism for bolstering IoT device security. This paper delves deeply into the application of TEEs within the IoT ecosystem to protect...
Proceedings Article

Deep Learning Applications in Stroke Segmentation: Progress, Challenges, and Future Prospects

Xingyi Rong
Stroke is a major global health challenge, significantly contributing to disability and death worldwide. Due to the rapid progress of deep learning, the challenges in this field have the potential to be solved. This article offers a comprehensive examination of the uses of deep learning in stroke segmentation....
Proceedings Article

SV-UNet: Attention-based Fully Convolutional Network with Transfer Learning for Multimodal Infarct Segmentation

Han Xu
Ischemic stroke has a devastating impact on global health, causing both death and disability. Automatic, accurate segmentation of these stroke areas, or infarctions, from Magnetic Resonance Imaging (MRI) can aid clinicians in personalized therapeutic strategies. Recent advances in merging fully convolutional...
Proceedings Article

An Empirical Study on the Effect of Face Occupancy on the Generalization Performance of CNN Models

Jialin Tian
This empirical study investigated the impact of face occupancy on the generalization performance of Convolutional Neural Networks (CNNs), specifically focusing on three widely-used architectures: ResNet50, VGG16, and MobileNetV2. The face occupancy ratio, defined as the proportion of the image occupied...
Proceedings Article

Multiple Optimized Deep Learning Models for Effective Facial Expression

Ruoyu Li
Facial expression recognition is an essential domain within computer vision, focused on interpreting human emotions through facial cues for enhanced human-computer interaction. This study examines the current state and challenges in facial expression recognition, emphasizing the role of deep learning...
Proceedings Article

The Strategy of Generalization Ability Improvement for Brain Tumor Classification Based on CNNs Model

Yuze Hou
Brain tumors are a serious disease that affects many people. Traditional methods of tumor detection are time-consuming and subjective. Many studies have demonstrated that Convolutional Neural Networks (CNNs) can classify brain tumors with high accuracy, but they did not focus on the generalization of the...
Proceedings Article

Hyperparameter Optimization for Improving BERT-Based Irony Sentence Recognition

Renjian Hou
Irony is a figure of speech in which the words are employed with an intended meaning that differs from their literal meaning. The ability to recognize and interpret ironic sentences can prevent misunderstandings in conversations and enhance effective communication. With the continuous improvement of...
Proceedings Article

A Comparative Analysis of White Box and Gray Box Adversarial Attacks to Natural Language Processing Systems

Hua Feng, Shangyi Li, Haoyuan Shi, Zhixun Ye
This article comprehensively describes natural language processing (NLP) and its relationship to adversarial attacks. As an interdisciplinary field involving computer science, artificial intelligence, and linguistics, NLP has great potential to transform all walks of life. Deep learning, as the main...
Proceedings Article

An Improved Convolutional Neural Network-Based Spam Recognition Model

Jinyuan Liu
Spam is one of the significant threats to cyber security by not only sending unwanted messages but also by potentially carrying viruses. Conventional spam detection methods, such as keyword matching and rule-based filtering, are less effective since spammers could advance their method to bypass these...
Proceedings Article

The Development and Analysis of 3D Feature Reconstruction Technology for Service Robot SLAM System in Restaurant Environment

Zibo Zheng
Indoor mobile robots are now widely used in restaurants for delivery services to improve delivery efficiency and reduce labor costs. Simultaneous visual localization and mapping (SLAM) and path planning are the basis for restaurant service robots to navigate and deliver food. Therefore, it is useful...
Proceedings Article

Effectiveness Evaluation of Black-Box Data Poisoning Attack on Machine Learning Models

Junjing Zhan, Zhongxing Zhang, Ke Zhou
As machine learning has come into wide use in face recognition, natural speech processing, automatic driving and medical systems, attacks against machine learning have followed, which may bring serious safety risks to biometric certification systems or automobiles. Incorrect classification of...
Proceedings Article

A Study of Sentence Similarity Based on the All-minilm-l6-v2 Model With “Same Semantics, Different Structure” After Fine Tuning

Chen Yin, Zixuan Zhang
Traditional natural language processing models often find it difficult to distinguish between sentences with “similar structure and different semantics” and sentences with “different structure and similar semantics”. Based on the all-MiniLM-L6-v2 and Bidirectional Encoder Representations from Transformers...
Proceedings Article

Analysis of Emoticon based on BERT model

Pengfei Dai, Chenhao Kong, Boxiang Zeng
With the widespread adoption of digital communication platforms, emojis have become an integral part of conveying subtle emotions and expressions within written content. This paper delves into the application of BERT and its foundational Transformer technology in processing texts enriched with emojis,...
Proceedings Article

A Study on Employment Problems and Sentiment Analysis of College Students Based on Bert-BiLSTM

Zihan Chen
In recent years, Chinese college students have generally faced social problems such as fierce competition for employment and rising youth unemployment. Sentiment analysis of college students’ employment attitudes helps them recognize the situation, accurately position themselves, and rationally arrange...
Proceedings Article

Image Stitching based on Feature Detection and Extraction: An Analysis

Nan Zhao
Image stitching is a popular research area in the fields of computer vision and computer graphics. The feature points of images provide crucial information for this process. The accurate extraction of these features is essential to minimize misalignment and defects in the final stitched image. This paper...
Proceedings Article

Improved Facial Mask-Based Adversarial Attack for Deep Face Recognition Models

Haoran Wang
This paper explores the enhancement of security and robustness in the field of facial recognition by investigating adversarial example attacks. The author not only introduces an advanced adversarial example generation technique by utilizing key facial landmarks, but also investigates universal mask-based...
Proceedings Article

Advancements in Deep Learning-Based Approaches for Enhancing Accuracy in Traffic Sign Recognition

Dazhi Qin, Junxiang Tang, Sicheng Yu
With the increasing complexity and diversity of traffic environments, accurate identification of traffic signs becomes a necessary aspect for the development of assisted driving and autonomous driving technologies. Traffic sign recognition approaches exploiting deep learning have demonstrated significant...
Proceedings Article

Research on Different Feature Matching Algorithms for Panoramic Image Stitching

Zhao Zhang
Panoramic image stitching technology has penetrated into every field of modern life. As an important part of the stitching process, image feature matching directly affects the quality and speed of the stitching. In this paper, photos taken in daily life are used for experiments, and the precision and...
Proceedings Article

Research on Cultural Relic Restoration and Digital Presentation Based on 3D Reconstruction MVS Algorithm: A Case Study of Mogao Grottoes’ Cave 285

Mengyao Gao
This paper delves into the realm of three-dimensional (3D) reconstruction technology, specifically examining the principles underlying Multi-View Stereo (MVS) techniques, encompassing pose calculation, dense reconstruction, surface reconstruction, and texture mapping. It scrutinizes the application of...
Proceedings Article

Image Stitching Quality Evaluation and Improvement Based on SIFT Features and RANSAC Algorithm

Jinsong Shen
Due to factors such as perspective and lighting, traditional stitching algorithms find it difficult to achieve high-quality stitching results. Therefore, how to effectively improve the image stitching effect has become a hot research topic. The traditional image stitching...
Proceedings Article

Innovative Fusion of Transformer Models with SIFT for Superior Panorama Stitching

Zheng Xiang
In the field of image stitching, generating multiple panoramas from a large set of images is a challenging task. Traditional methods often require complex pairwise comparisons, leading to time-consuming operations that may affect accuracy and efficiency. To address this issue, this paper presents an...
Proceedings Article

Comparison and Application of Implementing Image Homographs in Computer Vision

Xingqi Qiu
In the field of computer vision, planar homography plays a pivotal role in our research process. The homography matrix is capable of performing a variety of functions such as image warping, stitching, and video stitching. Within the realm of epipolar geometry, it enables the execution of numerous tasks,...
Proceedings Article

Improvement and Analysis of Panoramic Image Mosaic Technology Based on Mixed Scene

Yuhua Pei
Given how quickly augmented reality (AR) and virtual reality (VR) technologies are developing, panoramic image stitching technology is playing an increasingly important role in providing immersive experiences. Especially in complex scenes where natural and urban environments are interwoven, high-quality...
Proceedings Article

Application of Computer Vision and Machine Learning to Recognition of Rice Leaf Diseases

Pengshao Ye
As global population growth poses an increasing challenge to agriculture, the importance of crop pest management has increased. At present, most pest problems are solved by traditional manual methods, which are becoming increasingly inefficient in the face of increasing production capacity, so automated...
Proceedings Article

3D Reconstruction of Monocular Images based on ResNeXt Neural Network

Yu Zhang
With the rapid advancements in computer vision and image processing technologies, three-dimensional (3D) reconstruction from a single image has emerged as a significant area of research within the field of computer vision. However, due to the inherent lack of depth information in single images, 3D reconstruction...
Proceedings Article

Addressing Sentiment Classification in Short Text Comments Using BERT and LSTM

He Li
The prevalence of short text comments in the comment sections of social media platforms accelerates the rate of information dissemination. The diversity and unpredictability of comment content can affect the sentiments of viewers and their judgment of topics and interfere with social media platforms’...
Proceedings Article

Enhancing Emotion Recognition in Text Data Based on Bi-LSTM and Attention Approach

Zhuojun Lyu
Emotion recognition stands as a cornerstone across various domains, propelling the evolution of artificial intelligence. This paper introduces a pioneering approach to emotion recognition, employing a Bi-directional Long Short-Term Memory (Bi-LSTM) neural network fused with an attention mechanism (Att)....
Proceedings Article

Deep Learning-Based Pedestrian Detection and Analysis with YOLOv5

Xuchen Cui
Fueled by the swift advancements in artificial intelligence, computer vision technology has found extensive applications across various domains. This article will focus on how to use the You Only Look Once version 5 (YOLOv5) to enhance the accuracy and efficiency of pedestrian detection. It begins by...
Proceedings Article

Deep Convolutional Generative Adversarial Networks (DCGAN)-Based Anime Face Generation

Xunxiong Ou
This study delves into the realm of anime face generation with the aim of empowering individuals to create their own anime characters and easing the burden on artists. Employing Deep Convolutional Generative Adversarial Networks (DCGAN), the research focuses on generating anime face images. The DCGAN...
Proceedings Article

Enhancing Water Body Detection in Satellite Imagery Using U-Net Models

Jiongyi Li
Precise and efficient detection of water bodies in satellite pictures is essential for diverse applications, like environmental surveillance, urban development, and disaster response. This study investigates the effectiveness of utilizing the U-shaped network (U-Net) models with input shapes of 128x128...
Proceedings Article

The Influence of Multiple Loss Functions on MRI Stroke Lesion Area Segmentation

Ruihui Cao
The study addressed the problem of imbalanced Magnetic Resonance Imaging (MRI) datasets by choosing different loss functions to achieve a higher stroke lesion area segmentation accuracy. It is helpful for doctors to treat patients efficiently by segmenting the stroke areas quickly with the machine learning...
Proceedings Article

The Influence of Parameter Optimization of VGGNet on Model Performance in Terms of Classification Layers

Yizhen He
This paper aims to explore the effect of parameter adjustment in classification layers of VGGNet. It provides suitable amounts of parameters for VGGNet with FC and FCN layers, which are available for reference. In the research, FER13 dataset, which contains gray-scaled images with shape of 48 x 48 with...
Proceedings Article

Accurate Segmentation of Ischemic Stroke Lesion Areas Based on Pre-trained UNets

Zhewen Guo
Due to the mortality and disabilities caused by ischemic stroke, it is of great significance to provide accurate segmentation during the treatment of ischemic stroke. In this study, pre-trained UNets were utilized to save computational resources and provide accurate prediction of lesion area caused...
Proceedings Article

Detection of Negative Emotions and Depression in Social Networks Based on Bert-LSTM Model

Chao Shen, Zhihao Zhao
Due to the surge of depression among netizens in China’s online society, the problem of social depression has become serious. The purpose of this paper is to detect and flag negative emotions on the Internet through natural language processing technology. In this paper, the Bidirectional Encoder Representations...
Proceedings Article

Fine-tuning Technologies for Reducing the FER Bias Across Various Distributions

Zhisong Liu
Lacking sufficient data has become a serious problem in the field of Facial Expression Recognition (FER), since the cost of collecting a large amount of facial expression images is huge and training a new FER model from the beginning is time-consuming. In this paper, the author trained a FER model based...
Proceedings Article

CDAE-R: Multifunctional End-to-End Model for Brain Abnormality Images Classification and Denoising

Zezhou Wang
Traditionally, medical image classification and denoising tasks are conducted and evaluated separately, which may waste computational resources and incur excessive expenses. Besides, the features extracted by different models cannot be shared and utilized effectively. Therefore, an end-to-end multimodal...
Proceedings Article

The Role of AI in Revolutionizing the Gaming Industry: A Focus on DLSS and Large Language Models

Haozhe Zhou
Artificial Intelligence (AI) has become a driver of innovation in a rapidly evolving technological landscape across a wide range of industries, and the gaming industry is at the forefront of these advances. The aim of this paper is to explore the wide range of applications and potential uses of AI in...
Proceedings Article

Research for Improving the Accuracy of Image Classification Based on Semi-Supervision

Ziyang Gu, Lihang Wang, Yueqian Zhang
One of the core tasks of computer vision is image classification, which aims to distinguish different types of images based on various features. However, traditional image classification methods often rely on a large amount of labeled data to support them and obtaining large-scale, high-quality labeled...
Proceedings Article

A Comprehensive Research of the Development of Classical Convolutional Neural Networks

Changli Tao
Since 2010, with the rapid emergence of deep learning, Convolutional Neural Networks (CNNs) have made significant progress across various domains. In particular, advancements in CNNs have profoundly impacted the field of computer vision, resulting in substantial improvements in tasks such as image classification,...
Proceedings Article

Research for Enhancing Processing and Computational Efficiency in LLM

Yu Cong
In the context of current technological development, large language models (LLMs) have become a core component of artificial intelligence. This report provides an in-depth discussion of various advanced strategies and techniques to improve the processing and computational efficiency of LLMs. First, the...
Proceedings Article

Research of Improved DETR Models and Transformer Applications in Computer Vision

Ruoyu Li
Researchers in the domain of computer vision have increasingly turned their attention towards harnessing the power of Transformer models for visual tasks. This paradigm shift has led to the emergence of pioneering models such as Detection Transformer (DETR) and Vision Transformer (ViT), opening up new...
Proceedings Article

Optimization in Facial Expression Recognition Based on CNN Combined with SE Modules

Xuanyu Zhang
Facial expression recognition has emerged as a pivotal aspect of human-computer interaction and psychological research, drawing extensive attention in computer vision. The essay aims to improve the facial expression recognition performance of Convolutional Neural Networks (CNN) under different imaging...
Proceedings Article

Enhancing Emotion Detection Through CNN-Based Facial Expression Recognition

Jinyang Wang
Artificial intelligence-based approaches, such as Convolutional Neural Networks (CNN), hold significant promise for emotion detection, particularly in facial expression recognition, offering invaluable insights for various sectors including business, medicine, and psychology. This paper explores the...
Proceedings Article

Improvements in GPipe Pipeline Parallel Acceleration: Choices, Constraints and Optimal Strategies of Micro-Batch

Riqian Hu
As the scale of deep learning models continues to grow, large-scale models in machine vision and natural language processing (NLP) have achieved tremendous success. For instance, the current NLP giant GPT-3 has pushed the parameter count to the scale of billions. However, due to the significant surpassing...
Proceedings Article

Layer-wise Interpretability Investigation of Facial Expression Recognition Models Based on Grad-CAM

Siyuan Yao
For a long time, artificial intelligence has faced the challenge of interpretability, with the black-box problem persistently troubling researchers. Although there have been studies using Gradient-weighted Class Activation Map (Grad-CAM) for interpretability in the field of facial expression recognition,...