Proceedings of the 2024 2nd International Conference on Image, Algorithms and Artificial Intelligence (ICIAAI 2024)
Session: Computer Vision, NLP, and AI Innovations
49 articles
Proceedings Article
Comparison of Adversarial Robustness of Convolutional Neural Networks for Handwritten Digit Recognition
Zhen Ren
Machine learning has found widespread application in contemporary society, yet it remains vulnerable to the corrosive effects of adversarial samples. These refer to input data that has been deliberately modified in a certain way to mislead machine learning models. While these modifications may be undetectable...
Proceedings Article
Research on speech recognition and its application in language disorders
Kangbo Wei
Since ancient times, language has been a fundamental medium for human communication and the expression of thoughts. The advancement of speech recognition technology has significantly enhanced the efficiency of generating, transmitting, storing, and acquiring speech information, thereby facilitating more...
Proceedings Article
Enhancing Image Segmentation for ICH through Transfer Learning from Stroke MRI to ICH CT
Lianghan Dong
A serious brain condition with a high death rate, intracranial hemorrhage (ICH) requires a precise and timely diagnosis. While Computed Tomography (CT) is commonly used for its speed and accessibility, its diagnostic accuracy is limited compared to Magnetic Resonance Imaging (MRI). However, the latter...
Proceedings Article
The Investigation and Discussion Related to Recommendation Systems in Video Social Platforms
Rongxuan Zhang
With the increasing popularity of video social platforms, recommendation systems play a crucial role on these platforms. They can recommend content of interest to users based on their interests and preferences, greatly improving their content browsing experience. This article provides an in-depth analysis...
Proceedings Article
Enhancing IoT Security Through Trusted Execution Environments
Zheng Zhang
As the Internet of Things (IoT) proliferates, securing these interconnected devices has become a critical concern. Trusted Execution Environments (TEEs) offer a crucial mechanism for bolstering IoT device security. This paper delves deeply into the application of TEEs within the IoT ecosystem to protect...
Proceedings Article
Deep Learning Applications in Stroke Segmentation: Progress, Challenges, and Future Prospects
Xingyi Rong
Stroke is a major global health challenge, significantly contributing to disability and death worldwide. Due to the rapid progress of deep learning, the challenges in this field have the potential to be solved. This article offers a comprehensive examination of the uses of deep learning in stroke segmentation....
Proceedings Article
SV-UNet: Attention-based Fully Convolutional Network with Transfer Learning for Multimodal Infarct Segmentation
Han Xu
Ischemic stroke has a devastating impact on global health, causing both death and disability. Automatic, accurate segmentation of these stroke areas, or infarctions, from Magnetic Resonance Imaging (MRI) can aid clinicians in personalized therapeutic strategies. Recent advances in merging fully convolutional...
Proceedings Article
An Empirical Study on the Effect of Face Occupancy on the Generalization Performance of CNN Models
Jialin Tian
This empirical study investigated the impact of face occupancy on the generalization performance of Convolutional Neural Networks (CNNs), specifically focusing on three widely-used architectures: ResNet50, VGG16, and MobileNetV2. The face occupancy ratio, defined as the proportion of the image occupied...
Proceedings Article
Multiple Optimized Deep Learning Models for Effective Facial Expression
Ruoyu Li
Facial expression recognition is an essential domain within computer vision, focused on interpreting human emotions through facial cues for enhanced human-computer interaction. This study examines the current state and challenges in facial expression recognition, emphasizing the role of deep learning...
Proceedings Article
The Strategy of Generalization Ability Improvement for Brain Tumor Classification Based on CNNs Model
Yuze Hou
Brain tumors are a serious disease that affects many people. Traditional methods of tumor detection are time-consuming and subjective. Many studies have demonstrated that Convolutional Neural Networks (CNNs) can classify brain tumors with high accuracy, but they did not focus on the generalization of the...
Proceedings Article
Hyperparameter Optimization for Improving BERT-Based Irony Sentence Recognition
Renjian Hou
Irony is a figure of speech in which the words are employed with an intended meaning that differs from their literal meaning. The ability to recognize and interpret ironic sentences can prevent misunderstandings in conversations and enhance effective communication. With the continuous improvement of...
Proceedings Article
A Comparative Analysis of White Box and Gray Box Adversarial Attacks to Natural Language Processing Systems
Hua Feng, Shangyi Li, Haoyuan Shi, Zhixun Ye
This article comprehensively describes natural language processing (NLP) and its relationship to adversarial attacks. As an interdisciplinary field involving computer science, artificial intelligence, and linguistics, NLP has great potential to transform all walks of life. Deep learning, as the main...
Proceedings Article
An Improved Convolutional Neural Network-Based Spam Recognition Model
Jinyuan Liu
Spam is a significant threat to cyber security, not only delivering unwanted messages but also potentially carrying viruses. Conventional spam detection methods, such as keyword matching and rule-based filtering, are less effective since spammers can adapt their methods to bypass these...
Proceedings Article
The Development and Analysis of 3D Feature Reconstruction Technology for Service Robot SLAM System in Restaurant Environment
Zibo Zheng
Indoor mobile robots are now widely used in restaurants for delivery services to improve delivery efficiency and reduce labor costs. Simultaneous visual localization and mapping (SLAM) and path planning are the basis for restaurant service robots to navigate and deliver food. Therefore, it is useful...
Proceedings Article
Effectiveness Evaluation of Black-Box Data Poisoning Attack on Machine Learning Models
Junjing Zhan, Zhongxing Zhang, Ke Zhou
With machine learning now widely used in face recognition, natural speech processing, automatic driving and medical systems, attacks against machine learning have followed, which may bring serious safety risks to biometric certification systems or automobiles. Incorrect classification of...
Proceedings Article
A Study of Sentence Similarity Based on the All-minilm-l6-v2 Model With “Same Semantics, Different Structure” After Fine Tuning
Chen Yin, Zixuan Zhang
Traditional natural language processing models often find it difficult to distinguish between sentences with “similar structure and different semantics” and sentences with “different structure and similar semantics”. Based on the all-MiniLM-L6-v2 and Bidirectional Encoder Representations from Transformers...
Proceedings Article
Analysis of Emoticon based on BERT model
Pengfei Dai, Chenhao Kong, Boxiang Zeng
With the widespread adoption of digital communication platforms, emojis have become an integral part of conveying subtle emotions and expressions within written content. This paper delves into the application of BERT and its foundational Transformer technology in processing texts enriched with emojis,...
Proceedings Article
A Study on Employment Problems and Sentiment Analysis of College Students Based on Bert-BiLSTM
Zihan Chen
In recent years, Chinese college students have generally faced social problems such as fierce competition for employment and rising youth unemployment. Sentiment analysis of college students’ employment attitudes helps them recognize the situation, accurately position themselves, and rationally arrange...
Proceedings Article
Image Stitching based on Feature Detection and Extraction: An Analysis
Nan Zhao
Image stitching is a popular research area in the fields of computer vision and computer graphics. The feature points of images provide crucial information for this process. The accurate extraction of these features is essential to minimize misalignment and defects in the final stitched image. This paper...
Proceedings Article
Improved Facial Mask-Based Adversarial Attack for Deep Face Recognition Models
Haoran Wang
This paper explores the enhancement of security and robustness in the field of facial recognition by investigating adversarial example attacks. The author not only introduces an advanced adversarial example generation technique by utilizing key facial landmarks, but also investigates universal mask-based...
Proceedings Article
Advancements in Deep Learning-Based Approaches for Enhancing Accuracy in Traffic Sign Recognition
Dazhi Qin, Junxiang Tang, Sicheng Yu
With the increasing complexity and diversity of traffic environments, accurate identification of traffic signs becomes a necessary aspect for the development of assisted driving and autonomous driving technologies. Traffic sign recognition approaches exploiting deep learning have demonstrated significant...
Proceedings Article
Research on Different Feature Matching Algorithms for Panoramic Image Stitching
Zhao Zhang
Panoramic image stitching technology has penetrated into every field of modern life. As an important part of the stitching process, image feature matching directly affects the quality and speed of the stitching. In this paper, photos taken in daily life are used for experiments, and the precision and...
Proceedings Article
Research on Cultural Relic Restoration and Digital Presentation Based on 3D Reconstruction MVS Algorithm: A Case Study of Mogao Grottoes’ Cave 285
Mengyao Gao
This paper delves into the realm of three-dimensional (3D) reconstruction technology, specifically examining the principles underlying Multi-View Stereo (MVS) techniques, encompassing pose calculation, dense reconstruction, surface reconstruction, and texture mapping. It scrutinizes the application of...
Proceedings Article
Image Stitching Quality Evaluation and Improvement Based on SIFT Features and RANSAC Algorithm
Jinsong Shen
Due to factors such as perspective and lighting, traditional stitching algorithms find it difficult to achieve high-quality stitching results. Therefore, how to effectively improve the image stitching effect has become a hot research topic. The traditional image stitching...
Proceedings Article
Innovative Fusion of Transformer Models with SIFT for Superior Panorama Stitching
Zheng Xiang
In the field of image stitching, generating multiple panoramas from a large set of images is a challenging task. Traditional methods often require complex pairwise comparisons, leading to time-consuming operations that may affect accuracy and efficiency. To address this issue, this paper presents an...
Proceedings Article
Comparison and Application of Implementing Image Homographs in Computer Vision
Xingqi Qiu
In the field of computer vision, planar homography plays a pivotal role in our research process. The homography matrix is capable of performing a variety of functions such as image warping, stitching, and video stitching. Within the realm of epipolar-geometry, it enables the execution of numerous tasks,...
Proceedings Article
Improvement and Analysis of Panoramic Image Mosaic Technology Based on Mixed Scene
Yuhua Pei
Given how quickly augmented reality (AR) and virtual reality (VR) technologies are developing, panoramic image stitching technology is playing an increasingly important role in providing immersive experiences. Especially in complex scenes where natural and urban environments are interwoven, high-quality...
Proceedings Article
Application of Computer Vision and Machine Learning to Recognition of Rice Leaf Diseases
Pengshao Ye
As global population growth poses an increasing challenge to agriculture, the importance of crop pest management has increased. At present, most pest problems are solved by traditional manual methods, which are becoming increasingly inefficient in the face of increasing production capacity, so automated...
Proceedings Article
3D Reconstruction of Monocular Images based on ResNeXt Neural Network
Yu Zhang
With the rapid advancements in computer vision and image processing technologies, three-dimensional (3D) reconstruction from a single image has emerged as a significant area of research within the field of computer vision. However, due to the inherent lack of depth information in single images, 3D reconstruction...
Proceedings Article
Addressing Sentiment Classification in Short Text Comments Using BERT and LSTM
He Li
The prevalence of short text comments in the comment sections of social media platforms accelerates the rate of information dissemination. The diversity and unpredictability of comment content can affect the sentiments of viewers and their judgment of topics and interfere with social media platforms’...
Proceedings Article
Enhancing Emotion Recognition in Text Data Based on Bi-LSTM and Attention Approach
Zhuojun Lyu
Emotion recognition stands as a cornerstone across various domains, propelling the evolution of artificial intelligence. This paper introduces a pioneering approach to emotion recognition, employing a Bi-directional Long Short-Term Memory (Bi-LSTM) neural network fused with an attention mechanism (Att)....
Proceedings Article
Deep Learning-Based Pedestrian Detection and Analysis with YOLOv5
Xuchen Cui
Fueled by the swift advancements in artificial intelligence, computer vision technology has found extensive applications across various domains. This article will focus on how to use the You Only Look Once version 5 (YOLOv5) to enhance the accuracy and efficiency of pedestrian detection. It begins by...
Proceedings Article
Deep Convolutional Generative Adversarial Networks (DCGAN)-Based Anime Face Generation
Xunxiong Ou
This study delves into the realm of anime face generation with the aim of empowering individuals to create their own anime characters and easing the burden on artists. Employing Deep Convolutional Generative Adversarial Networks (DCGAN), the research focuses on generating anime face images. The DCGAN...
Proceedings Article
Enhancing Water Body Detection in Satellite Imagery Using U-Net Models
Jiongyi Li
Precise and efficient detection of water bodies in satellite pictures is essential for diverse applications, like environmental surveillance, urban development, and disaster response. This study investigates the effectiveness of utilizing the U-shaped network (U-Net) models with input shapes of 128x128...
Proceedings Article
The Influence of Multiple Loss Functions on MRI Stroke Lesion Area Segmentation
Ruihui Cao
The study solved the problem of an imbalanced Magnetic Resonance Imaging (MRI) dataset by choosing different loss functions to achieve higher stroke lesion area segmentation accuracy. It is helpful for doctors to treat patients efficiently by segmenting the stroke areas quickly with the machine learning...
Proceedings Article
The Influence of Parameter Optimization of VGGNet on Model Performance in Terms of Classification Layers
Yizhen He
This paper aims to explore the effect of parameter adjustment in the classification layers of VGGNet. It provides suitable parameter counts for VGGNet with FC and FCN layers, which are available for reference. In the research, the FER13 dataset, which contains grayscale images of shape 48 x 48 with...
Proceedings Article
Accurate Segmentation of Ischemic Stroke Lesion Areas Based on Pre-trained UNets
Zhewen Guo
Due to the mortality and disabilities caused by ischemic stroke, it is of great significance to provide accurate segmentation during the treatment of ischemic stroke. In this study, pre-trained UNets were utilized to save computational resources and provide accurate prediction of the lesion area caused...
Proceedings Article
Detection of Negative Emotions and Depression in Social Networks Based on Bert-LSTM Model
Chao Shen, Zhihao Zhao
Due to the surge of depression among netizens in China’s online society, the problem of social depression has become serious. The purpose of this paper is to detect and flag negative emotions on the Internet through natural language processing technology. In this paper, the Bidirectional Encoder Representations...
Proceedings Article
Fine-tuning Technologies for Reducing the FER Bias Across Various Distributions
Zhisong Liu
A lack of sufficient data has become a serious problem in the field of Facial Expression Recognition (FER), since the cost of collecting a large number of facial expression images is huge and training a new FER model from scratch is time-consuming. In this paper, the author trained an FER model based...
Proceedings Article
CDAE-R: Multifunctional End-to-End Model for Brain Abnormality Images Classification and Denoising
Zezhou Wang
Traditionally, medical image classification and denoising tasks are conducted and evaluated separately, which may waste computational resources and incur excessive expenses. Besides, the features extracted by different models cannot be shared and utilized effectively. Therefore, an end-to-end multimodal...
Proceedings Article
The Role of AI in Revolutionizing the Gaming Industry: A Focus on DLSS and Large Language Models
Haozhe Zhou
Artificial Intelligence (AI) has become a driver of innovation in a rapidly evolving technological landscape across a wide range of industries, and the gaming industry is at the forefront of these advances. The aim of this paper is to explore the wide range of applications and potential uses of AI in...
Proceedings Article
Research for Improving the Accuracy of Image Classification Based on Semi-Supervision
Ziyang Gu, Lihang Wang, Yueqian Zhang
One of the core tasks of computer vision is image classification, which aims to distinguish different types of images based on various features. However, traditional image classification methods often rely on a large amount of labeled data to support them and obtaining large-scale, high-quality labeled...
Proceedings Article
A Comprehensive Research of the Development of Classical Convolutional Neural Networks
Changli Tao
Since 2010, with the rapid emergence of deep learning, Convolutional Neural Networks (CNNs) have made significant progress across various domains. In particular, advancements in CNNs have profoundly impacted the field of computer vision, resulting in substantial improvements in tasks such as image classification,...
Proceedings Article
Research for Enhancing Processing and Computational Efficiency in LLM
Yu Cong
In the context of current technological development, large language models (LLMs) have become a core component of artificial intelligence. This report provides an in-depth discussion of various advanced strategies and techniques to improve the processing and computational efficiency of LLMs. First, the...
Proceedings Article
Research of Improved DETR Models and Transformer Applications in Computer Vision
Ruoyu Li
Researchers in the domain of computer vision have increasingly turned their attention towards harnessing the power of Transformer models for visual tasks. This paradigm shift has led to the emergence of pioneering models such as Detection Transformer (DETR) and Vision Transformer (ViT), opening up new...
Proceedings Article
Optimization in Facial Expression Recognition Based on CNN Combined with SE Modules
Xuanyu Zhang
Facial expression recognition has emerged as a pivotal aspect of human-computer interaction and psychological research, drawing extensive attention in computer vision. The essay aims to improve the facial expression recognition performance of Convolutional Neural Networks (CNN) under different imaging...
Proceedings Article
Enhancing Emotion Detection Through CNN-Based Facial Expression Recognition
Jinyang Wang
Artificial intelligence-based approaches, such as Convolutional Neural Networks (CNN), hold significant promise for emotion detection, particularly in facial expression recognition, offering invaluable insights for various sectors including business, medicine, and psychology. This paper explores the...
Proceedings Article
Improvements in GPipe Pipeline Parallel Acceleration: Choices, Constraints and Optimal Strategies of Micro-Batch
Riqian Hu
As the scale of deep learning models continues to grow, large-scale models in machine vision and natural language processing (NLP) have achieved tremendous success. For instance, the current NLP giant GPT-3 has pushed the parameter count to the scale of billions. However, due to the significant surpassing...
Proceedings Article
Layer-wise Interpretability Investigation of Facial Expression Recognition Models Based on Grad-CAM
Siyuan Yao
For a long time, artificial intelligence has faced the challenge of interpretability, with the black-box problem persistently troubling researchers. Although there have been studies using Gradient-weighted Class Activation Map (Grad-CAM) for interpretability in the field of facial expression recognition,...