Proceedings of the International Workshop on Advances in Deep Learning for Image Analysis and Computer Vision (IWADIC 2025)

Evaluating Sparse and Transformer-based Representations for Chinese Weibo Sentiment Analysis Across Data Scales and Noise Conditions

Authors
Yuying Zhao 1,*
1 Institute of Education, University College London, London, WC1E 6BT, UK
*Corresponding author. Email: Yuying.zhao.21@ucl.ac.uk
Corresponding Author
Yuying Zhao
Available Online 24 April 2026.
DOI
10.2991/978-94-6239-648-7_99
Keywords
Natural Language Processing; Sentiment Analysis; Chinese Weibo; TF-IDF; BERT
Abstract

This study presents a systematic comparison between traditional sparse representations and contextualised Transformer models for Chinese Weibo sentiment classification. Using a publicly available dataset of 10,500 annotated microblog posts, the analysis examines four key dimensions: text representation, data scale, noise robustness, and fine-tuning strategy. A character-level TF-IDF + Logistic Regression baseline is evaluated alongside a pretrained BERT model under controlled experimental conditions. Results show that BERT substantially outperforms TF-IDF when trained on the full dataset, achieving higher accuracy and macro-F1 through its ability to capture contextual and semantic nuances in noisy social-media text. However, in low-resource settings with only 1,000 training samples, TF-IDF remains competitive, narrowing the performance gap and demonstrating strong efficiency under data scarcity. Noise-robustness experiments further reveal that BERT maintains stable or improved performance under mild perturbations, while TF-IDF exhibits gradual degradation. Fine-tuning analysis confirms that full parameter updates are essential for BERT’s effectiveness, as freezing the encoder leads to significant performance declines. Overall, the findings provide a reproducible benchmark and practical guidance for selecting sentiment-analysis models under varying resource constraints, highlighting trade-offs between expressive power, robustness, and computational cost.

Copyright
© 2026 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceedings of the International Workshop on Advances in Deep Learning for Image Analysis and Computer Vision (IWADIC 2025)
Series
Advances in Computer Science Research
Publication Date
24 April 2026
ISBN
978-94-6239-648-7
ISSN
2352-538X

Cite this article

TY  - CONF
AU  - Yuying Zhao
PY  - 2026
DA  - 2026/04/24
TI  - Evaluating Sparse and Transformer-based Representations for Chinese Weibo Sentiment Analysis Across Data Scales and Noise Conditions
BT  - Proceedings of the International Workshop on Advances in Deep Learning for Image Analysis and Computer Vision (IWADIC 2025)
PB  - Atlantis Press
SP  - 923
EP  - 933
SN  - 2352-538X
UR  - https://doi.org/10.2991/978-94-6239-648-7_99
DO  - 10.2991/978-94-6239-648-7_99
ID  - Zhao2026
ER  -