Proceedings of the International Conference on Sustainable Computing and Artificial Intelligence (ICSCAI 2025)

Image-based News Aggregator Using OCR and NLP for Summarization

Authors
P. Santosh Reddy¹, S. S. Sanjan¹, Spandhana K. Devadiga¹,*, Vinati Thakkar¹
¹Dept. of Computer Science and Engineering, BNMIT Institute of Technology, Bangalore, Karnataka, India
*Corresponding author. Email: spandhana050604@gmail.com
Available Online 28 May 2026.
DOI
10.2991/978-94-6239-674-6_40
Keywords
OCR; NLP; Summarization; Image Processing; News Aggregator; Text Extraction; Tesseract; Transformers
Abstract

The growth of digital information increases the demand for extracting insight from large volumes of data in as little time as possible. Users struggle to stay up to date because online news content keeps growing while their time is limited. Therefore, this solution combines text extraction from newspaper images using OCR (Tesseract), real-time news retrieval through NewsAPI.org, and abstractive summarization. This approach condenses articles into a compact, easily understandable form. Summaries can be converted to audio using TTS tools such as gTTS or pyttsx3 to increase accessibility. Combined, these technologies deliver news that is faster to consume, personalized, and easily digestible, without requiring users to read full articles.
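
The pipeline outlined in the abstract (OCR, then summarization, then optional audio) might be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the function names (`ocr_image`, `clean_ocr_text`, `summarize_text`, `speak_to_file`, `image_to_summary`), the default summarization model, and the output file name are all assumptions introduced here. It further assumes that `pytesseract`, `Pillow`, `transformers`, and `gTTS` are installed, and that the installed `transformers` release provides the `"summarization"` pipeline task.

```python
import re
from typing import Callable, Optional


def ocr_image(image_path: str) -> str:
    """Extract raw text from a newspaper image with Tesseract."""
    # Imported lazily: both packages are third-party, and pytesseract
    # also needs the Tesseract binary installed on the system.
    from PIL import Image
    import pytesseract

    return pytesseract.image_to_string(Image.open(image_path))


def clean_ocr_text(raw: str) -> str:
    """Rejoin words split by a hyphen at a line break, collapse whitespace.

    Limitation: a genuine compound broken at a line end ("well-" / "known")
    is also joined, so this is a heuristic rather than a full fix.
    """
    joined = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", raw)
    return " ".join(joined.split())


def summarize_text(text: str) -> str:
    """Abstractive summary using the library's default summarization model."""
    # Assumption: the installed transformers version ships this task.
    from transformers import pipeline

    summarizer = pipeline("summarization")
    return summarizer(text, max_length=130, min_length=30)[0]["summary_text"]


def speak_to_file(summary: str, out_path: str) -> None:
    """Write the summary as an MP3 file using gTTS (needs network access)."""
    from gtts import gTTS

    gTTS(text=summary, lang="en").save(out_path)


def image_to_summary(
    image_path: str,
    ocr: Callable[[str], str] = ocr_image,
    summarizer: Callable[[str], str] = summarize_text,
    tts: Optional[Callable[[str, str], None]] = None,
    audio_path: str = "summary.mp3",
) -> str:
    """Run OCR -> cleanup -> summarization, with optional text-to-speech."""
    text = clean_ocr_text(ocr(image_path))
    if not text:
        return ""  # nothing recognisable in the image
    summary = summarizer(text)
    if tts is not None:
        tts(summary, audio_path)
    return summary
```

Each stage is passed in as a plain callable, so a different OCR engine, summarization model, or an offline TTS tool such as pyttsx3 could be substituted without changing `image_to_summary`. A hypothetical call would be `image_to_summary("front_page.png", tts=speak_to_file)`, where `front_page.png` stands in for any scanned newspaper image.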

Copyright
© 2026 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceedings of the International Conference on Sustainable Computing and Artificial Intelligence (ICSCAI 2025)
Series
Advances in Engineering Research
Publication Date
28 May 2026
ISBN
978-94-6239-674-6
ISSN
2352-5401

Cite this article

TY  - CONF
AU  - P. Santosh Reddy
AU  - S. S. Sanjan
AU  - Spandhana K. Devadiga
AU  - Vinati Thakkar
PY  - 2026
DA  - 2026/05/28
TI  - Image-based News Aggregator Using OCR and NLP for Summarization
BT  - Proceedings of the International Conference on Sustainable Computing and Artificial Intelligence (ICSCAI 2025)
PB  - Atlantis Press
SP  - 489
EP  - 498
SN  - 2352-5401
UR  - https://doi.org/10.2991/978-94-6239-674-6_40
DO  - 10.2991/978-94-6239-674-6_40
ID  - Reddy2026
ER  -