MEDFUSION: A Multimodal Medical Diagnosis using Symptoms and Images

T. D. Venkatesh; R. Krishna Priya; R. S. Vignesh

doi:10.2991/978-94-6239-713-2_5


Authors

T. D. Venkatesh¹,*, R. Krishna Priya¹, R. S. Vignesh¹

¹Hindustan University, Chennai, India

*Corresponding author. Email: venky070403@gmail.com

Corresponding Author

T. D. Venkatesh

Available Online 25 June 2026.

DOI: 10.2991/978-94-6239-713-2_5
Keywords: Multimodal fusion; medical diagnosis; symptom analysis; medical imaging; deep learning; MobileNetV2
Abstract: Multimodal artificial intelligence has emerged as an effective approach to medical diagnostics: by integrating heterogeneous data sources such as clinical symptoms and medical images, it addresses the limitations of unimodal diagnostic systems [10], [15], [12]. This article presents MEDFUSION, a modular multimodal medical diagnostic framework that combines structured symptom-based machine learning with deep learning-based medical image classification through confidence-weighted late fusion [2], [15]. The symptom-based component uses an ensemble of Random Forest, Naive Bayes, and Logistic Regression models [3], trained on a benchmark dataset of 4,920 samples in which 132 binary symptoms are mapped to 41 diseases. The image analysis component employs an optimized MobileNetV2 architecture [11], trained and tested on 17 publicly available medical image datasets covering X-ray [9], CT [16], MRI, ultrasound, OCT [14], fundus, dermoscopy [12], endoscopy, and otoscopy images; each dataset was processed individually using standardized image preprocessing methods [4], and the component reaches an average classification accuracy of 96%. Decision-level fusion combines the probabilistic outputs of both modalities [15] into an overall diagnosis with increased robustness. In experimental evaluation, the multimodal framework achieves an accuracy of up to 98% on selected benchmark datasets and outperforms the individual modalities on the paired multimodal test dataset, indicating the effectiveness of integrating symptom-based prediction with image-based deep learning models [1], [5]. The framework is implemented as an interactive interface using the Streamlit library, supporting symptom input, image upload, and display of confidence values. The results show the potential of multimodal AI systems for decision support, increasing robustness and interpretability [17], [8].
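The confidence-weighted late fusion named in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how confidence is measured or how the weights are derived, so the sketch makes two assumptions: each modality's confidence is taken to be its top-class probability, and both models are assumed to report probabilities over a shared set of disease labels. The function name `confidence_weighted_fusion` and the example labels and probabilities are hypothetical and are not taken from the paper.

```python
from typing import Dict


def confidence_weighted_fusion(
    symptom_probs: Dict[str, float],
    image_probs: Dict[str, float],
) -> Dict[str, float]:
    """Fuse two class-probability dicts at decision level.

    Assumption (not from the paper): a modality's confidence is its
    highest class probability, and the two confidences are normalised
    so that the fusion weights sum to 1.
    """
    if not symptom_probs or not image_probs:
        raise ValueError("both modalities must provide at least one class")

    conf_symptom = max(symptom_probs.values())
    conf_image = max(image_probs.values())
    total_conf = conf_symptom + conf_image
    if total_conf <= 0:
        raise ValueError("at least one modality must have non-zero confidence")

    w_symptom = conf_symptom / total_conf
    w_image = conf_image / total_conf

    # A label missing from one modality contributes probability 0 there.
    labels = set(symptom_probs) | set(image_probs)
    fused = {
        label: w_symptom * symptom_probs.get(label, 0.0)
        + w_image * image_probs.get(label, 0.0)
        for label in labels
    }

    # Renormalise so the fused scores form a probability distribution.
    norm = sum(fused.values())
    return {label: score / norm for label, score in fused.items()}


# Hypothetical outputs of the symptom ensemble and the image classifier.
symptom_probs = {"pneumonia": 0.6, "bronchitis": 0.3, "asthma": 0.1}
image_probs = {"pneumonia": 0.9, "normal": 0.1}

fused = confidence_weighted_fusion(symptom_probs, image_probs)
top_label = max(fused, key=fused.get)
print(top_label, round(fused[top_label], 2))  # pneumonia 0.78
```

In this example the image model is the more confident of the two (0.9 against 0.6), so it receives weight 0.6 and the symptom model 0.4, and the fused score for the class they agree on rises above the symptom-only estimate. Other confidence measures, such as entropy or validation accuracy per modality, would fit the same structure.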
Copyright: © 2026 The Author(s)
Open Access: This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.


Volume Title: Proceedings of the International Conference on Advances in Computing Technology and Artificial Intelligence (COMPUTATIA 2026)
Series: Atlantis Highlights in Intelligent Systems
Publication Date: 25 June 2026
ISBN: 978-94-6239-713-2
ISSN: 2589-4919

Cite this article


TY  - CONF
AU  - T. D. Venkatesh
AU  - R. Krishna Priya
AU  - R. S. Vignesh
PY  - 2026
DA  - 2026/06/25
TI  - MEDFUSION: A Multimodal Medical Diagnosis using Symptoms and Images
BT  - Proceedings of the International Conference on Advances in Computing Technology and Artificial Intelligence (COMPUTATIA 2026)
PB  - Atlantis Press
SP  - 66
EP  - 84
SN  - 2589-4919
UR  - https://doi.org/10.2991/978-94-6239-713-2_5
DO  - 10.2991/978-94-6239-713-2_5
ID  - Venkatesh2026
ER  -
