Proceedings of the International Workshop on Advances in Deep Learning for Image Analysis and Computer Vision (IWADIC 2025)

Precision Fine-Tuning: Leveraging LoRA for Text-Only Adaptation in Multi-Modal Medical Models

Authors
Wenru Lu 1,*
1 School of Computer Science and Electronic Engineering, University of Essex, Colchester, CO4 3SQ, UK
*Corresponding author. Email: wl23796@essex.ac.uk
Corresponding Author
Wenru Lu
Available Online 24 April 2026.
DOI
10.2991/978-94-6239-648-7_91
Keywords
Large Multi-Modal Model; Low-Rank Adaptation; Parameter-Efficient Fine-Tuning; Medical Domain Adaptation; Precision Fine-Tuning
Abstract

Large multimodal models (LMMs), which can process both visual and textual information, have great potential in medicine and other professional fields. However, adapting these complex models to specific subdomains or tasks is challenging: full fine-tuning is often impractical because of its high computational cost and the risk of degrading the pre-trained model. This paper proposes a "precision fine-tuning" method that uses Low-Rank Adaptation (LoRA) to adapt a model efficiently and in a targeted way. LoRA is applied only to the text decoder layers of the MedGemma multimodal model, leaving the visual encoder unchanged. The study is based on a small, carefully selected dataset of 98 medical abstracts. The experimental results show that training converges stably, the training loss is significantly reduced, and the text generation component is successfully adapted. Qualitative evaluation also shows that the coherence, relevance, and use of domain-specific language in the generated text improve significantly. The method updates only a small fraction of the model's parameters, making it considerably more parameter-efficient than full fine-tuning. The study indicates that targeted LoRA is a promising technique for improving the text generation ability of multimodal medical models efficiently.
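The parameter-efficiency argument in the abstract can be illustrated with a small counting sketch. It is not the paper's implementation: the module names, layer sizes, and rank below are hypothetical values chosen for illustration, and the only fact relied on is the standard LoRA construction, in which a frozen weight of shape d_out x d_in gains two trainable factors of shapes d_out x r and r x d_in. Adapters are attached only to layers whose path starts with an assumed "text_decoder." prefix, so the vision encoder contributes no trainable parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearSpec:
    """Shape of one weight matrix; names and sizes here are hypothetical."""
    name: str
    d_in: int
    d_out: int

    @property
    def n_params(self) -> int:
        return self.d_in * self.d_out


def lora_param_count(layer: LinearSpec, rank: int) -> int:
    # LoRA trains B (d_out x rank) and A (rank x d_in); the base weight stays frozen.
    return rank * (layer.d_in + layer.d_out)


def select_targets(layers, prefix):
    # Keep only layers whose module path starts with the given prefix.
    return [layer for layer in layers if layer.name.startswith(prefix)]


def trainable_fraction(layers, prefix, rank):
    # Share of all parameters (frozen base plus adapters) that is actually trained.
    base = sum(layer.n_params for layer in layers)
    adapters = sum(
        lora_param_count(layer, rank) for layer in select_targets(layers, prefix)
    )
    return adapters / (base + adapters)


# Toy stand-in for a multimodal model (assumed layout, not MedGemma's real one).
TOY_MODEL = [
    LinearSpec("vision_encoder.block0.q_proj", 1024, 1024),
    LinearSpec("vision_encoder.block0.v_proj", 1024, 1024),
    LinearSpec("text_decoder.block0.q_proj", 2048, 2048),
    LinearSpec("text_decoder.block0.v_proj", 2048, 2048),
]

if __name__ == "__main__":
    share = trainable_fraction(TOY_MODEL, "text_decoder.", rank=8)
    print(f"trainable share of parameters: {share:.4%}")
```

In this toy setting the adapters account for well under one percent of all parameters, which is the kind of ratio the abstract refers to; the real figure depends on the actual model architecture and the chosen rank.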

Copyright
© 2026 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceedings of the International Workshop on Advances in Deep Learning for Image Analysis and Computer Vision (IWADIC 2025)
Series
Advances in Computer Science Research
Publication Date
24 April 2026
ISBN
978-94-6239-648-7
ISSN
2352-538X

Cite this article

TY  - CONF
AU  - Wenru Lu
PY  - 2026
DA  - 2026/04/24
TI  - Precision Fine-Tuning: Leveraging LoRA for Text-Only Adaptation in Multi-Modal Medical Models
BT  - Proceedings of the International Workshop on Advances in Deep Learning for Image Analysis and Computer Vision (IWADIC 2025)
PB  - Atlantis Press
SP  - 843
EP  - 852
SN  - 2352-538X
UR  - https://doi.org/10.2991/978-94-6239-648-7_91
DO  - 10.2991/978-94-6239-648-7_91
ID  - Lu2026
ER  -