LLM fine-tuning with biomedical open-source data

Anaya,Christopher

Publication

2025, Master's dissertation

dc.contributor.advisor	Couto,Francisco José Moreira
dc.contributor.advisor	Fernandes,Maria Isabel Mou Sequeira
dc.contributor.author	Anaya,Christopher
dc.contributor.institution	Faculty of Sciences
dc.contributor.institution	Department of Informatics
dc.date.accessioned	2026-01-15T18:35:01Z
dc.date.available	2026-01-15T18:35:01Z
dc.date.issued	2025
dc.description	Master's thesis, Data Science, 2025, Universidade de Lisboa, Faculdade de Ciências
dc.description.abstract	Biomedical question answering (QA) systems aim to support researchers and clinicians by providing accurate, context-aware answers to complex information needs. Recent advances in large language models (LLMs) have significantly improved QA performance across domains, yet challenges remain in the biomedical domain due to terminology complexity, limited data availability, and the risk of generating hallucinated content. This thesis investigates the application of parameter-efficient fine-tuning techniques to adapt LLMs for biomedical QA, focusing on the BioASQ challenge Task B Phase B, which includes yes/no, factoid, list, and ideal questions. A comprehensive review of biomedical QA datasets and LLM adaptations highlights the evolving landscape of knowledge-infused models. The thesis presents a fine-tuning pipeline based on QLoRA, a memory-efficient method that adapts the Mistral-7B-Instruct-v0.1 model using quantized weights. Domain-specific prompt templates were designed for each question type to optimize answer formatting and reduce hallucinations. The model was trained on a curated dataset comprising the BioASQ training set, Gene Ontology, DrugBank, and BiQA-derived examples. Results show that the proposed system achieves competitive performance across question types, particularly on yes/no questions, where it attains an F1 score of 0.76 and structured JSON outputs enable reliable automatic evaluation. For ideal (free-text) questions, the system produced fluent but occasionally speculative responses, highlighting the trade-off between informativeness and factual grounding. Evaluation metrics such as F1, MRR, and ROUGE were complemented by qualitative error analysis to assess system robustness. The study concludes that combining domain-adapted prompts with QLoRA fine-tuning offers a promising approach for deploying efficient and effective biomedical QA systems.
Future work should explore retrieval-augmented generation, deeper integration of biomedical ontologies, and improved evaluation frameworks tailored to the nuances of clinical and research settings.	en
dc.format	application/pdf
dc.identifier.tid	204174821
dc.identifier.uri	http://hdl.handle.net/10400.5/116644
dc.language.iso	eng
dc.subject	Biomedical Question Answering
dc.subject	Large Language Models
dc.subject	Parameter-Efficient Fine-Tuning
dc.subject	BioASQ
dc.subject	Evaluation Metrics
dc.title	LLM fine-tuning with biomedical open-source data	en
dc.type	master thesis
dspace.entity.type	Publication
rcaap.rights	openAccess
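
The abstract names F1 and MRR among its evaluation metrics. As a rough illustration of what these two measures compute, here is a minimal Python sketch. It is not the thesis's evaluation code: the function names, the macro-averaging of F1 over the "yes" and "no" labels, and the exact-string matching of factoid candidates are all assumptions made for this example, since the abstract does not specify those details.

```python
from typing import Sequence, Set


def mean_reciprocal_rank(ranked: Sequence[Sequence[str]],
                         gold: Sequence[Set[str]]) -> float:
    """Average over questions of 1/rank of the first correct candidate.

    A question with no correct candidate contributes 0.
    Matching is exact string equality (an assumption for this sketch).
    """
    if not ranked:
        return 0.0
    total = 0.0
    for candidates, answers in zip(ranked, gold):
        for rank, candidate in enumerate(candidates, start=1):
            if candidate in answers:
                total += 1.0 / rank
                break
    return total / len(ranked)


def f1_for_label(pred: Sequence[str], gold: Sequence[str], label: str) -> float:
    """F1 score treating `label` as the positive class."""
    tp = sum(p == label and g == label for p, g in zip(pred, gold))
    fp = sum(p == label and g != label for p, g in zip(pred, gold))
    fn = sum(p != label and g == label for p, g in zip(pred, gold))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1_yes_no(pred: Sequence[str], gold: Sequence[str]) -> float:
    """Unweighted mean of the per-label F1 scores for "yes" and "no"."""
    return (f1_for_label(pred, gold, "yes") + f1_for_label(pred, gold, "no")) / 2
```

For instance, `mean_reciprocal_rank([["a", "b"], ["x", "y"]], [{"b"}, {"x"}])` averages a reciprocal rank of 1/2 for the first question and 1 for the second.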

Files

Main

Name: TM_Christopher_Anaya.pdf
Size: 322.64 KB
Format: Adobe Portable Document Format
