Publication

Deep Learning to optimize viral vector production for human gene therapy

datacite.subject.fos: Engenharia e Tecnologia::Engenharia Eletrotécnica, Eletrónica e Informática (Engineering and Technology::Electrical Engineering, Electronic Engineering, Information Engineering) [pt_PT]
dc.contributor.advisor: Pesquita, Cátia Luísa Santana Calisto
dc.contributor.advisor: Rodrigues, Ana Filipa
dc.contributor.author: Ferraz, João Lucas Figueiredo
dc.date.accessioned: 2025-04-04T15:08:40Z
dc.date.available: 2025-04-04T15:08:40Z
dc.date.issued: 2025
dc.date.submitted: 2025
dc.description: Master's thesis, Informatics Engineering, 2025, Universidade de Lisboa, Faculdade de Ciências [pt_PT]
dc.description.abstract: This work explores the potential of Protein Language Models (PLMs) to advance the design of novel Adeno-Associated Virus 2 (AAV2) sequences, while focusing on two primary objectives: sequence classification and generative design. For the classification task, we fine-tuned a PLM (ProtBERT) to accurately differentiate between viable and non-viable AAV2 sequences. Results demonstrated high classification performance across multiple trained models, validating the hypothesis that domain-specific fine-tuning enables PLMs to effectively capture important AAV2 sequence features. For sequence generation, we fine-tuned a conditional generative PLM (ProGen) to design viable AAV2 capsid protein sequences. While the model generated structurally diverse sequences, extensive evaluations indicated that additional refinements are necessary to consistently align with viability criteria. The classification model highlights the potential of PLMs in predicting sequence viability, offering a reliable approach that could help reduce experimental costs. We consider that the generative approach, though requiring further optimization, introduces a novel avenue for designing diverse AAV2 variants. Future efforts will focus on refining the generative framework by incorporating explicit viability tags, classifier feedback, and more extensive generation hyperparameter testing, as well as expanding its application to additional AAV2 properties. This work lays a foundation for leveraging PLMs in AAV2 sequence engineering, offering promising prospects for the use of Language Models for viral vector design. [pt_PT]
dc.identifier.uri: http://hdl.handle.net/10400.5/100019
dc.language.iso: eng [pt_PT]
dc.subject: Aprendizagem profunda (Deep learning) [pt_PT]
dc.subject: Modelos de linguagem de proteínas (Protein language models) [pt_PT]
dc.subject: Aprendizagem por transferência (Transfer learning) [pt_PT]
dc.subject: Representações (Representations) [pt_PT]
dc.subject: Investigação de sequências proteicas (Protein sequence research) [pt_PT]
dc.subject: Teses de mestrado - 2025 (Master's theses - 2025) [pt_PT]
dc.title: Deep Learning to optimize viral vector production for human gene therapy [pt_PT]
dc.type: master thesis
dspace.entity.type: Publication
rcaap.rights: openAccess [pt_PT]
rcaap.type: masterThesis [pt_PT]
thesis.degree.name: Mestrado em Engenharia Informática (Master's in Informatics Engineering) [pt_PT]

Files

Original bundle
Name: TM_João_Ferraz.pdf
Size: 1.79 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.2 KB
Format: Item-specific license agreed upon to submission