Size | Format |
---|---|
691.52 KB | Adobe PDF |
Abstract(s)
Proteins are a primary component of many biological processes. Most do not function alone and must interact with other proteins to carry out their function. Discovering new Protein-Protein Interactions (PPIs) is therefore an important task that could lead to new scientific developments. PPIs are costly to obtain through experimental methods, so computational methods were developed to overcome that problem. However, these computational methods carry uncertainty in their predictions. Because PPIs can be discovered in multiple ways, the information gathered is stored in databases. Usually only the positive outcomes are stored, which makes it necessary to use computational methods to generate the negative pairs.
Machine Learning algorithms typically do not account for label uncertainty, and they need negative samples to make precise predictions. This project explores how known techniques for filtering the data and for injecting uncertainty into the machine learning model affect PPI prediction. It also investigates possible strategies for generating negative samples for this problem. The results show that using the confidence score is crucial for PPI prediction.
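The two ideas in the abstract, generating negative pairs and carrying label confidence into training, can be illustrated with a minimal sketch. It is not the thesis's method: the function names, the toy protein identifiers, the filtering threshold `min_score`, and the fixed `negative_weight` are all hypothetical, and random pairing of proteins not listed as interacting is only one of several possible negative sampling strategies.

```python
import random
from itertools import combinations


def sample_negative_pairs(proteins, positive_pairs, n, seed=0):
    """Draw n unordered protein pairs that are not listed as positives.

    Assumption: a pair absent from the positive set is treated as a
    negative, which may mislabel interactions that are simply undiscovered.
    """
    known = {frozenset(pair) for pair in positive_pairs}
    candidates = [
        pair
        for pair in combinations(sorted(proteins), 2)
        if frozenset(pair) not in known
    ]
    if n > len(candidates):
        raise ValueError("not enough unlabeled pairs to sample from")
    return random.Random(seed).sample(candidates, n)


def build_weighted_dataset(scored_positives, negatives,
                           min_score=0.0, negative_weight=0.5):
    """Return (pair, label, weight) rows.

    Positives scoring below min_score are filtered out; the remaining
    positives use their confidence score as the sample weight. Negatives
    receive a fixed, hypothetical weight.
    """
    rows = [
        (pair, 1, score)
        for pair, score in scored_positives.items()
        if score >= min_score
    ]
    rows += [(pair, 0, negative_weight) for pair in negatives]
    return rows


# Toy data: made-up protein identifiers and confidence scores.
scored_positives = {("P1", "P2"): 0.9, ("P2", "P3"): 0.4, ("P3", "P4"): 0.7}
proteins = {"P1", "P2", "P3", "P4"}

negatives = sample_negative_pairs(proteins, scored_positives.keys(), n=2, seed=42)
dataset = build_weighted_dataset(scored_positives, negatives, min_score=0.5)
```

The weight column could then be passed to a learner that supports per-sample weights, for example through the `sample_weight` argument that many scikit-learn estimators accept in `fit`.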
Description
Master's thesis, Informatics Engineering, 2024, Universidade de Lisboa, Faculdade de Ciências
Keywords
Machine learning; Label uncertainty; Protein-protein interaction; Master's theses - 2024