Learning with uncertainty : improving supervised learning of protein-protein interactions with lower quality examples

Mendes, André dos Santos

http://hdl.handle.net/10400.5/97145

Advisor(s)

Pesquita, Cátia

Abstract(s)

Proteins are a primary component of many biological processes, and most do not function alone: they must interact with other proteins to carry out their function. Discovering new Protein-Protein Interactions (PPIs) is an important task that could lead to new scientific developments. PPIs are costly to obtain through experimental methods, so computational methods were developed to overcome this problem. However, the predictions of these computational methods carry uncertainty. Because there are multiple ways to discover PPIs, the information gathered is stored in databases. Still, usually only positive outcomes are stored, making it necessary to use computational methods to generate negative pairs. Typically, Machine Learning algorithms do not account for label uncertainty, and they need negative samples to make precise predictions. This project explores how known techniques for filtering the data and injecting the uncertainty into the machine learning model affect PPI prediction. It also investigates possible strategies for generating negative samples for this problem. The results show that using confidence scores is crucial for PPI prediction.
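
The three ideas named in the abstract (filtering by confidence, injecting uncertainty into the model, and generating negative pairs) can be illustrated with a minimal sketch. It is not the thesis's implementation. The toy data, the function names, the threshold, and the choice of uniform random sampling for negatives are all assumptions made for illustration, as is the use of each interaction's confidence score as a per-example weight.

```python
import random
from itertools import combinations

# Hypothetical toy data: (protein_a, protein_b, confidence in [0, 1]).
positives = [
    ("P1", "P2", 0.95),
    ("P1", "P3", 0.40),
    ("P2", "P4", 0.80),
    ("P3", "P5", 0.15),
]

def pair_key(a, b):
    """Order-independent key, since interactions are undirected."""
    return tuple(sorted((a, b)))

def filter_by_confidence(pairs, threshold):
    """Filtering: drop positives whose confidence is below a threshold."""
    return [p for p in pairs if p[2] >= threshold]

def sample_negatives(pairs, n, seed=0):
    """Assumed strategy: uniformly sample pairs not listed as positives."""
    proteins = sorted({x for a, b, _ in pairs for x in (a, b)})
    known = {pair_key(a, b) for a, b, _ in pairs}
    candidates = [c for c in combinations(proteins, 2) if c not in known]
    rng = random.Random(seed)
    return rng.sample(candidates, min(n, len(candidates)))

def build_weighted_dataset(pairs, negatives, negative_weight=1.0):
    """Injection: keep every positive, with its confidence as the weight."""
    examples = [(pair_key(a, b), 1, conf) for a, b, conf in pairs]
    examples += [(neg, 0, negative_weight) for neg in negatives]
    return examples

if __name__ == "__main__":
    negatives = sample_negatives(positives, n=len(positives))
    for pair, label, weight in build_weighted_dataset(positives, negatives):
        print(pair, label, weight)
```

Each example is a `(pair, label, weight)` triple. The weights could then be passed to any learner that accepts per-example weights, so that low-confidence positives contribute less to training instead of being discarded outright.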