| Name | Description | Size | Format |
|---|---|---|---|
| | | 2.92 MB | Adobe PDF |
Authors
Advisor(s)
Abstract(s)
Classical semantic similarity measures do not consider negative annotations in similarity computation, and the impact that these annotations can have on this data mining technique is not well studied.
As such, this work aims to understand how the addition of negative annotations impacts semantic similarity. To do so, two pairwise similarity measures, Best-Match Average and Resnik, were adapted to
create the polar measures PolarBMA and PolarResnik. These were evaluated against the original measures in two
currently relevant scopes: protein-protein interaction prediction and disease prediction. Pairs
of proteins known either to interact or not to interact were taken from STRING and enriched with
positive and negative annotations from the Gene Ontology. Synthetic patients were created as sets of
annotations taken from the Mendelian diseases they were designed to have, together with possible noise or
imprecise annotations. Semantic similarity was then computed with both polar and non-polar measures
between the proteins in each pair, and between patients and candidate diseases, which included the Mendelian diseases
as well as random diseases taken from the Human Phenotype Ontology.
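As a point of reference for the two baseline measures named above, the following is a minimal sketch of Resnik term similarity combined with one common formulation of Best-Match Average. The toy ontology, the information content values, and the function names are all hypothetical, and the sketch covers only the classical measures: it does not reproduce how PolarBMA or PolarResnik handle negative annotations.

```python
from statistics import mean

# Hypothetical toy ontology: each term maps to its set of ancestors (itself included).
ANCESTORS = {
    "root": {"root"},
    "a": {"a", "root"},
    "b": {"b", "root"},
    "a1": {"a1", "a", "root"},
    "a2": {"a2", "a", "root"},
}

# Hypothetical information content (IC) values; more specific terms score higher.
IC = {"root": 0.0, "a": 1.0, "b": 1.0, "a1": 2.0, "a2": 2.0}


def resnik(t1, t2):
    """IC of the most informative common ancestor of two terms."""
    common = ANCESTORS[t1] & ANCESTORS[t2]
    return max(IC[t] for t in common)


def best_match_average(terms1, terms2, term_sim=resnik):
    """Average of each term's best match in the other set, in both directions."""
    best_1 = mean(max(term_sim(a, b) for b in terms2) for a in terms1)
    best_2 = mean(max(term_sim(a, b) for a in terms1) for b in terms2)
    return (best_1 + best_2) / 2


# Two entities (e.g. proteins) described by their annotation terms.
print(best_match_average(["a1"], ["a2", "b"]))
```

In practice the IC values would be derived from annotation frequencies in a corpus rather than assigned by hand, and other BMA variants pool the best matches differently.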
To evaluate whether the polar measures performed well in comparison to the baseline, a ranking by
semantic similarity was made for each measure and scope, and the rank cumulative
frequencies were plotted. ROC AUC and Precision-Recall curves were also determined for the protein-protein interaction (PPI) prediction, as well as average precision for the disease prediction dataset. In
PPI prediction, polar measures performed better in the Molecular Function branch in both
experiments where negative annotations were added, and also in one of the experiments with the Cellular
Component branch. In the disease prediction scope, polar measures improved performance by
approximately ten percent. This improvement was observed in all disease prediction experiments, even
with the addition of noise and imprecision. Considering the results obtained, this work concludes that
negative annotations have an impact on semantic similarity, but the magnitude of this impact requires
further study.
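The rank-based evaluation described in the abstract can be illustrated with a short sketch. It assumes that each test case yields a mapping from candidate to similarity score and that the position of the known correct candidate is what gets ranked. The function names and the example scores are hypothetical and are not taken from the thesis.

```python
from collections import Counter
from itertools import accumulate


def rank_of_target(scores, target):
    """1-based rank of `target` when candidates are sorted by descending similarity.

    Ties are broken by insertion order, because Python's sort is stable.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(target) + 1


def cumulative_rank_frequency(ranks, max_rank):
    """Fraction of cases whose target is ranked at or above each position 1..max_rank."""
    counts = Counter(ranks)
    per_rank = [counts.get(r, 0) for r in range(1, max_rank + 1)]
    return [c / len(ranks) for c in accumulate(per_rank)]


# Hypothetical similarity scores between one synthetic patient and three candidate diseases.
scores = {"disease_1": 0.9, "disease_2": 0.4, "disease_3": 0.7}
print(rank_of_target(scores, "disease_3"))

# Hypothetical ranks of the true disease across four patients.
print(cumulative_rank_frequency([1, 2, 2, 4], max_rank=4))
```

Plotting one such curve per measure makes it possible to compare how quickly each measure places the correct candidate near the top of the ranking.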
Description
Master's thesis, Bioinformatics and Computational Biology, Universidade de Lisboa, Faculdade de Ciências, 2022
Keywords
Semantic similarity; Biomedical ontology; Negative annotation; Protein-protein interaction prediction; Disease prediction; Master's theses - 2022
