| Name | Description | Size | Format |
|---|---|---|---|
| | | 483.1 KB | Adobe PDF |
Authors
Advisor(s)
Abstract(s)
Ontology matching is the process of discovering mappings between concepts of two distinct
ontologies, a source and a target. It is a fundamental step in integrating heterogeneous data
sources that are described by ontologies. The problem is even more challenging for complex data
such as biomedical data. Motivated by the need to keep improving ontology matching techniques,
this dissertation implemented a new approach in the AML pipeline for calculating similarities
between entities from two distinct ontologies.
To this end, we used some of the OAEI tracks, such as Anatomy and LargeBio, to apply a new
algorithm and evaluate whether it improves AML's results against a reference alignment. The new
approach uses pre-trained word embeddings of five different types: BioWordVec Extrinsic,
BioWordVec Intrinsic, PubMed+PC, PubMed+PC+Wikipedia and English Wikipedia. These pre-trained
word embeddings use a machine learning technique, Word2Vec, and were chosen because each vector
carries the semantic meaning of the word it represents. With word embeddings, each concept of
each ontology was represented by a corresponding vector, to see whether that information could
improve how relations between concepts are determined in the AML system. The similarity between
concepts was calculated through the cosine distance, and the new alignment was evaluated with the
metrics precision, recall and F-measure. Although we could not show that word embeddings improve
AML's current results, the implementation could be refined, and the technique remains an option
to consider in future work if applied in some other way.
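
The matching and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's or AML's actual implementation. The function names, the toy two-dimensional vectors, the label-averaging strategy and the 0.8 threshold are all assumptions made for illustration. The sketch computes cosine similarity directly; the cosine distance is one minus that value.

```python
import math

def label_vector(label, embeddings):
    """Average the vectors of the label's words found in the vocabulary.
    Averaging is an assumed strategy; returns None if no word is known."""
    vectors = [embeddings[w] for w in label.lower().split() if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def cosine_similarity(u, v):
    """Cosine of the angle between u and v (cosine distance = 1 - this)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def match(source_labels, target_labels, embeddings, threshold=0.8):
    """Return every (source_id, target_id) pair whose similarity reaches
    the threshold. The threshold value is hypothetical."""
    alignment = set()
    for s_id, s_label in source_labels.items():
        s_vec = label_vector(s_label, embeddings)
        if s_vec is None:
            continue
        for t_id, t_label in target_labels.items():
            t_vec = label_vector(t_label, embeddings)
            if t_vec is not None and cosine_similarity(s_vec, t_vec) >= threshold:
                alignment.add((s_id, t_id))
    return alignment

def evaluate(alignment, reference):
    """Precision, recall and F-measure of an alignment against a reference."""
    correct = len(alignment & reference)
    precision = correct / len(alignment) if alignment else 0.0
    recall = correct / len(reference) if reference else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure

# Toy stand-in for pre-trained embeddings (real ones have hundreds of dimensions).
embeddings = {
    "heart": [1.0, 0.0],
    "cardiac": [0.9, 0.1],
    "muscle": [0.0, 1.0],
    "lung": [-1.0, 0.2],
}
source_labels = {"s1": "heart muscle", "s2": "lung", "s3": "kidney"}
target_labels = {"t1": "cardiac muscle", "t2": "lung", "t3": "renal organ"}
reference = {("s1", "t1"), ("s2", "t2"), ("s3", "t3")}

alignment = match(source_labels, target_labels, embeddings)
precision, recall, f_measure = evaluate(alignment, reference)
print(sorted(alignment), precision, recall, f_measure)
```

In this toy data the labels of `s3` and `t3` contain no word from the vocabulary, so that pair cannot be found and recall falls below precision. This shows one way an embedding-based matcher can miss mappings.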
Description
Master's thesis, Data Science, Universidade de Lisboa, Faculdade de Ciências, 2022
Keywords
Word embeddings; Ontology matching; Biomedical ontologies; Master's theses - 2022
