Publication

Triclustering three-way temporal and heterogeneous data

Name: scnd990026354742207_td_Diogo_Soares.pdf
Size: 24.06 MB
Format: Adobe PDF

Abstract(s)

Triclustering, the discovery of coherent subspaces within three-way (3W) data, is becoming increasingly relevant in data science, especially for pattern discovery and knowledge acquisition from complex biomedical datasets. It can reveal hidden patterns such as putative regulatory modules, disease progression profiles, and groups of individuals with coherent behaviors. When applied to labeled data, triclustering aids class differentiation and supports real-world decision-making. However, learning from 3W biomedical data is challenging because such data are typically temporal and heterogeneous, with mixed-type features and differing structural compositions. In response, this thesis establishes the foundations for pattern-centric 3W data analysis, focusing on triclustering of temporal and heterogeneous three-way data for both descriptive and predictive tasks. The thesis makes six major contributions. First, it provides a literature review and comparative study of current triclustering algorithms for temporal data, highlighting the strengths and weaknesses of existing methods. Second, it presents new tools to support the development and assessment of pattern discovery approaches in descriptive and predictive contexts, including a data generator that creates heterogeneous three-way datasets with annotated triclustering solutions, and benchmark datasets for comparative evaluation. Third, it proposes a novel approach to capture time-contiguous triclusters, enhancing the search for temporal coherence. Fourth, it introduces a new triclustering approach that handles heterogeneous data by applying sequential pattern mining principles to identify relevant patterns and derive triclusters that capture temporal data dynamics. Fifth, it presents a new method for learning pattern-centric predictors. Finally, it proposes an extension and integration of principles for learning from static and temporal data structures.
The developed methods were comprehensively validated in real-world clinical scenarios involving two progressive diseases, with promising results. They were used to predict clinically relevant endpoints and to identify disease-specific progression patterns, supporting medical decisions and revealing significant patient profiles.
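To make the notion of a tricluster concrete, the following minimal sketch treats a three-way dataset as observations × features × time points and checks a candidate subspace for two properties mentioned in the abstract: time contiguity and coherence. It is an illustration only, not the thesis's method. The function names, the toy values, and the use of variance as a (constant-pattern) coherence measure are all assumptions made for this example.

```python
from statistics import pvariance

def tricluster_values(data, observations, features, times):
    """Collect the values of the subspace selected by the three index sets."""
    return [data[i][j][k] for i in observations for j in features for k in times]

def is_time_contiguous(times):
    """True when the selected time points form one unbroken interval."""
    ordered = sorted(times)
    return all(b - a == 1 for a, b in zip(ordered, ordered[1:]))

def constant_coherence(data, observations, features, times):
    """Population variance of the subspace (assumed measure); 0 means constant."""
    return pvariance(tricluster_values(data, observations, features, times))

# Hypothetical toy data: 3 observations x 2 features x 4 time points.
data = [
    [[5, 5, 5, 1], [5, 5, 5, 9]],
    [[5, 5, 5, 7], [5, 5, 5, 2]],
    [[0, 3, 8, 4], [6, 1, 9, 2]],
]

# Candidate tricluster: first two observations, both features, first three time points.
candidate = {"observations": [0, 1], "features": [0, 1], "times": [0, 1, 2]}

print("time-contiguous:", is_time_contiguous(candidate["times"]))
print("variance inside candidate:", constant_coherence(data, **candidate))
print("variance of full dataset:", constant_coherence(data, [0, 1, 2], [0, 1], [0, 1, 2, 3]))
```

A low variance inside the candidate relative to the full dataset is what marks it as a coherent subspace under this simple measure. Other coherence notions (additive, multiplicative, order-preserving) would require a different scoring function.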

Keywords

Triclustering; Pattern Discovery; Pattern-centric Predictive Models; Multivariate Time-series; Heterogeneous Clinical Data
