A map of privacy sensitivity in human genomes

File: TM_Andre_Santos.pdf (2.58 MB, Adobe PDF)

Abstract

Advances in DNA sequencing technologies, particularly next-generation sequencing (NGS), have significantly increased the speed and scale of genomic data production. However, the vast amounts of data generated pose considerable privacy risks if not properly safeguarded. Genomes are unique, largely stable throughout life, and reveal sensitive information not only about individuals but also about their relatives. Genomic data therefore require strong protection, yet protection measures must also preserve the high performance that sequencing workflows demand. This thesis contributes by systematically mapping privacy-sensitive regions of the human genome and deriving location-based quantitative metrics designed to support selective, post-alignment protection strategies, strengthening privacy while helping preserve workflow efficiency. We analyze genomic elements that have been exploited in documented privacy attacks, such as Tandem Repeats (TRs), Disease-related Genes (DGs), and Genomic Variants (GVs), and correlate them with their cytoband locations to construct density maps that highlight regions of higher sensitivity. These maps are then combined with re-identification and attribute disclosure attack scenarios to derive privacy sensitivity values for different genomic regions. The resulting maps and metrics provide a fine-grained view of genomic privacy risk, enabling selective protection measures to be applied only where they are most needed. Empirically, we find that TR densities cluster at centromeres (≈ 70–100% vs. ≈ 1–10% elsewhere), a Y-STR surname-inference map isolates five chrY cytobands (especially q11.21) as hotspots, and an Alzheimer’s membership-inference map peaks on a few cytobands in chromosomes 6 and 19. This approach paves the way for integrating privacy sensitivity mapping into privacy-aware genomic workflows, strengthening privacy protection without compromising sequencing efficiency or limiting data sharing.
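
The density-map step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not say how density is defined, so the sketch assumes density means the fraction of a cytoband's bases covered by an element class (for example TRs). The function name `cytoband_density`, the band labels, and all coordinates are hypothetical toy values.

```python
from collections import defaultdict


def cytoband_density(cytobands, elements):
    """Return {(chrom, band): fraction of band bases covered by elements}.

    Assumed definition (not taken from the thesis): covered bases / band length.
    cytobands: iterable of (chrom, band, start, end), half-open coordinates.
    elements:  iterable of (chrom, start, end), half-open coordinates.
    """
    # Group element intervals by chromosome.
    by_chrom = defaultdict(list)
    for chrom, start, end in elements:
        by_chrom[chrom].append((start, end))

    # Merge overlapping intervals so that no base is counted twice.
    merged = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        out = []
        for start, end in intervals:
            if out and start <= out[-1][1]:
                out[-1] = (out[-1][0], max(out[-1][1], end))
            else:
                out.append((start, end))
        merged[chrom] = out

    # Sum the overlap between each band and the merged intervals.
    density = {}
    for chrom, band, b_start, b_end in cytobands:
        covered = 0
        for start, end in merged.get(chrom, []):
            covered += max(0, min(end, b_end) - max(start, b_start))
        density[(chrom, band)] = covered / (b_end - b_start)
    return density


# Hypothetical toy data; band names and coordinates are made up.
bands = [("chrT", "band_a", 0, 1000), ("chrT", "band_b", 1000, 2000)]
repeats = [("chrT", 100, 300), ("chrT", 250, 400), ("chrT", 1900, 2100)]
density = cytoband_density(bands, repeats)
```

Under the same assumption, a per-attack sensitivity value could then be derived by weighting these densities by how relevant each element class is to a given attack scenario; the weighting scheme itself would have to come from the thesis.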

Description

Master's thesis, Informatics, 2025, Universidade de Lisboa, Faculdade de Ciências

Keywords

Genomic data privacy; Genomics; Genomic data; Privacy-sensitivity
