Understanding a data-based model: Insights from extreme misclassifications of machine-assisted selection of AGN

dc.contributor.author: Carrazedo, Bruno Manuel Teixeira
dc.contributor.institution: Faculty of Sciences
dc.contributor.institution: Department of Physics
dc.contributor.supervisor: Troncoso, Israel Matute
dc.contributor.supervisor: Carvajal, Rodrigo
dc.date.accessioned: 2026-02-16T15:05:01Z
dc.date.available: 2026-02-16T15:05:01Z
dc.date.issued: 2025
dc.description: Master's thesis, Physics (Astrophysics and Cosmology), 2025, Universidade de Lisboa, Faculdade de Ciências
dc.description.abstract: Active Galactic Nuclei (AGNs) and star-forming galaxies (SFGs) are key populations for understanding galaxy evolution, yet separating them is challenging because their photometric and spectroscopic properties overlap. In this work, we evaluate a supervised Machine Learning (ML) classifier that distinguishes AGNs from SFGs using multiwavelength data from the Panoramic Survey Telescope and Rapid Response System Data Release 1 (Pan-STARRS DR1), the Wide-field Infrared Survey Explorer (WISE), and the Two-Micron All-Sky Survey (2MASS), with spectroscopic labels from the Sloan Digital Sky Survey Data Release 16 (SDSS-DR16) and the Hobby-Eberly Telescope Dark Energy Experiment Spring Field (HETDEX) survey. We analyze both global classification metrics and extreme misprediction cases, focusing on high-confidence predictions where the model disagrees with survey labels. Color–color diagrams and spectral inspection are used to assess whether these mispredictions reflect model limitations or misassigned survey labels. Our results show that the classifier achieves high overall accuracy, with AGNs typically predicted with >99% confidence across most redshift intervals. Misclassifications are concentrated among the SFGs, lowering the Matthews Correlation Coefficient (MCC) in regions where SFGs are statistically negligible. Spectral analysis of extreme cases reveals that a fraction of sources labeled as AGNs but predicted as SFGs lack characteristic AGN features, supporting the model's prediction. Conversely, some sources labeled as SFGs but predicted as AGNs exhibit broad emission lines and blue continua, consistent with AGN profiles. These findings suggest that the ML model can be used to improve survey classifications, particularly when redshift assignments are uncertain. We conclude that integrating ML with optical/mid-infrared (MIR) surveys enhances source classification and helps identify mislabeled objects. Future work should include feature importance analysis and expanded confidence intervals to further interpret the model's decision-making and refine its application to large extragalactic samples. [en]
dc.format: application/pdf
dc.identifier.tid: 204175186
dc.identifier.uri: http://hdl.handle.net/10400.5/117107
dc.language.iso: eng
dc.subject: Active galactic nuclei
dc.subject: Photometry
dc.subject: Machine learning
dc.subject: Data analysis
dc.title: Understanding a data-based model: Insights from extreme misclassifications of machine-assisted selection of AGN [en]
dc.type: master thesis
dspace.entity.type: Publication
rcaap.rights: openAccess

Files

Main
Name: TM_Bruno_Carrazedo.pdf
Size: 5.82 MB
Format: Adobe Portable Document Format