Please use this identifier to cite or link to this record: http://hdl.handle.net/10400.5/102489
Title: Models for the probability of mortgage defaults
Author: Dias, Mariana dos Santos Ribeiro
Advisor: Bastos, João Afonso
Keywords: Mortgage Default Prediction
Credit Scoring
Imbalanced Dataset
Binary Classification Problems
Calibration
MAPIE
Defense Date: Feb-2025
Publisher: Instituto Superior de Economia e Gestão
Citation: Dias, Mariana dos Santos Ribeiro (2025). “Models for the probability of mortgage defaults”. Master's dissertation. Universidade de Lisboa. Instituto Superior de Economia e Gestão
Abstract: Mortgage default prediction is a critical task for financial institutions, where accurately identifying high-risk borrowers is essential for mitigating financial losses and ensuring responsible lending practices. Traditional credit scoring models, such as logistic regression, are widely used but often fail to capture complex patterns in borrower behaviour, especially when the data is highly imbalanced. This thesis applies Logistic Regression, Random Forest, Extreme Gradient Boosting (XGBoost) and Light Gradient-Boosting Machine (LightGBM) to mortgage default data, using several feature selection techniques and class imbalance strategies. The models are also calibrated with both Platt Scaling and Isotonic Regression, and the performance of each calibration method is evaluated. In addition, the behaviour of Model-Agnostic Prediction Interval Estimation (MAPIE) in the context of mortgage default prediction is investigated. By leveraging MAPIE’s conformal prediction framework, this study assesses its ability to provide robust uncertainty estimates and reliable prediction sets for default classification. The results show that, for this dataset, boosting models, particularly XGBoost, outperform Logistic Regression in mortgage default prediction. Addressing class imbalance through hybrid resampling techniques was most beneficial for the Random Forest model, while boosting methods handled class imbalance better through their built-in parameters. Isotonic Regression worked well for tree-based algorithms, while Platt Scaling was better for Logistic Regression. With MAPIE, balancing coverage and interval width was a challenge, making it necessary to use another metric that takes both into account: Exact Match Rate.
These findings highlight the importance of combining advanced machine learning techniques with calibration and uncertainty quantification to improve risk assessment in financial institutions, offering a more data-driven and reliable approach to credit decision-making.
URI: http://hdl.handle.net/10400.5/102489
Appears in collections: BISEG - Dissertações de Mestrado / Master Thesis

Files in this record:
File                Description    Size      Format
DM-MSRD-2025.pdf                   495 kB    Adobe PDF



All records in the repository are protected by copyright law, with all rights reserved.