Publication

Models for the probability of mortgage defaults

Use this identifier to cite this record.
Name: DM-MSRD-2025.pdf
Size: 495 KB
Format: Adobe PDF

Abstract

Mortgage default prediction is a critical task for financial institutions, where accurately identifying high-risk borrowers is essential for mitigating financial losses and ensuring responsible lending practices. Traditional credit scoring models, such as logistic regression, are widely used but often fail to capture complex patterns in borrower behaviour, especially when the data is highly skewed. This thesis applies Logistic Regression, Random Forest, Extreme Gradient Boosting (XGBoost) and Light Gradient-Boosting Machine (LightGBM) to mortgage default data, using several feature selection techniques and data imbalance strategies. The models are also calibrated with both Platt Scaling and Isotonic Regression, and the performance of each calibration method is evaluated. In addition, the behaviour of Model-Agnostic Prediction Interval Estimation (MAPIE) in the context of mortgage default prediction is investigated. By leveraging MAPIE's conformal prediction framework, this study assesses its ability to provide robust uncertainty estimates and reliable predictive intervals for default classification. The results show that, for this dataset, boosting models, particularly XGBoost, outperform Logistic Regression in mortgage default prediction. Addressing class imbalance through hybrid resampling techniques was most beneficial for the Random Forest model, while boosting methods handle class imbalance better through their built-in parameters. Isotonic Regression worked well for tree-based algorithms, while Platt Scaling was better for Logistic Regression. With MAPIE, balancing coverage and interval width was a challenge, making it necessary to use another metric that takes both into account: Exact Match Rate.
These findings highlight the importance of combining advanced machine learning techniques with calibration and uncertainty quantification to improve risk assessment in financial institutions, offering a more data-driven and reliable approach to credit decision-making.
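
The trade-off between coverage and interval width mentioned in the abstract can be illustrated with a minimal sketch in plain Python. It assumes that a conformal classifier returns a set of candidate labels per loan, that "width" is the number of labels in that set, and that Exact Match Rate is the share of sets containing exactly the true label and nothing else. That last definition is an assumption made for illustration; the abstract does not define the metric. The function name `conformal_set_metrics` and the toy data are hypothetical and are not taken from the thesis or from the MAPIE library.

```python
from typing import Dict, List, Set


def conformal_set_metrics(y_true: List[int], pred_sets: List[Set[int]]) -> Dict[str, float]:
    """Summarise label prediction sets produced by a conformal classifier.

    coverage         : share of sets that contain the true label
    mean_set_size    : average number of labels per set (the "width")
    exact_match_rate : share of sets equal to exactly {true label}
                       (assumed definition, not taken from the thesis)
    """
    if len(y_true) != len(pred_sets) or not y_true:
        raise ValueError("y_true and pred_sets must be non-empty and of equal length")
    n = len(y_true)
    covered = sum(1 for y, s in zip(y_true, pred_sets) if y in s)
    exact = sum(1 for y, s in zip(y_true, pred_sets) if s == {y})
    total_size = sum(len(s) for s in pred_sets)
    return {
        "coverage": covered / n,
        "mean_set_size": total_size / n,
        "exact_match_rate": exact / n,
    }


# Hypothetical toy data: 0 = no default, 1 = default.
y_true = [0, 0, 1, 1]
pred_sets = [{0}, {0, 1}, {0, 1}, {0}]
metrics = conformal_set_metrics(y_true, pred_sets)
print(metrics)
```

In this toy case three of the four sets contain the true label, but only one of them is a single-label set that is also correct. Sets such as {0, 1} raise coverage without being informative, which is why a metric that accounts for both coverage and set size can be useful.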

Description

Keywords

Mortgage Default Prediction; Credit Scoring; Imbalanced Dataset; Binary Classification Problems; Calibration; MAPIE

Educational Context

Citation

Dias, Mariana dos Santos Ribeiro (2025). "Models for the probability of mortgage defaults". Master's dissertation. Universidade de Lisboa. Instituto Superior de Economia e Gestão.

Research Projects

Organizational Units

Issue

Publisher

Instituto Superior de Economia e Gestão

CC License