| Name | Description | Size | Format |
|---|---|---|---|
| | | 495 KB | Adobe PDF |
Authors
Advisor(s)
Abstract(s)
Mortgage default prediction is a critical task for financial institutions, where accurately identifying high-risk borrowers is essential for mitigating financial losses and ensuring responsible lending practices. Traditional credit scoring models, such as logistic
regression, are widely used but often fail to capture complex patterns in borrower behaviour, especially when the data is highly skewed.
This thesis applies Logistic Regression, Random Forest, Extreme Gradient Boosting
(XGBoost) and Light Gradient-Boosting Machine (LightGBM) to mortgage default data,
using several feature selection techniques and data imbalance strategies. Calibration is
also applied with both Platt Scaling and Isotonic Regression, and their performances are
evaluated.
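
As an illustration of the two calibration methods named above, the following is a minimal sketch in plain Python, not the thesis's implementation. The toy scores and labels, the function names, and the gradient-descent settings are all assumptions made for the example; the Platt fit here is a plain one-dimensional logistic fit without Platt's target smoothing, and the isotonic fit uses the pool-adjacent-violators algorithm.

```python
import math

def fit_platt(scores, labels, lr=0.1, steps=2000):
    """Fit p = sigmoid(a * score + b) by gradient descent on the log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            grad_a += (p - y) * s
            grad_b += p - y
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def platt_predict(a, b, score):
    """Map a raw model score to a calibrated probability."""
    return 1.0 / (1.0 + math.exp(-(a * score + b)))

def fit_isotonic(scores, labels):
    """Pool-adjacent-violators: non-decreasing fit of labels against sorted scores."""
    pairs = sorted(zip(scores, labels))
    blocks = []  # each block is [sum_of_labels, count]
    for _, y in pairs:
        blocks.append([float(y), 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return [s for s, _ in pairs], fitted

# Hypothetical uncalibrated scores and default labels (1 = default).
scores = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
labels = [0, 0, 1, 0, 0, 1, 1, 1]

a, b = fit_platt(scores, labels)
sorted_scores, fitted = fit_isotonic(scores, labels)
```

Platt Scaling imposes a sigmoid shape on the score-to-probability map, whereas Isotonic Regression only requires that the map be non-decreasing, which makes it more flexible but more prone to overfitting on small calibration sets.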
In addition, the behaviour of the Model Agnostic Prediction Interval Estimator (MAPIE)
in the context of mortgage default prediction is investigated. By leveraging MAPIE’s conformal prediction framework, this study assesses its ability to provide robust uncertainty
estimates and reliable predictive intervals for default classification.
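
To make the conformal idea concrete, here is a minimal sketch of split conformal prediction for a binary default classifier, written in plain Python. It is a generic illustration and does not use or reproduce MAPIE's API; the calibration probabilities, the labels, the function names, and the choice of nonconformity score (one minus the predicted probability of the true class) are assumptions made for the example.

```python
import math

def conformal_threshold(cal_probs, cal_labels, alpha):
    """Quantile of nonconformity scores on a held-out calibration set.

    cal_probs holds the predicted P(default); the score for each
    observation is 1 minus the probability assigned to its true class.
    """
    scores = sorted(
        1.0 - (p if y == 1 else 1.0 - p) for p, y in zip(cal_probs, cal_labels)
    )
    n = len(scores)
    k = math.ceil((n + 1) * (1.0 - alpha))  # finite-sample corrected rank
    if k > n:
        return 1.0  # too few calibration points: every label is kept
    return scores[k - 1]

def prediction_set(p_default, qhat):
    """Return every label whose nonconformity score is within the threshold."""
    probs = {0: 1.0 - p_default, 1: p_default}
    return {label for label, p in probs.items() if 1.0 - p <= qhat}

# Hypothetical calibration data (1 = default).
cal_probs = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60, 0.70, 0.80, 0.90]
cal_labels = [0, 0, 0, 0, 1, 0, 1, 1, 1]

qhat = conformal_threshold(cal_probs, cal_labels, alpha=0.2)
confident_default = prediction_set(0.95, qhat)  # only label 1 survives
ambiguous = prediction_set(0.50, qhat)          # both labels survive
```

For a classifier the output is a set of labels rather than an interval: a confident prediction yields a single label, while an uncertain one yields both.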
The results obtained demonstrate that, for this dataset, boosting models, particularly
XGBoost, outperform Logistic Regression in mortgage default prediction. Addressing
class imbalance through hybrid resampling techniques was the most beneficial for the
Random Forest model, while boosting methods handled class imbalance better by using
built-in parameters. Isotonic Regression worked well for tree-based algorithms, while
Platt Scaling was better for Logistic Regression. When using MAPIE, balancing coverage
and interval width was a challenge, making it necessary to use another metric that took
both into account: Exact Match Rate. These findings highlight the importance of combining advanced machine learning
techniques with calibration and uncertainty quantification to improve risk assessment in
financial institutions, offering a more data-driven and reliable approach to credit decision-making.
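
The trade-off between coverage and width mentioned above can be sketched with three simple metrics over prediction sets. The abstract does not define Exact Match Rate, so the definition used here (the share of observations whose prediction set contains the true label and nothing else) is an assumption, as are the function names and the toy data.

```python
def coverage(pred_sets, labels):
    """Share of observations whose prediction set contains the true label."""
    return sum(y in s for s, y in zip(pred_sets, labels)) / len(labels)

def mean_set_size(pred_sets):
    """Average number of labels per prediction set (the 'width')."""
    return sum(len(s) for s in pred_sets) / len(pred_sets)

def exact_match_rate(pred_sets, labels):
    """Assumed definition: share of sets equal to exactly {true label}."""
    return sum(s == {y} for s, y in zip(pred_sets, labels)) / len(labels)

# Hypothetical prediction sets and true labels (1 = default).
pred_sets = [{0}, {0, 1}, {1}, {0}, {0, 1}]
labels = [0, 1, 1, 1, 0]

cov = coverage(pred_sets, labels)
width = mean_set_size(pred_sets)
emr = exact_match_rate(pred_sets, labels)
```

Under this assumed definition, always predicting both labels gives perfect coverage but an Exact Match Rate of zero, so the metric rewards sets that are both correct and narrow.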
Description
Keywords
Mortgage Default Prediction; Credit Scoring; Imbalanced Dataset; Binary Classification Problems; Calibration; MAPIE
Educational Context
Citation
Dias, Mariana dos Santos Ribeiro (2025). “Models for the probability of mortgage defaults”. Master's dissertation. Universidade de Lisboa. Instituto Superior de Economia e Gestão
Publisher
Instituto Superior de Economia e Gestão
