Use this identifier to cite or link to this record:
http://hdl.handle.net/10400.5/102489
Title: | Models for the probability of mortgage defaults |
Author: | Dias, Mariana dos Santos Ribeiro |
Advisor: | Bastos, João Afonso |
Keywords: | Mortgage Default Prediction; Credit Scoring; Imbalanced Dataset; Binary Classification Problems; Calibration; MAPIE |
Defense Date: | Feb-2025 |
Publisher: | Instituto Superior de Economia e Gestão |
Citation: | Dias, Mariana dos Santos Ribeiro (2025). “Models for the probability of mortgage defaults”. Master's dissertation. Universidade de Lisboa. Instituto Superior de Economia e Gestão |
Abstract: | Mortgage default prediction is a critical task for financial institutions, where accurately identifying high-risk borrowers is essential for mitigating financial losses and ensuring responsible lending practices. Traditional credit scoring models, such as logistic regression, are widely used but often fail to capture complex patterns in borrower behaviour, especially when the data is highly skewed. This thesis applies Logistic Regression, Random Forest, Extreme Gradient Boosting (XGBoost) and Light Gradient-Boosting Machine (LightGBM) to mortgage default data, using several feature selection techniques and data imbalance strategies. Calibration is also applied with both Platt Scaling and Isotonic Regression, and their performances are evaluated. In addition, the behaviour of Model-Agnostic Prediction Interval Estimation (MAPIE) in the context of mortgage default prediction is investigated. By leveraging MAPIE’s conformal prediction framework, this study assesses its ability to provide robust uncertainty estimates and reliable predictive intervals for default classification. The results obtained demonstrate that, for this dataset, boosting models, particularly XGBoost, outperform Logistic Regression in mortgage default prediction. Addressing class imbalance through hybrid resampling techniques was most beneficial for the Random Forest model, while boosting methods handle class imbalance better by using built-in parameters. Isotonic Regression worked well for tree-based algorithms, while Platt Scaling was better for Logistic Regression. When using MAPIE, balancing coverage and interval width was a challenge, making it necessary to use another metric that took both into account: Exact Match Rate.
These findings highlight the importance of combining advanced machine learning techniques with calibration and uncertainty quantification to improve risk assessment in financial institutions, offering a more data-driven and reliable approach to credit decision-making. |
URI: | http://hdl.handle.net/10400.5/102489 |
Appears in collections: | BISEG - Dissertações de Mestrado / Master Thesis |
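The abstract reports that Isotonic Regression calibrated the tree-based models well. As a rough illustration of what such a calibration step does, the sketch below fits a non-decreasing step function from raw model scores to observed default rates using the pool-adjacent-violators algorithm. It is a minimal pure-Python sketch under stated assumptions, not the thesis's implementation: the function names `fit_isotonic` and `predict_isotonic` and the toy scores and labels are hypothetical, and the result is a step function, whereas library implementations may interpolate between fitted points.

```python
from bisect import bisect_left


def fit_isotonic(scores, labels):
    """Fit a non-decreasing map from scores to default rates (PAVA)."""
    # Pool tied scores first so each distinct score maps to a single value.
    pooled = {}
    for score, label in zip(scores, labels):
        total, count = pooled.get(score, (0.0, 0))
        pooled[score] = (total + label, count + 1)

    # Each block stores [sum of labels, number of points, largest score].
    blocks = []
    for score in sorted(pooled):
        total, count = pooled[score]
        blocks.append([total, count, score])
        # Merge backwards while the previous block's mean exceeds the last one.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            last_total, last_count, last_upper = blocks.pop()
            blocks[-1][0] += last_total
            blocks[-1][1] += last_count
            blocks[-1][2] = last_upper

    thresholds = [block[2] for block in blocks]
    values = [block[0] / block[1] for block in blocks]
    return thresholds, values


def predict_isotonic(thresholds, values, score):
    """Return the calibrated probability for one raw score."""
    # First block whose upper bound is >= score; clamp above the last block.
    index = min(bisect_left(thresholds, score), len(values) - 1)
    return values[index]


if __name__ == "__main__":
    # Hypothetical held-out scores and default labels (1 = default).
    raw_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
    defaults = [0, 0, 1, 0, 1, 1]
    thresholds, values = fit_isotonic(raw_scores, defaults)
    for s in (0.05, 0.35, 0.9):
        print(s, predict_isotonic(thresholds, values, s))
```

In practice the mapping would be fitted on a held-out calibration set rather than on the data used to train the classifier, so that the calibrated probabilities are not biased by overfitting.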
Files in this record:
File | Description | Size | Format | |
---|---|---|---|---|
DM-MSRD-2025.pdf | | 495 kB | Adobe PDF | View/Open |
All records in the repository are protected by copyright law, with all rights reserved.