Project

Machine Learning Methods with a Reject Option for Spot Factoring and Beyond

While the design and application of credit risk models have been investigated extensively in the operations research literature, spot factoring has received far less attention, despite its subtle differences from typical lending solutions and its increasing importance as a funding solution for higher-risk businesses in a post-financial-crisis, post-COVID-crisis world. Starting from this research gap, we investigate machine learning (ML) methods that predict the outcome of invoices. Motivated by the application domain, we then develop generic ML techniques for classification with a reject option.

 

In the first part of this work, we extend the concept of ML-based (short-term) probability of default to the spot factoring context. We identify a finer-grained risk estimation task related to the overdueness of invoices and propose three possible ML approaches to this task. We compare them on a real-life spot factoring data set and discuss their suitability according to task-based and profit-driven evaluation metrics. Regression models and, to a lesser extent, learning-to-rank methods perform well overall across these metrics.

 

The real-life applicability of ML methods to a spot factor’s through-the-door population remains limited by an issue that is well known in the credit risk community: sample selection bias. This bias occurs because the models have no knowledge of the outcome of samples rejected by the factor. We address this problem in the second part of the dissertation by having the model refrain from making a prediction on instances that are unlike those it was trained on. We link this problem to domains such as novelty detection, classification with a reject option, and reject inference. A comparison of four classifiers, each equipped with the option to refrain from making a prediction, offers insights into their performance on novelty rejection as well as ambiguity rejection, in a spot factoring context and beyond.
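
The two kinds of rejection can be illustrated with a minimal sketch. It is not one of the four classifiers compared in the dissertation: it assumes a simple nearest-centroid classifier, and the names `fit_centroids`, `predict_with_reject`, `novelty_radius` and `ambiguity_margin` are hypothetical. An instance far from every class centroid is rejected as novel; an instance almost equally close to two centroids is rejected as ambiguous.

```python
import math

REJECT_NOVELTY = "reject:novelty"
REJECT_AMBIGUITY = "reject:ambiguity"


def fit_centroids(X, y):
    """Return {label: centroid}, the mean of each class's training points."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict_with_reject(x, centroids, novelty_radius, ambiguity_margin):
    """Predict a label, or abstain when x is novel or ambiguous.

    Both thresholds are illustrative assumptions, not values from the source.
    """
    dists = sorted((math.dist(x, c), label) for label, c in centroids.items())
    nearest_dist, nearest_label = dists[0]
    # Novelty rejection: x is unlike anything seen during training.
    if nearest_dist > novelty_radius:
        return REJECT_NOVELTY
    # Ambiguity rejection: the two closest classes are nearly tied.
    if len(dists) > 1 and dists[1][0] - nearest_dist < ambiguity_margin:
        return REJECT_AMBIGUITY
    return nearest_label
```

In practice the two thresholds would be tuned on held-out data; here they are free parameters so that the distinction between the two reject types stays visible.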

 

While classification with a reject option has recently gained research interest, only a few methods reject both ambiguous and novel instances. In the final stage of this study, we therefore propose a methodology for both ambiguity and novelty rejection that combines generative class-specific models to perform subset-valued classification. We determine a threshold for each class by optimizing several metrics using ROC analysis. In this way, the method can indicate (inter-class) ambiguity as well as novelty. An experimental study on several real-life data sets verifies the merits of the proposed approach.
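
The idea of subset-valued classification can be sketched as follows. This is a simplified illustration under stated assumptions, not the dissertation's method: each class is modelled by a one-dimensional Gaussian, and the per-class threshold maximizes Youden's J (TPR minus FPR) on the training data, standing in for the "several metrics" optimized via ROC analysis. All function names are hypothetical. An empty prediction set signals novelty, and a set with more than one label signals ambiguity.

```python
import math
from statistics import mean, pstdev


def fit_gaussian(values):
    """Fit a 1-D Gaussian; the floor on sigma avoids division by zero."""
    return mean(values), max(pstdev(values), 1e-6)


def log_density(x, params):
    mu, sigma = params
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)


def youden_threshold(scores_pos, scores_neg):
    """Pick the score threshold that maximizes TPR - FPR (Youden's J)."""
    best_t, best_j = None, -math.inf
    for t in sorted(set(scores_pos) | set(scores_neg)):
        tpr = sum(s >= t for s in scores_pos) / len(scores_pos)
        fpr = sum(s >= t for s in scores_neg) / len(scores_neg)
        if tpr - fpr > best_j:
            best_t, best_j = t, tpr - fpr
    return best_t


def fit(X, y):
    """Fit one generative model and one acceptance threshold per class."""
    models = {c: fit_gaussian([x for x, l in zip(X, y) if l == c]) for c in set(y)}
    thresholds = {}
    for c, params in models.items():
        pos = [log_density(x, params) for x, l in zip(X, y) if l == c]
        neg = [log_density(x, params) for x, l in zip(X, y) if l != c]
        # One-vs-rest ROC view: class c's own points against all others.
        thresholds[c] = youden_threshold(pos, neg)
    return models, thresholds


def predict_set(x, models, thresholds):
    """Return every class whose model accepts x (possibly none or several)."""
    return {c for c in models if log_density(x, models[c]) >= thresholds[c]}
```

With two overlapping classes, a point in the overlap region is accepted by both models and a point far from all training data by neither, so a single decision rule covers both kinds of rejection. Thresholds are chosen in-sample here only for brevity; a validation split would be the more careful choice.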

 

Date: 7 Jun 2021 → 20 Oct 2022
Keywords: Machine learning, Supervised learning, Credit scoring
Disciplines: Machine learning and decision making
Project type: PhD project