Search results

1

Extreme gradient boosting method in the prediction of company bankruptcy

100%

Pawełek B.

Statistics in Transition new series | 2019 | vol. 20 | issue 2 | 155-171

EN

Machine learning methods are increasingly being used to predict company bankruptcy. Comparative studies of selected methods have demonstrated high prediction accuracy for the extreme gradient boosting method in this area. This method is resistant to outliers and relieves the researcher of the burden of filling in missing data. The aim of this study is to assess how the elimination of outliers from data sets affects the accuracy of the extreme gradient boosting method in predicting company bankruptcy. The added value of this study lies in applying the extreme gradient boosting method to bankruptcy prediction based on data free from the outliers reported for companies which continue to operate as a going concern. The research was conducted using 64 financial ratios for companies operating in the industrial processing sector in Poland. The results indicate that the detection rate for bankrupt companies can be increased by eliminating from data sets the outliers reported for companies which continue to operate as a going concern.
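Note: the abstract does not say how outliers were identified. The sketch below is only an illustration of the general idea of removing outliers from the going-concern class while keeping every bankrupt company. It assumes Tukey's interquartile-range fences as the outlier rule; the function names, the `roa` ratio, and the toy values are all hypothetical and not taken from the paper.

```python
from statistics import quantiles

def iqr_bounds(values, k=1.5):
    """Return (low, high) Tukey fences for a list of numbers.

    The IQR rule is an assumption made for this sketch only.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread

def drop_going_concern_outliers(rows, ratio_keys, label_key="bankrupt", k=1.5):
    """Drop non-bankrupt rows lying outside the fences on any ratio.

    Fences are computed from the non-bankrupt rows only, and bankrupt
    rows are always kept, so the minority class is not thinned out.
    """
    healthy = [r for r in rows if not r[label_key]]
    bounds = {key: iqr_bounds([r[key] for r in healthy], k) for key in ratio_keys}
    kept = []
    for r in rows:
        if r[label_key]:
            kept.append(r)  # bankrupt companies are never filtered
        elif all(bounds[key][0] <= r[key] <= bounds[key][1] for key in ratio_keys):
            kept.append(r)
    return kept

# Toy data with a single hypothetical ratio, "roa".
sample = [
    {"roa": 0.05, "bankrupt": False},
    {"roa": 0.06, "bankrupt": False},
    {"roa": 0.07, "bankrupt": False},
    {"roa": 0.08, "bankrupt": False},
    {"roa": 5.00, "bankrupt": False},   # extreme value for a going concern
    {"roa": -3.00, "bankrupt": True},   # extreme, but bankrupt, so it stays
]
cleaned = drop_going_concern_outliers(sample, ["roa"])
```

The cleaned rows could then be passed to any classifier; the study itself used extreme gradient boosting on 64 financial ratios.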

2

Credit Risk Modeling Using Interpreted XGBoost

89%

Hernes M., Adaszyński J., Tutak P.

European Management Studies | 2023 | vol. 21 | issue 3 | 46-70

EN

Purpose: The aim of the paper is to develop a credit risk assessment model using the XGBoost classifier, with attention to the interpretability of the model. Design/methodology/approach: Risk is modeled with Extreme Gradient Boosting (XGBoost), a method used for regression and classification problems. It is based on a sequence of decision trees that use gradient-based optimization of the loss function to minimize the errors of weak estimators. Methods for local and global interpretation are also used: ceteris paribus plots, SHAP, and feature importance. Findings: XGBoost achieved higher values of the performance metrics than logistic regression, except for sensitivity. This means that XGBoost identified a smaller percentage of all bad clients. The local interpretation results show that, for the client in question, the credit decision is positively influenced by scores from external providers, the age of the car, and higher education, while it is negatively influenced by a low external score and a short length of employment. Such information helps to justify a negative credit decision. The global interpretation results suggest that higher values of the features associated with the z-scores are accompanied by negative Shapley values, which can be interpreted as a negative effect on the explanatory variable. Research limitations/implications: XGBoost, ceteris paribus plots, SHAP, and feature importance can be used to develop a credit risk assessment model that takes machine learning interpretability into account. The main limitation of the research is that the XGBoost results are compared only with those of logistic regression. Future research should focus on comparing XGBoost with other machine learning methods, including neural networks.
Originality/value: One of the key processes in a bank is the credit decision process, that is, the evaluation of a client's repayment risk. In the consumer finance sector these processes are usually largely automated, and the latest machine learning methods based on neural networks and ensemble learning are increasingly used for the purpose. Although machine learning models achieve higher accuracy in credit risk assessment than traditional statistical methods, their main problem is low interpretability. The models often operate as a "black box". Interpreting the results of risk assessment models is nevertheless very important, because the reasons for a credit risk assessment have to be explained to the client.
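Note: the abstract describes boosting as a sequence of decision trees, each fitted by gradient-based optimization of a loss function to correct earlier weak estimators, and it singles out sensitivity as an evaluation metric. The sketch below illustrates that idea with plain first-order gradient boosting on one-split trees (stumps) under log loss. It is not XGBoost itself, which additionally uses second-order information and regularization, and it is not the authors' model; all names and the toy data are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(xs, residuals):
    """Fit a one-split regression tree to the residuals by least squares."""
    best = None
    # The largest x is skipped as a threshold so the right branch is never empty.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, lmean, rmean)
    _, t, lmean, rmean = best
    return t, lmean, rmean

def boost(xs, ys, rounds=50, lr=0.5):
    """Add stumps one at a time, each fitted to the negative gradient of log loss."""
    scores = [0.0] * len(xs)
    stumps = []
    for _ in range(rounds):
        # For log loss the negative gradient w.r.t. the score is y - p.
        residuals = [y - sigmoid(s) for y, s in zip(ys, scores)]
        t, lv, rv = fit_stump(xs, residuals)
        stumps.append((t, lv, rv))
        scores = [s + lr * (lv if x <= t else rv) for x, s in zip(xs, scores)]
    return stumps

def predict(stumps, x, lr=0.5):
    """Classify x as 1 (bad client) when the boosted probability is at least 0.5."""
    score = sum(lr * (lv if x <= t else rv) for t, lv, rv in stumps)
    return 1 if sigmoid(score) >= 0.5 else 0

def sensitivity(y_true, y_pred):
    """Share of actual positives (bad clients) that the model flags."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)

# Toy, perfectly separable data: one feature, label 1 = bad client.
xs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
stumps = boost(xs, ys)
preds = [predict(stumps, x) for x in xs]
```

Sensitivity is computed over actual bad clients only, which is why a model can score higher on other metrics and still flag a smaller share of bad clients, as the findings report for XGBoost relative to logistic regression.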

3

Supporting the Age-Period-Cohort model of default rate prediction with interpretable machine learning

75%

Kwiatkowski M. P.

Przegląd Statystyczny | 2023 | vol. 70 | issue 1 | 54-78

EN

Regular short-term forecasting of defaults is a basic activity of a retail portfolio risk manager. From a business perspective, not only the quality of the forecast is significant, but also the understanding of the trends and their driving factors. The vintage analysis and a more advanced Age-Period-Cohort approach are popular tools used for the purpose. The aim of this article is to demonstrate that interpretable machine learning can support the Age-Period-Cohort approach, facilitating forecasting beyond the time range of training data, eliminating the model identification problem and attributing cohort quality to the specific characteristics of loans approved in a given month. The study is based on real consumer finance portfolios from the Polish market.
