
Abstract

In this study, 56,965 loan facilities granted during the years 1398 to 1403 (Solar Hijri calendar; 2019 to 2024) at the northern Tehran branches of Bank Melli Iran were examined in order to estimate the probability of loan default. Three models were applied to predict customers' credit behavior: logistic regression, random forest, and extreme gradient boosting. The inputs comprised 29 variables in three main categories: loan contract characteristics (amount, repayment period, collateral type, etc.), personal attributes of the borrower (age, occupation, credit history, etc.), and branch characteristics (province, branch type, etc.). Preprocessing steps such as removing outliers, categorizing text fields, and extracting age and grace period from the available data were also carried out, and the models were evaluated in both a baseline and an optimized (hyperparameter-tuned) configuration. The results showed that the machine learning models perform better than the traditional method. The ROC-AUC was estimated at 99.73 percent for extreme gradient boosting and 99.68 percent for random forest, whereas this value was only 75.34 percent for logistic regression. The mean AUC difference between the machine learning models and logistic regression was about 0.243, and in all cases statistical tests and 95 percent confidence intervals confirmed the significance of this difference. The findings confirm the reliable superiority of machine learning methods in predicting loan defaults.

Estimating the Probability of Loan Default in Melli Bank: A Comparative Study of Machine Learning and Econometric Approaches

The current study employed a comparative analytical framework to examine credit-default prediction, relying on a comprehensive dataset of 56,965 loan contracts issued between 2019 and 2024 across the northern branches of Bank Melli Iran. Three modeling approaches were evaluated: traditional logistic regression and two ensemble machine learning methods, random forest (RF) and extreme gradient boosting (XGBoost). The analysis incorporated 29 predictive features categorized into three conceptual groups: loan contract characteristics (e.g., principal amount, repayment tenure, collateral type), borrower attributes (e.g., age, occupational profile, credit history), and institutional factors (e.g., branch location, branch type). Data preprocessing included outlier removal, text categorization, and the extraction of variables such as age and grace period. The models were evaluated under both baseline and optimized (hyperparameter-tuned) settings. The results showed that the machine learning models substantially outperformed the conventional logistic regression model. XGBoost delivered the highest discriminatory power (ROC-AUC = 99.73%), followed closely by RF (99.68%), whereas logistic regression lagged far behind (75.34%). On average, the AUC difference between the machine learning models and logistic regression was approximately 0.243, and statistical tests with 95% confidence intervals confirmed the significance of this gap. Overall, the findings provide strong evidence for the superior reliability of machine learning approaches in forecasting loan default.

Introduction

Although traditional econometric models such as logistic regression have long served as the foundation of credit scoring systems, their reliance on linearity assumptions and error independence limits their ability to capture the complex, nonlinear patterns typical of financial data. These limitations are further compounded by sensitivity to multicollinearity and by distributional assumptions that are frequently inconsistent with real-world conditions. The present research aimed to address these shortcomings by conducting a rigorous comparative analysis of predictive methodologies within Iran's banking sector, a context in which machine learning applications remain relatively underutilized despite the widespread global adoption of artificial intelligence in finance. Specifically, the study compared the performance of two ensemble learning techniques (i.e., random forest and extreme gradient boosting, or XGBoost) with that of conventional logistic regression in forecasting loan defaults, using extensive real-world data from Bank Melli Iran. The methodological advantages of machine learning approaches arise from their ability to model complex nonlinear relationships without requiring predefined functional forms, to automatically capture variable interactions through hierarchical partitioning, to remain robust in the presence of outliers and non-normal distributions, and to detect subtle patterns in high-dimensional data that escape parametric detection. By systematically evaluating these capabilities, the current study sought to offer empirical evidence to support financial institutions in adopting more advanced and reliable risk modeling frameworks.

Materials and Methods

The selection of predictive models in this study was informed by theoretical foundations, empirical literature, and practical forecasting capabilities. Three distinct modeling approaches—random forest (RF), extreme gradient boosting (XGBoost), and logistic regression (LR)—were employed to evaluate their effectiveness in predicting loan defaults. As a widely used ensemble learning algorithm, random forest builds multiple decision trees using bootstrap aggregating and random subsets of observations and features; each tree is trained independently, and final predictions are obtained through majority voting (classification) or averaging (regression). This structure reduces overfitting and improves generalization compared to single decision trees. XGBoost is an advanced gradient boosting algorithm known for its efficiency and high predictive accuracy; it constructs trees sequentially, with each new tree reducing the residual errors of the ensemble through gradient descent optimization. Rooted in the logistic function and formalized in modern choice modeling, logistic regression improves on linear probability models by mapping predictions to the [0,1] interval via a sigmoid transformation. Although valued for their interpretability, conventional econometric models such as logistic regression suffer from a series of limitations, including linearity assumptions, limited interaction detection, sensitivity to multicollinearity, and distributional constraints. These methodological constraints can compromise predictive performance in complex, nonlinear domains such as credit risk assessment.
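To make the modeling setup concrete, the following sketch shows how the three classifiers could be instantiated and compared side by side in Python with scikit-learn and xgboost; it is a minimal illustration, not the authors' code. The synthetic imbalanced dataset, the 80/20 split, and every parameter value are assumptions standing in for the study's 29 preprocessed loan features.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier

# Synthetic, imbalanced stand-in for the 29 preprocessed loan features (not the study's data).
X, y = make_classification(n_samples=5000, n_features=29, weights=[0.9, 0.1], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)  # 80/20 split is an assumption

models = {
    "Logistic regression": LogisticRegression(max_iter=1000),      # parametric benchmark
    "Random forest": RandomForestClassifier(random_state=42),      # bagged trees, majority voting
    "XGBoost": XGBClassifier(eval_metric="auc", random_state=42),  # sequentially boosted trees
}
for name, model in models.items():
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{name}: ROC-AUC = {auc:.4f}")

Wrapping all three models in one dictionary lets the same fit-and-score loop serve each of them, which mirrors the baseline (untuned) comparison reported in the study.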
Results and Discussion

The machine learning models were evaluated under two configurations: a baseline setting using default parameters and an optimized setting using hyperparameter tuning. Hyperparameters, settings external to the model that are not learned from the data, strongly influence predictive accuracy, computational efficiency, and generalization, and suboptimal hyperparameter selection can lead to underfitting or overfitting, thereby compromising model performance. Common optimization strategies include grid search, random search, and Bayesian optimization; empirical evidence shows that random search is often more efficient in high-dimensional spaces (Bergstra & Bengio, 2012). Although default parameters may yield reasonable baseline performance, they rarely achieve optimal results (Probst et al., 2019). Prior research suggests that systematic tuning can increase accuracy by 10-20% (Hutter et al., 2019) and improve generalization (Liao et al., 2018). In this study, hyperparameters were optimized to maximize the area under the ROC curve (AUC), a standard practice in credit risk modeling (Feurer et al., 2015); this approach can reduce prediction errors and enhance model stability in ensemble methods.

The empirical results revealed substantial performance improvements through hyperparameter optimization. For the RF model, accuracy increased from 96% in the untuned configuration to 99% after tuning, with a notable reduction in false negatives and improved precision, albeit with a slight decline in recall for the default class. The optimized XGBoost model, using 375 trees, a maximum depth of 12, and a learning rate of 0.03, achieved the lowest false-negative and false-positive rates, offering an optimal balance between learning capacity and predictive accuracy. In contrast, logistic regression showed limited discriminatory power, with a recall of 0.16 and a ROC-AUC of 0.75, indicating inherent limitations in capturing the complex patterns associated with default events.
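As an illustration of the AUC-oriented tuning step described above, the sketch below runs a randomized search over an XGBoost classifier. The search space, cross-validation setup, and iteration count are assumptions chosen for demonstration: the study reports the selected configuration (375 trees, maximum depth 12, learning rate 0.03) but not the exact search procedure. The hypothetical X_train and y_train from the previous sketch are reused.

from scipy.stats import randint, uniform
from sklearn.model_selection import RandomizedSearchCV
from xgboost import XGBClassifier

# Hypothetical search space; the study does not report its grid or search method.
param_distributions = {
    "n_estimators": randint(100, 600),
    "max_depth": randint(3, 15),
    "learning_rate": uniform(0.01, 0.29),  # samples from [0.01, 0.30)
    "subsample": uniform(0.6, 0.4),        # samples from [0.6, 1.0)
}
search = RandomizedSearchCV(
    estimator=XGBClassifier(eval_metric="auc", random_state=42),
    param_distributions=param_distributions,
    n_iter=50,                 # number of sampled configurations (assumption)
    scoring="roc_auc",         # tuning objective: maximize AUC, as in the study
    cv=5,
    n_jobs=-1,
    random_state=42,
)
search.fit(X_train, y_train)   # X_train, y_train from the previous sketch
print(search.best_params_)     # the study reports 375 trees, max depth 12, learning rate 0.03

Random search is used here because, as noted above, it is often more efficient than exhaustive grid search when the hyperparameter space is high-dimensional.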
[Figures: Random Forest model (with and without hyperparameter tuning), XGBoost model (with and without hyperparameter tuning), and Logistic Regression model. Source: Research Results]

Summary of Model Results

Model     State         Accuracy   Precision (Bad)   Precision (Good)   Recall (Bad)   Recall (Good)   F1-Score (Bad)   F1-Score (Good)   ROC-AUC
RF        Unoptimized   97%        0.94              0.98               0.83           0.99            0.88             0.98              0.935
RF        Optimized     99%        0.97              0.99               0.94           0.99            0.95             0.99              0.9968
XGBoost   Unoptimized   98%        0.96              0.99               0.85           0.99            0.90             0.99              0.9966
XGBoost   Optimized     99%        0.97              0.99               0.88           0.99            0.92             0.99              0.9973
LR        -             96%        0.90              0.96               0.16           0.98            0.27             0.98              0.7534

Source: Research Results

Conclusion

The empirical results of this study demonstrate the superior predictive capabilities of machine learning methods, particularly XGBoost, compared with conventional econometric approaches for estimating the probability of default (PD) in Bank Melli Iran's loan portfolio. This performance gap primarily arises from machine learning algorithms' ability to capture nonlinear relationships and latent structural patterns among default determinants, features that linear parametric models are unable to detect. Model performance was evaluated using several metrics, including confusion matrix analysis, overall accuracy, and the area under the ROC curve (AUC). The findings indicated that machine learning models deliver substantially higher predictive precision and improved default detection rates. The optimized XGBoost model achieved outstanding performance (accuracy = 99%, AUC = 0.9973), far surpassing the logistic regression model's ability to identify default cases (recall = 0.16). This distinct performance disparity strongly supports the research hypothesis regarding the comparative advantage of machine learning in PD estimation.

Despite the superior predictive performance of these models, their operational deployment in financial institutions remains constrained by two key challenges: the computational complexity of hyperparameter optimization and the interpretability limitations inherent in black-box models. These limitations highlight the practical importance of developing hybrid frameworks that integrate the interpretive transparency of traditional methods with the predictive power of machine learning approaches.

This research provided evidence of a paradigm shift in credit risk analytics, moving away from the long-standing reliance on conventional statistical models (such as logistic regression and linear probability models) toward machine learning methodologies. While prior studies using traditional techniques achieved moderate success, their limitations in handling imbalanced distributions and complex interaction effects have become increasingly apparent. The present findings align with international research trends and offer novel empirical evidence from Iran's banking sector, demonstrating that well-tuned machine learning algorithms can achieve unprecedented levels of accuracy (99% accuracy compared with a 16% default identification rate for logistic regression).
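The significance of the AUC gap is reported above by way of statistical tests and 95% confidence intervals, but the specific test is not named in this summary. The sketch below illustrates one common option, a paired bootstrap over a shared test set; the function name, iteration count, and all other settings are hypothetical and are not taken from the study.

import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auc_diff(y_true, p_a, p_b, n_boot=2000, seed=42):
    """95% bootstrap confidence interval for AUC(model A) - AUC(model B) on one test set."""
    y_true, p_a, p_b = map(np.asarray, (y_true, p_a, p_b))
    rng = np.random.default_rng(seed)
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), len(y_true))  # resample cases with replacement
        if len(np.unique(y_true[idx])) < 2:              # AUC needs both classes present
            continue
        diffs.append(roc_auc_score(y_true[idx], p_a[idx]) - roc_auc_score(y_true[idx], p_b[idx]))
    lo, hi = np.percentile(diffs, [2.5, 97.5])
    return lo, hi  # the gap is significant at the 5% level if this interval excludes zero

# Example with the fitted models from the earlier sketches (hypothetical data):
# p_xgb = models["XGBoost"].predict_proba(X_test)[:, 1]
# p_lr = models["Logistic regression"].predict_proba(X_test)[:, 1]
# print(bootstrap_auc_diff(y_test, p_xgb, p_lr))

If the resulting interval lies entirely above zero, the AUC advantage of the first model over the second is unlikely to be a sampling artifact.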
