Investigating the missing data effect on credit scoring rule based models: The case of an Iranian bank
Author(s)
Sadatrasoul, Seyed Mahdi
Hajimohammadi, Zeynab
Document type
Text
Original Article
Document language
English
Abstract
Credit risk management is a process in which banks estimate the probability of default (PD) for each loan applicant. Banks build data sets from the records of previous loan applicants, usually supplement these internal data sets with external credit bureau data, and then use them to estimate PD. Banks also have a continuing interest in using rule-based classifiers to build their default prediction models. In practice, however, data records are usually incomplete and contain missing values. This causes problems for banks, especially in low-default credit risk portfolios, and makes building rule-based models more complex. Several strategies can be used to handle missing data. This paper applies five missing-value handling strategies to a real credit scoring data set: ignoring, replacing with random values, replacing with the mean, replacing with C&R tree induced values, and elimination. Experimental results show that the ignoring strategy consistently outperforms the other methods on the test data set, and suggest that CHAID is a useful classifier for handling low default portfolios with missing values.
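As a rough illustration of three of the strategies named in the abstract, the following minimal Python sketch applies elimination, mean replacement, and random replacement to a toy data set. It is not the authors' implementation. The field names, the toy records, and the function names are hypothetical, and it assumes that "elimination" means dropping incomplete records and that "random" means drawing from the observed values of the same attribute. The ignoring strategy (leaving missing values for the classifier to handle) and the C&R tree induced values are not shown.

```python
import random
from statistics import mean


def eliminate(records, field):
    """Drop every record whose `field` is missing (None)."""
    return [r for r in records if r[field] is not None]


def replace_with_mean(records, field):
    """Fill missing values of `field` with the mean of the observed values."""
    fill = mean(r[field] for r in records if r[field] is not None)
    return [{**r, field: fill if r[field] is None else r[field]} for r in records]


def replace_with_random(records, field, seed=0):
    """Fill missing values of `field` with a randomly drawn observed value."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = [r[field] for r in records if r[field] is not None]
    return [
        {**r, field: rng.choice(observed) if r[field] is None else r[field]}
        for r in records
    ]


# Hypothetical toy applicants; `income` is None where the value is missing.
applicants = [
    {"income": 40.0, "default": 0},
    {"income": None, "default": 1},
    {"income": 60.0, "default": 0},
]

complete_only = eliminate(applicants, "income")
mean_filled = replace_with_mean(applicants, "income")
random_filled = replace_with_random(applicants, "income")
```

Each function returns a new list and leaves the input records unchanged, so the same raw data set can be passed through every strategy and the resulting models compared on a common test set.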
Keywords
Credit Scoring
banking industry
Rule extraction
Missing data
Low default portfolio
Issue number
2
Publication date
2018-12-01
1397-09-10
Publisher
Iran Center for Management Studies
Author affiliation
Management school, Kharazmi University, Tehran, Iran.
Shahid Beheshti University, Tehran, Iran.
ISSN
2476-308X
2476-3098




