Investigating the missing data effect on credit scoring rule based models: The case of an Iranian bank

Sadatrasoul, Seyed Mahdi; Hajimohammadi, Zeynab

doi:https://dx.doi.org/10.22116/jiems.2018.80681

Author(s)

Sadatrasoul, Seyed Mahdi; Hajimohammadi, Zeynab

File size:

618.4 KB

File type (MIME):

PDF

Document type

Text
Original Article

Document language

English

Abstract

Credit risk management is a process in which banks estimate the probability of default (PD) for each loan applicant. Data sets of previous loan applicants are built from internally gathered data, are usually supplemented with data from external credit bureaus, and are finally used to estimate PD. Banks also have a continuing interest in using rule-based classifiers to build their default prediction models. In practice, however, data records are usually incomplete and contain missing values. This causes problems for banks, especially in low-default credit risk portfolios, and makes building rule-based models complex. Several strategies can be used to handle the missing data issue. This paper applies five missing-value handling strategies to a real credit scoring dataset: ignoring, replacement with random values, replacement with the mean, replacement with C&R tree induced values, and elimination. Experimental results show that the ignoring strategy consistently outperforms the other methods on the test data set, and suggest that CHAID is a useful classifier for handling low-default portfolios with missing values.
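
The five strategies named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy records, the field names (income, age), and all function names are hypothetical, and a one-split regression stump stands in for a full C&R tree. The exact meaning of "elimination" is assumed here to be dropping incomplete records.

```python
import random
from statistics import mean

# Hypothetical toy records: (income, age); None marks a missing income.
RECORDS = [(30.0, 25), (None, 40), (50.0, 45), (None, 30), (40.0, 35)]

def ignore(records):
    """Leave missing values in place for a classifier that tolerates them."""
    return list(records)

def eliminate(records):
    """Assumed reading of 'elimination': drop records with a missing income."""
    return [r for r in records if r[0] is not None]

def replace_with_mean(records):
    """Fill each missing income with the mean of the observed incomes."""
    m = mean(inc for inc, _ in records if inc is not None)
    return [(m if inc is None else inc, age) for inc, age in records]

def replace_with_random(records, seed=0):
    """Fill each missing income with a randomly drawn observed income."""
    rng = random.Random(seed)
    observed = [inc for inc, _ in records if inc is not None]
    return [(rng.choice(observed) if inc is None else inc, age)
            for inc, age in records]

def replace_with_stump(records):
    """Fill missing income using a one-split regression stump on age.

    The stump is a simplified stand-in for a C&R tree: it picks the age
    threshold that minimises the squared error of income on complete records.
    """
    complete = sorted((age, inc) for inc, age in records if inc is not None)
    best = None
    for i in range(1, len(complete)):
        left = [inc for _, inc in complete[:i]]
        right = [inc for _, inc in complete[i:]]
        ml, mr = mean(left), mean(right)
        sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
        threshold = (complete[i - 1][0] + complete[i][0]) / 2
        if best is None or sse < best[0]:
            best = (sse, threshold, ml, mr)
    _, threshold, ml, mr = best
    return [((ml if age < threshold else mr) if inc is None else inc, age)
            for inc, age in records]
```

Each function returns a new list, so the same incomplete data set can be passed through all five strategies and the resulting data sets compared under one classifier.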

Keywords

Credit Scoring
Banking industry
Rule extraction
Missing data
Low default portfolio

Issue number

Publication date

2018-12-01
1397-09-10

Publisher

Iran Center for Management Studies

Author affiliation

Management school, Kharazmi University, Tehran, Iran.
Shahid Beheshti University, Tehran, Iran.

ISSN

2476-308X
2476-3098

URI

https://dx.doi.org/10.22116/jiems.2018.80681
http://jiems.icms.ac.ir/article_80681.html
https://iranjournals.nlai.ir/handle/123456789/257829