Choosing the right model
With SAS Enterprise Miner, it is possible to create a variety of model types, such as scorecards, decision trees or neural networks. When you evaluate which model type is best suited for achieving your goals, you may want to consider criteria such as the ease of applying the model, the ease of understanding it and the ease of justifying it.
At the same time, for each particular model of whatever type, it is important to assess its predictive performance, such as the accuracy of the scores that the model assigns to the applications. A variety of business-relevant quality measures are used for this. The best model will be determined both by the purpose for which the model will be used and by the structure of the data set on which it is validated.
Scorecards
The traditional form of a credit scoring model is a scorecard. This is a table that contains a number of questions that an applicant is asked (called characteristics) and, for each question, a list of possible answers (called attributes). For example, one characteristic may be the age of the applicant, and the attributes for this characteristic are then a number of age ranges into which an applicant can fall. For each answer, the applicant receives a certain number of points—more if the attribute is one of low risk, fewer if the risk is higher. If the application’s total score exceeds a specified cut-off number of points, it is recommended for acceptance.
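The scoring arithmetic described above can be sketched in a few lines of code. Note that the characteristics, attributes, point values, and cut-off below are invented for illustration; they are not taken from any actual scorecard.

```python
# Hypothetical scorecard: each characteristic maps to a list of
# (attribute test, points) pairs. All values here are illustrative assumptions.
SCORECARD = {
    "age": [
        (lambda v: v < 25, 10),
        (lambda v: 25 <= v < 40, 25),
        (lambda v: v >= 40, 40),
    ],
    "time_on_job_months": [
        (lambda v: v < 12, 5),
        (lambda v: 12 <= v < 24, 15),
        (lambda v: v >= 24, 30),
    ],
}
CUTOFF = 50  # assumed cut-off score

def score(applicant):
    """Sum the points of the attribute each answer falls into."""
    total = 0
    for characteristic, attributes in SCORECARD.items():
        value = applicant[characteristic]
        for test, points in attributes:
            if test(value):
                total += points
                break
    return total

def decide(applicant):
    """Recommend acceptance if the total score reaches the cut-off."""
    return "accept" if score(applicant) >= CUTOFF else "refer"

applicant = {"age": 52, "time_on_job_months": 30}
print(score(applicant), decide(applicant))  # 70 accept
```

Because each attribute contributes a fixed number of points independently of the other answers, the decision can be recomputed by hand, which is exactly the property that makes scorecards easy to apply and justify.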
Such a scorecard model, apart from being a long-established method in the industry, still has several advantages when compared with more recent “data mining” types of models, such as decision trees or neural networks. To begin with, a scorecard is easy to apply.
If needed, the scorecard can be evaluated on a sheet of paper in the presence of the applicant. The scorecard is also easy to understand. The number of points for one answer doesn’t depend on any of the other answers, and across the range of possible answers for one question, the number of points usually increases in a simple way (often monotonically, or even linearly). Therefore, it is often easy to justify to the applicant a decision that is made on the basis of a scorecard. It is possible to disclose groups of characteristics where the applicant has potential for improving the score, and to do so in terms broad enough to avoid inviting manipulated future applications.
Decision trees
On the other hand, a decision tree may outperform a scorecard in terms of predictive accuracy, because unlike the scorecard, it detects and exploits interactions between characteristics. In a decision tree model, each answer that an applicant gives determines what question is asked next. If the age of an applicant is, for example, greater than 50, the model may suggest granting a credit without any further questions, because the average bad rate of that segment of applications is sufficiently low. At the other extreme, if the age of the applicant is below 25, the model may suggest asking about time on the job next. Then, credit might be granted only to those who have exceeded 24 months of employment, because only in that sub-segment of younger applicants is the average bad rate sufficiently low.
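The segmentation in this example translates directly into nested "if … then … else" rules. The age and employment thresholds below come from the text; the treatment of the middle age band (25 to 50) is an assumption added only to make the sketch complete.

```python
def tree_decision(age, time_on_job_months=None):
    """Nested if/then/else rules mirroring the example in the text."""
    if age > 50:
        return "grant"  # segment bad rate is sufficiently low; no further questions
    if age < 25:
        # For younger applicants the tree asks about time on the job next.
        if time_on_job_months is not None and time_on_job_months > 24:
            return "grant"  # only this sub-segment has a low enough bad rate
        return "decline"
    # Middle band: assumed here to require further questioning.
    return "ask further questions"

print(tree_decision(55))                         # grant
print(tree_decision(22, time_on_job_months=30))  # grant
print(tree_decision(22, time_on_job_months=6))   # decline
```

The interaction between age and time on the job is visible in the code: the employment question only matters inside the young-applicant branch, something a scorecard's independent point assignments cannot express.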
Thus, a decision tree model consists of a set of “if … then … else” rules that are still quite straightforward to apply. The decision rules are also easy to understand, perhaps even more so than a decision rule that is based on a total score that is made up of many components. However, a decision rule from a tree model, while easy to apply and understand, may be hard to justify for applications that lie on the border between two segments. There will be cases where an applicant will, for example, say: “If I had only been two months older, I would have received credit without further questions, but now I am asked for additional securities. That is unfair.” That applicant may also be tempted to make a false statement about his or her age in the next application, or simply go elsewhere for financial services.
Even if a decision tree is not used directly for scoring, this model type still adds value in a number of ways. The identification of clearly defined segments of applicants with a particularly high or low risk can give dramatic new insight into the risk structure of the entire customer population. Decision trees are also used in scorecard monitoring, where they identify segments of applications where the scorecard underperforms.
Neural networks
With the decision tree, we could see that there is such a thing as a decision rule that is too easy to understand and thereby invites fraud. Ironically, there is no danger of this happening with a neural network. Neural networks are extremely flexible models that combine characteristics in a variety of ways. Their predictive accuracy can be far superior to that of scorecards, and they don’t suffer from the sharp “splits” that decision trees sometimes do.
However, it is virtually impossible to explain or understand the score that is produced for a particular application in any simple way. It can be difficult to justify a decision that is made on the basis of a neural network model. In some countries, it may even be a legal requirement to be able to explain a decision and such a justification must then be produced with additional methods. A neural network of superior predictive power is therefore best suited for certain behavioral or collection scoring purposes, where the average accuracy of the prediction is more important than the insight into the score for each particular case. Neural network models cannot be applied manually like scorecards or simple decision trees, but require software to score the application. However, their use is just as simple as that of the other model types.
Case study
Scenario
An international financial services organization entered the consumer credit market in a large western European country two years ago. So far, it has been operating with the use of a generic scorecard for application scoring, but now has collected enough performance data to create its own custom scorecard. The company has been offering various types of consumer loans via various channels and the first custom scorecard will be applicable to applicants from all channels. Channel-specific scorecards may later be created as required.
SAS Enterprise Miner process flow
SAS Enterprise Miner software is used for building the scorecard. SAS Enterprise Miner enables the analyst to access a comprehensive collection of analytical tools through a graphical user interface. It provides a workspace onto which nodes (tool-icons) are dropped from a tools palette. Nodes are then connected to form process flow diagrams (PFDs) that structure and document the flow of analytical activities that are carried out. The SEMMA concept (Sample, Explore, Modify, Model and Assess) serves as a guideline for creating process flows, and nodes are grouped accordingly in the tools palette.
Figure 1 shows the process flow for modeling on the accepts data. All components of the flow are discussed in more detail in the sections below. The flow begins with reading in the development sample. After using the Data Partition node to split off part of the sample for later validation, the flow divides into a scorecard branch consisting of the Interactive Grouping node and Scorecard node and a decision tree branch consisting of the Decision Tree node. The quality of the scorecard and the tree are then compared on the validation data with the Model Comparison node.

Figure 1: Process flow diagram – “accepts” data.
Development sample
The development sample (input data set) is a balanced sample consisting of 1500 good and 1500 bad accepted applicants. “Bad” has been defined as having been 90 days past due once. Everyone not “bad” is “good,” so there are no “indeterminates.” A separate data set contains the data on rejects.
The modeling process, especially when the validation charts are involved, requires information about the actual good/bad proportion in the accept population. Sampling weights are used here for simulating that proportion. A weight of 30 is assigned to a good application and a weight of 1 to a bad one. Thereafter all nodes in the process flow diagram treat the sample as if it consisted of 45,000 good applications and 1,500 bad applications. Figure 3 shows the distribution of good/bad after the application of sampling weights. The bad rate is 3.23 percent.
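The weighting arithmetic can be verified directly. With a weight of 30 per good application and 1 per bad one, the balanced sample of 1,500 goods and 1,500 bads represents 45,000 and 1,500 applications respectively:

```python
n_good, n_bad = 1500, 1500   # balanced development sample
w_good, w_bad = 30, 1        # sampling weights

weighted_good = n_good * w_good          # 45,000 simulated good applications
weighted_bad = n_bad * w_bad             # 1,500 simulated bad applications
bad_rate = weighted_bad / (weighted_good + weighted_bad)
print(f"{bad_rate:.2%}")  # 3.23%
```

This reproduces the 3.23 percent bad rate stated above, i.e., the good/bad proportion of the accept population that the weights are meant to simulate.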