In XLMiner™, select Classification --> Discriminant Analysis. In the dialog box that appears, specify the data to be used, the input variables, and the output variable.

Variables: This box lists all the variables present in the dataset. If the "First row contains headers" box is checked, the header row above the data is used to identify variable names.
Variables in input data: Select one or more variables as independent variables from the Variables box by clicking on the corresponding selection button. These variables constitute the predictor variables.
Output Variable: Select one variable as the dependent variable from the Variables box by clicking on the corresponding selection button. This is the variable being classified.
Specify "Success" class: The output variable takes categorical values. For example, enter the value "1" here. Then any record in the training data whose output variable equals 1 is treated as a success.
Specify initial cutoff probability value for success: Enter the desired value here, for example 0.5. A record is then classified as a success if its estimated probability of success is greater than this value.
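The cutoff rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated, not XLMiner's own code; the function name `classify_by_cutoff` and the class labels "1" and "0" are assumptions made for the example.

```python
def classify_by_cutoff(prob_success, cutoff=0.5, success="1", failure="0"):
    """Return the success label only when the probability is strictly above the cutoff."""
    return success if prob_success > cutoff else failure


# Hypothetical success probabilities for four records.
probabilities = [0.12, 0.50, 0.51, 0.97]
labels = [classify_by_cutoff(p) for p in probabilities]
print(labels)
```

Because the comparison is "greater than", a record whose probability equals the cutoff exactly is not counted as a success in this sketch.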
Click Next and the following dialog box appears:

Calculate according to relative occurrences: The discriminant analysis procedure incorporates prior assumptions about how frequently the different classes occur. If this option is checked, it will be assumed that the probability of encountering a particular class in the large data set is the same as the frequency with which it occurs in the training data.
Use equal prior probabilities: If this option is checked, it will be assumed that all classes occur with equal probability.
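As a rough illustration of the two prior options, the following Python sketch computes both sets of prior probabilities from a list of training labels. The function names and the sample labels are hypothetical; XLMiner performs this step internally.

```python
from collections import Counter


def relative_priors(labels):
    """Priors equal to each class's relative frequency in the training data."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


def equal_priors(labels):
    """Priors that give every observed class the same probability."""
    classes = sorted(set(labels))
    return {cls: 1 / len(classes) for cls in classes}


# Hypothetical training labels: two records of class "1", six of class "0".
training_labels = ["1", "0", "0", "0", "1", "0", "0", "0"]
print(relative_priors(training_labels))
print(equal_priors(training_labels))
```

With these labels, the first option weights class "0" three times as heavily as class "1", while the second treats both classes alike.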
Click Next and, in the following dialog box, choose the required outputs:

Canonical variate loadings: XLMiner™ produces the canonical variates for the data, which are based on an orthogonal representation of the original variates. This has the effect of choosing a representation that maximizes the distance between the different groups. For a k-class problem there are k-1 canonical variates. Very often only a subset (say g) of the canonical variates is sufficient to discriminate between the classes.
Canonical Scores: The values of the g canonical variates X1, X2, ..., Xg for the ith observation are known as the canonical scores for that observation. The purpose of the canonical scores is to make the separation between the classes as large as possible. Thus, when the observations are plotted with the canonical scores as coordinates, observations belonging to the same class are grouped together.
Score training / validation data: Check the appropriate options to show the scores of the training and validation data.
Score Test/New Data: Select the appropriate option for applying the model to test data and/or new data as required. See the Example of Discriminant Analysis for detailed instructions on new data.
Score New data in database: See the Example of Discriminant Analysis for detailed instructions on this.
Click Finish. The output is displayed according to the options selected in the dialogs above.
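To make the idea of canonical scores concrete, here is a minimal Python sketch for the simplest case: two classes and two input variables, so there is k-1 = 1 canonical variate. It uses Fisher's linear discriminant direction (the inverse of the pooled within-class scatter matrix applied to the difference of the class means), which gives the canonical variate only up to a scale factor. The data, the function names, and the lack of normalization are assumptions for illustration; XLMiner's reported loadings and scores may be scaled differently.

```python
def mean(rows):
    """Column means of a list of 2-element rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter(rows, m):
    """2x2 scatter matrix: sum over rows of (x - m)(x - m)^T."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for a in range(2):
            for b in range(2):
                s[a][b] += d[a] * d[b]
    return s


def canonical_direction(class0, class1):
    """Fisher direction w = Sw^-1 (m1 - m0) for two classes, two variables."""
    m0, m1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    # Pooled within-class scatter matrix.
    sw = [[s0[a][b] + s1[a][b] for b in range(2)] for a in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]


def canonical_score(row, w):
    """Project one observation onto the canonical direction."""
    return row[0] * w[0] + row[1] * w[1]


# Hypothetical training data for two classes.
class0 = [[1.0, 2.0], [2.0, 1.0], [2.0, 3.0], [3.0, 2.0]]
class1 = [[6.0, 5.0], [7.0, 6.0], [7.0, 4.0], [8.0, 5.0]]

w = canonical_direction(class0, class1)
scores0 = [canonical_score(r, w) for r in class0]
scores1 = [canonical_score(r, w) for r in class1]
print(w, scores0, scores1)
```

On this toy data every score for the first class falls below every score for the second class, which is the grouping effect the canonical scores are meant to produce.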