Discriminant Analysis
Discriminant analysis is a technique used to determine which of a number of measured
variables are important in distinguishing between objects belonging to known
groups. For example, a biologist could measure different morphological
characteristics (e.g. limb lengths, skull sizes, etc.) of a range of species and
use discriminant analysis to determine which of the measured traits are most
useful in predicting species membership. The analysis typically has one of two
objectives: 1) to identify the relative contributions of the variables in
maximally discriminating between the groups, or 2) to derive mathematical
functions of the measured variables that can then be used to classify new data
into the original groups.
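
The second objective can be sketched for the simplest case of two groups and two measured variables. The example below is only an illustration under stated assumptions: the measurements are invented, the helper names (`fit`, `classify`) are hypothetical, equal prior probabilities and a common within-group covariance matrix are assumed, and it is not a description of how statistiXL performs its calculations.

```python
# Invented measurements (variable 1, variable 2) for two known groups.
group_a = [(4.0, 2.0), (4.5, 2.4), (5.0, 2.1), (5.5, 2.6)]
group_b = [(6.5, 3.5), (7.0, 3.1), (7.5, 3.8), (8.0, 3.6)]

def mean(rows):
    """Mean of each of the two variables."""
    return [sum(r[j] for r in rows) / len(rows) for j in range(2)]

def sscp(rows, m):
    """2x2 sums of squares and cross-products about the group mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = (r[0] - m[0], r[1] - m[1])
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fit(a, b):
    """Return discriminant coefficients and the cutoff score (equal priors assumed)."""
    ma, mb = mean(a), mean(b)
    sa, sb = sscp(a, ma), sscp(b, mb)
    dof = len(a) + len(b) - 2
    # Pooled within-group covariance matrix and its inverse (2x2 case).
    c = [[(sa[i][j] + sb[i][j]) / dof for j in range(2)] for i in range(2)]
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    inv = [[c[1][1] / det, -c[0][1] / det], [-c[1][0] / det, c[0][0] / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    # Coefficients: inverse pooled covariance times the difference in group means.
    w = [inv[i][0] * diff[0] + inv[i][1] * diff[1] for i in range(2)]
    # Cutoff: discriminant score of the point midway between the two group means.
    cutoff = sum(w[i] * (ma[i] + mb[i]) / 2 for i in range(2))
    return w, cutoff

def classify(x, w, cutoff):
    """Assign a case to group A if its score exceeds the cutoff, else to group B."""
    score = w[0] * x[0] + w[1] * x[1]
    return "A" if score > cutoff else "B"

w, cutoff = fit(group_a, group_b)
print(classify((5.2, 2.3), w, cutoff))
```

The relative sizes of the coefficients in `w` relate to the first objective only after the variables have been standardised, since raw coefficients depend on the measurement units.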

statistiXL provides modules for both grouping and classification discriminant
analysis. Both analyses provide discriminant functions that best allow for
separation of the known groups based upon the measured variables. Excessive
collinearity between variables can be checked using estimates of tolerance. In
addition to this, the classification module allows for the
classification of cases with unknown group membership based on these previously
determined functions. The effectiveness of these functions is estimated by also
reclassifying the original data (i.e. that belonging to cases from known
groups) in order to determine the proportion that are correctly classified.
statistiXL also provides an improved estimate of the error rate via the holdout
method, in which each case to be classified is in turn excluded from the
dataset when calculating the discriminant functions to be used for that
particular classification.
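
The difference between reclassifying the original data and the holdout method can be sketched as follows. This is a simplified illustration, not statistiXL's implementation: the data are invented, the function names are hypothetical, and a single measured variable is used so that, assuming equal priors and a common variance, the linear discriminant rule reduces to assigning each case to the group with the nearest mean.

```python
# Invented one-variable measurements with known group labels.
data = [(1.0, "A"), (2.0, "A"), (3.0, "A"), (4.0, "A"),
        (4.6, "B"), (6.0, "B"), (7.0, "B"), (8.0, "B")]

def group_means(cases):
    """Mean of the measured variable for each group label."""
    sums, counts = {}, {}
    for x, g in cases:
        sums[g] = sums.get(g, 0.0) + x
        counts[g] = counts.get(g, 0) + 1
    return {g: sums[g] / counts[g] for g in sums}

def classify(x, means):
    """Assign x to the group whose mean is nearest."""
    return min(means, key=lambda g: abs(x - means[g]))

def reclassification_rate(cases):
    """Proportion correct when every case is also used to fit the rule."""
    means = group_means(cases)
    hits = sum(classify(x, means) == g for x, g in cases)
    return hits / len(cases)

def holdout_rate(cases):
    """Proportion correct when each case is excluded before fitting its rule."""
    hits = 0
    for i, (x, g) in enumerate(cases):
        training = cases[:i] + cases[i + 1:]  # all cases except case i
        hits += classify(x, group_means(training)) == g
    return hits / len(cases)

reclassified = reclassification_rate(data)
held_out = holdout_rate(data)
print(reclassified, held_out)  # prints 1.0 0.875
```

With these invented values the case at 4.6 is classified correctly only when it contributes to its own group mean, which is why reclassifying the original data tends to give an optimistic estimate.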

Results are presented with an optional display of descriptive statistics for each
group/variable combination and the covariance matrix showing the relationships
between measured variables. Next, eigenvalues are given (each indicating how
much of the between-group variance a discriminant function captures),
along with values for Wilks' lambda, chi-square (χ²), degrees of freedom, and P value for
each discriminant function. Unstandardised and standardised discriminant
functions (i.e. the coefficient scores) are then tabulated, along with group
centroids. Individual case scores are provided and optional scatterplots of
casewise discriminant scores can be created for each pair-wise set of selected
discriminant functions; the scatterplots can include graphical representations
of the contributions of each variable to the discriminant functions. For
classification analysis, a classification table is given for the data set used
to derive the classification functions, indicating the proportion of correct
classifications for each group. Optionally, a classification table derived from
the holdout procedure can also be presented. The classification group scores
for an alternate data set (if entered) are then given.
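
How the eigenvalue, Wilks' lambda and chi-square value relate to one another can be sketched for two groups and two variables, where there is a single discriminant function. The example makes several assumptions: the data are invented, the chi-square value uses Bartlett's approximation (statistiXL's exact method is not stated here), and the P value shortcut is valid only because this particular case has 2 degrees of freedom.

```python
import math

# Invented measurements (variable 1, variable 2) for two known groups.
group_a = [(4.0, 2.0), (4.5, 2.4), (5.0, 2.1), (5.5, 2.6)]
group_b = [(6.5, 3.5), (7.0, 3.1), (7.5, 3.8), (8.0, 3.6)]

def mean(rows):
    """Mean of each of the two variables."""
    return [sum(r[j] for r in rows) / len(rows) for j in range(2)]

def sscp(rows, m):
    """2x2 sums of squares and cross-products about the group mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = (r[0] - m[0], r[1] - m[1])
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

ma, mb = mean(group_a), mean(group_b)
sa, sb = sscp(group_a, ma), sscp(group_b, mb)

# Pooled within-group SSCP matrix W and its inverse (2x2 case).
w = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
det = w[0][0] * w[1][1] - w[0][1] * w[1][0]
w_inv = [[w[1][1] / det, -w[0][1] / det], [-w[1][0] / det, w[0][0] / det]]

n_a, n_b = len(group_a), len(group_b)
n, p, g = n_a + n_b, 2, 2  # cases, variables, groups
d = [ma[0] - mb[0], ma[1] - mb[1]]

# With two groups, the only non-zero eigenvalue of W^-1 B is (n_a*n_b/n) * d' W^-1 d,
# i.e. the ratio of between-group to within-group sums of squares of the scores.
quad = sum(d[i] * w_inv[i][j] * d[j] for i in range(2) for j in range(2))
eigenvalue = (n_a * n_b / n) * quad

wilks_lambda = 1.0 / (1.0 + eigenvalue)  # single discriminant function
chi2 = -(n - 1 - (p + g) / 2) * math.log(wilks_lambda)  # Bartlett's approximation
df = p * (g - 1)
p_value = math.exp(-chi2 / 2)  # chi-square upper tail, exact only for df == 2

print(eigenvalue, wilks_lambda, chi2, df, p_value)
```

A large eigenvalue gives a Wilks' lambda close to 0 and a large chi-square value, indicating that the groups are well separated on that function.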

The help file included with statistiXL provides an overview of discriminant
analysis, and gives two examples of grouping discriminant analysis (2 groups
and 3 groups), and two examples of classification discriminant analysis (2
groups and 3 groups).